Machine Learning: Clustering & Retrieval

8/10 stars
based on  2 reviews

Course Details

Cost

FREE, Add a Verified Certificate for $79

Upcoming Schedule

  • Upcoming

Course Provider

Coursera online courses
Coursera's online classes are designed to help students achieve mastery over course material. Some of the best professors in the world - like neurobiology professor and author Peggy Mason from the University of Chicago, and computer science professor and Folding@Home director Vijay Pande - will supplement your knowledge through video lectures. They will also provide challenging assessments, interactive exercises during each lesson, and the opportunity to use a mobile app to keep up with your coursework. Coursera also partners with the US State Department to create “learning hubs” around the world. Students can get internet access, take courses, and participate in weekly in-person study groups to make learning even more collaborative. Begin your journey into the mysteries of the human brain by taking courses in neuroscience. Learn how to navigate the data infrastructures that multinational corporations use when you discover the world of data analysis. Follow one of Coursera’s “Skill Tracks”. Or try any one of its more than 560 available courses to help you achieve your academic and professional goals.

Provider Subject Specialization
Humanities
Sciences & Technology
4639 reviews

Course Description

Case Studies: Finding Similar Documents

A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to retrieve a new document, do you need to search through all other documents? How do you group similar documents together? How do you discover new, emerging topics that the documents cover? In this third case study, finding similar documents, you will examine similarity-based algorithms for retrieval. In this course, you will also examine structured representations for describing the documents in the corpus, including clustering and mixed membership models, such as latent Dirichlet allocation (LDA). You will implement expectation maximization (EM) to learn the document clusterings, and see how to scale the methods using MapReduce.

Learning Outcomes: By the end of this course, you will be able to:

  • Create a document retrieval system using k-nearest neighbors.
  • Identify various similarity metrics for text data.
  • Reduce computations in k-nearest neighbor search by using KD-trees.
  • Produce approximate nearest neighbors using locality sensitive hashing.
  • Compare and contrast supervised and unsupervised learning tasks.
  • Cluster documents by topic using k-means.
  • Describe how to parallelize k-means using MapReduce.
  • Examine probabilistic clustering approaches using mixture models.
  • Fit a mixture of Gaussians model using expectation maximization (EM).
  • Perform mixed membership modeling using latent Dirichlet allocation (LDA).
  • Describe the steps of a Gibbs sampler and how to use its output to draw inferences.
  • Compare and contrast initialization techniques for non-convex optimization objectives.
  • Implement these techniques in Python.
Reviews 8/10 stars
2 Reviews for Machine Learning: Clustering & Retrieval

Rankings are based on a provider's overall CourseTalk score, which takes into account both average rating and number of ratings. Stars round to the nearest half.

Borys Zibrov
10/10 stars, Completed
  • 9 reviews
  • 8 completed
1 year, 7 months ago
This is course #4 (out of 4) in the Machine Learning specialization from University of Washington on Coursera. Yes, there were 5 courses initially and a capstone project, but the last two were removed. It seems to me that this was related to the acquisition of Dato (Turi) by Apple, but I don't know if there was any announcement. So, if I were to rate the whole specialization, I would have taken one star off for that. However, I liked this course as much as the other 3. I've already written reviews for #1 and #2, so perhaps it makes more sense to review the specialization as a whole and not just #4. The instructors are very cool and engaging, know what they are talking about and can explain the material reasonably well. It's not a very math-heavy course, but it's not watered down either. I would say it's just about right, so one can gain intuition and understanding of what's going on and then go and read more complex books and papers.

I also liked that "usually omitted" topics were covered (like LSH, mixtures of Gaussians, and KD-tree pruning for nearest neighbors). I encourage you to read the syllabuses before you start, to get excited about what you will learn. One thing I didn't like was the amount of code written in the exercises. I mean, I hardly had to write any code at all, and when I did, it was always very clear what I should do. If I were to start the specialization today, I would use the "sklearn way" (there are very detailed instructions for sklearn) and wouldn't use graphlab at all. In any case, exercises are always very helpful in making sure you really understand what you've learnt, so I purchased course #4. (I finished #1-3 before Coursera switched to the new pricing model, but I would have paid for #2 and #3 and would have skipped the exercises and quizzes in #1, as it's a very easy introductory course and if you plan on taking further courses you will be doing similar exercises anyway.) Overall, a 5/5 specialization, very helpful.
Greg Hamel
8/10 stars, Completed
  • 116 reviews
  • 107 completed
2 years, 4 months ago
Machine Learning: Clustering & Retrieval is the fourth course in the University of Washington's 6-part machine learning specialization on Coursera. The 6-week course covers several popular techniques for grouping unlabeled data and retrieving items similar to items of interest. After a short intro in week 1, the course covers k-nearest neighbor search, k-means clustering, Gaussian mixture models, latent Dirichlet allocation and hierarchical clustering. It is recommended that you complete the first 3 courses in the specialization track before taking this course, but you could take it as a standalone course as long as you know a bit of Python and probability. Grading is based on a series of comprehension quizzes and labs, but you must pay for a verified certificate to gain access to graded assignments. Thankfully you can still download and complete the labs without doing the associated quizzes, so you won't miss too much as a freeware student.

Clustering and Retrieval has a good balance of lecture content and labs that illustrate concepts covered in lecture. The professor is easy to understand and the lecture slides are well done. The course generally has good pacing and devotes plenty of time to each of the main weekly topics, taking care to explain important considerations like different algorithmic approaches to each method and similarities between different techniques. It does, however, go off on a couple of tangents, introducing MapReduce and hidden Markov models, neither of which is covered in much detail or addressed in the labs.

The labs use a data set of Wikipedia articles about famous people as an example to illustrate clustering and retrieval. Using the same data set for multiple labs is always a good idea because it lets students focus on the techniques themselves instead of having to familiarize themselves with new data. The amount of actual coding you have to do in the labs is minimal. The labs are more like interactive explorations of machine learning techniques with occasional one-line fill-in-the-blanks than full-on coding assignments. You'll spend more time reading text, running provided code and analyzing results than writing code yourself. You can look at and answer the lab quiz questions as you go along, but you can't actually submit them and get graded feedback without joining the verified track.

Machine Learning: Clustering & Retrieval is a great course that covers many of the most common clustering techniques with adequate depth while remaining accessible. Although the coding required is minimal, it is not an easy course: some of the concepts may take a couple of watch-throughs to sink in, and you may struggle with certain concepts if you don't have prior knowledge of probability. Aside from the need to pay to gain access to graded quizzes and a few topics that felt tacked on, there's not much to dislike about this course.

I give Machine Learning: Clustering & Retrieval 4.5 out of 5 stars: Great.
