# Profile

## Greg Hamel

Student

**115** reviews, **106** completed

Business Metrics for Data-Driven Companies is the first course in the “Excel to MySQL: Analytic Techniques for Business Specialization” offered by Duke University through Coursera. This short 4-week, self-paced course introduces the concept of business metrics and the role they play in business analytics. It also spends some time discussing the various data-centric roles at different types of companies. The course has no prerequisites and grading is based on 3 multiple-choice quizzes and a final case-study assignment.
The lecture content in Business Metrics is crisp and the lecturer is easy to understand. There are only 3 short weeks of lecture content as the final week is devoted to the case study. The peer-graded case study assignment involves identifying and explaining a business metric in a fictitious business. The course explains several common business metrics in detail but doesn’t spend as much time on how to use metrics to formulate questions, inform analysis or make decisions. Hopefully these are topics that will be covered in more detail in some of the upcoming courses in the specialization.
Business Metrics for Data-Driven Companies is a good overview of business metrics and business data culture. As the first part of a larger specialization, it concludes before you get to use the metrics you learn about in any sort of analysis. The value of this course will ultimately depend upon whether the follow up courses make good use of the foundation it lays.
As a standalone course, I give Business Metrics for Data-Driven Companies 3.5 out of 5 stars: Good.

Text Mining and Analytics is the fourth course in the Data Mining specialization offered by the University of Illinois at Urbana-Champaign through Coursera. Text Mining builds upon the second course in the specialization, Text Retrieval and Search Engines. Course topics include mining word relations, topic discovery, text clustering, text categorization and sentiment analysis. The course lists programming proficiency (especially in C++) and knowledge of probability and statistics as prerequisites. Keeping with the system established by other data mining specialization track courses, grading is based entirely upon 4 multiple choice quizzes with 10 questions apiece. You only get one attempt at the quizzes.
Text Mining and Analytics is information-packed. Each week has 2.5 to 4 hours of lecture content in video segments that generally range from 10 to 20 minutes. The video quality is satisfactory but the explanations and content on the slides could be a bit clearer. Despite the long videos, there are no comprehension questions or exercises to interact with during or after lecture segments to reinforce learning. By the time you reach the quiz at the end of the unit, you may find yourself having to go back and review certain videos to answer the questions. There is an optional programming assignment.
Text Mining and Analytics covers many useful data mining topics, but it has too much lackluster video content for its own good. I can’t help but feel like a better course would have been able to condense the videos down to cover the same topics more clearly in half the time, leaving room for more quizzes and exercises. This course could serve as useful reference material but students watching straight through may find a lot of information going in one ear and out the other.
I give Text Mining and Analytics 2.5 out of 5 stars: Mediocre.

Cluster Analysis in Data Mining is the third course in Coursera's new data mining specialization offered by the University of Illinois Urbana-Champaign. The course is a 4-week overview of data clustering: unsupervised learning methods that attempt to group data into clusters of related or similar observations. The course covers the two most common clustering methods--K means and hierarchical clustering--as well as more than a dozen other clustering algorithms. Grading is based on 4 weekly quizzes with 3 attempts each.
Cluster Analysis is taught by Professor Jiawei Han who was the instructor for the first course in the data mining specialization: Pattern Discovery in Data Mining. The quality of the slides, instruction and organization of materials in this course is slightly better than the pattern discovery course, but that isn't saying much: it is still below Coursera's usual high standards. The course rushes from one topic to another with instruction that is mediocre at best and downright confusing at worst. That's not to say you can't learn anything from this course, but the instruction is often more of a hindrance than a help. There are occasional in-lecture quizzes, but the graded quizzes largely fail to foster any understanding of the material. An optional programming assignment was added halfway through the course; in a course about data mining, programming assignments should be front and center, not added as an afterthought to quell an outcry from students.
Cluster Analysis in Data Mining is another disappointing entry in Coursera's data mining specialization. Although the course covers many different clustering methods, poor instruction makes it hard to gain a good understanding of the material unless you are extremely attentive or watch the videos several times.
I give Cluster Analysis in Data Mining 2 out of 5 stars: Poor.

This class claims no prerequisites other than an intro-level knowledge of CS,
but you'll have a much easier time if you come into it knowing certain things
like the Linux command line and emacs. The entire course is structured around
using a specific set of tools (Amazon Web Services, Linux, emacs, node.js,
Heroku, GitHub) and following along with very long sets of instructions to get
a basic crowdsourcing site online using those tools. In the early going, the
course can be pretty frustrating if you don't already have some familiarity
with Linux, screen, Git and emacs: without it, you'll have to spend a
significant amount of effort learning simple keyboard commands to get
anywhere. Once you learn those basic nuts and bolts, the rest of the course is
mostly a matter of following a series of simple steps that are largely
spoon-fed to you.
Startup Engineering is probably a bit overly ambitious and tries
to introduce too many different topics all at once, which can leave students
scratching their heads and following along with instructions to complete
homeworks without necessarily knowing what they are actually doing. I think
the main issue people are having is that the course plays out more like an
extended tutorial or series of resources than an actual course. The video
lectures are extremely brief and not particularly insightful: all they do is
go through the lecture PDFs which are 30-50 page, detailed info dumps on
various topics related to the tech stack they are using and business
start-ups. They are useful resources, but they take an enormous amount of time to
go through in detail, since they have links to other resources that themselves
take hours and hours to look through. It also seems that the main lecturer is
likely too busy to make comprehensive video lectures, since they are so short
and usually late.
In short, there is a ton of useful material here for a
motivated self-learner to sift through, but it lacks focus and engaging video
lectures.
I give this course 2.5 out of 5 stars: below average.

Linear and Integer Programming is a 7-week course covering linear programming
in detail. The course focuses on teaching the simplex method for optimizing
systems of linear equations with constraints for the first 4 weeks and then
covers integer programming and applications. You should be comfortable with
basic linear algebra and calculus before taking this course. The course
includes optional programming assignments that allow students to build up
their own simplex algorithms over the course of the class, but you can easily
pass the course just taking the weekly quizzes. Linear and Integer Programming
does an admirable job tackling a dense, dry subject. The instructors are easy
to understand and explain confusing concepts well. The presentation style and
video quality seem a bit dated, but it doesn't detract much from the learning
experience. I must admit that my interest waned as the course went on because
I took it out of curiosity rather than a preexisting interest in the
subject. That was a mistake. You should not take this course for fun; take it
if you really want to learn about linear programming and have the time to get
through all the lectures, supplementary materials and programming assignments.
Overall, Linear and Integer Programming is a great course if you want to learn
about the simplex algorithm in depth and understand important considerations
and applications of linear and integer programming.

UT.7.01x: Foundations of Data Analysis is a gentle, 13-week introduction to
statistics and the R programming language. The course covers basic
descriptive statistics, the normal distribution, sampling and hypothesis
testing, including t-tests, chi-square tests and ANOVA. The course has no
prerequisites, although you may need to spend some extra time learning the
basics of R if you haven't used it before.
Each week of Foundations of Data Analysis begins with a reading assignment, a
couple of lecture videos with comprehension questions and an R programming
tutorial. The videos tend to be in the 7-10 minute range and the tutorials
typically total less than 10 minutes a week, so the total video content per
week is usually 20-30 minutes. The videos are generally well-edited and the
professor does a good job describing concepts simply and concisely. Each week
has a prelab, lab and problem set that allow you to apply the concepts you
learn in lecture and in the R tutorials. Each problem set consists of 3-4 mini
case studies, so you'll probably end up spending most of your time on the labs
and problem sets. The assignments are not very difficult, although many
questions limit you to 1 or 2 attempts. You need a cumulative score of 70% to
earn a certificate.
Foundations of Data Analysis introduces new concepts at a relatively slow pace
and gives students a good amount of practice through the labs and assignments.
Concepts are explained well in lecture so the readings are not always
necessary to do the activities, but they often provide extra depth and raise
considerations that are not discussed in lecture. The course did have some
hiccups with homework questions and auto-graders, and many questions expect
rounded answers, which can result in frustrating off-by-a-fraction errors. In
addition, the course uses an external forum system called Piazza instead of
the normal edX forums, which I found to be a hassle.
Bottom line: UT.7.01x is a great place for a beginner to start with stats and
R as long as you don't mind an external forum.

Supervised learning is the first of a 3 course machine learning series offered
by Georgia Tech through Udacity. This course is part of an online masters in
CS track, so it assumes students have a significant amount of coursework in math
and CS. To get the most out of this course you should have at least a basic
understanding of statistics, probability and linear algebra. You should also
have taken an algorithms course that covers big O notation and the P vs NP
question. Some prior exposure to machine learning and neural networks will be
helpful. This is a difficult course to rate because it does a few things very
well and some things, not so well. On the plus side, the professors are very
knowledgeable, have great chemistry and make a lot of jokes, some of which are
quite humorous. I don’t think I’ve ever chuckled so much taking a MOOC before.
The course provides a nice overview of some key topics in supervised learning,
such as decision trees, linear regression, neural networks, k-nearest
neighbors, ensemble learning and Bayesian networks. Content is a mixture of
high level discussion, pseudocode for learning algorithms and the math behind the
algorithms. The difficulty level can vary quite a bit from one section to the
next depending on your background knowledge. On the down side, this course
does not have any homework or programming assignments (at least if you take
the freeware course) so you don’t get any experience actually implementing the
algorithms presented. The course teaches material by having one professor
present topics to the other in each section. This method works OK, but since a
professor acts as the student, it can move a bit fast since the “student”
grasps all the material much quicker than a real student would. All in all
this is a good course that covers a lot of ground in a relatively short amount
of time—probably too short. It could benefit from slowing down a little bit
and providing at least a few short homework sets and/or programming
assignments.

Udacity's Intro to Machine Learning is an introduction to data analysis using
Python and the sklearn package. The course consists of 15 lessons covering a
wide range of machine learning topics including classification algorithms
(Naive Bayes, decision trees and SVMs), linear regression, clustering,
selecting and transforming features and validation. As a self-paced course,
you can take however long you wish on each lesson; some take less than an
hour, while others can take several hours depending on how long you work on
the mini projects. Intro to Machine Learning requires basic programming and
math skills.
Each lesson consists of a series of video segments and quizzes introducing a
new topic followed by a mini-project that gives you a chance to work with code
dealing with the topic at hand. Katie does most of the teaching and her
enthusiasm helps keep the course engaging. The quizzes can, at times, seem
patronizingly simple. The mini projects are a bit harder and contribute more
to learning, although they occasionally lack adequate guidance and feedback to
help students arrive at the expected output. The final project, and many of the
mini-projects leading up to it, involve detecting persons of interest in the
Enron scandal using a data set of emails sent by Enron employees. Interesting
real-world data sets are always a plus.
Intro to Machine Learning is an accessible first course in machine learning
that prioritizes breadth, high level understanding and practical tools over
depth and theory. You won't be an expert in any of the topics covered in this
course by the time you're done, but you'll have a good foundation to build
upon. If you are interested in taking a similar course with many interesting
mini projects that uses the R programming language, try MIT's Analytics Edge
on edX. Coursera's Machine Learning with Andrew Ng is a logical next step to
dig deeper into machine learning algorithm design and implementation, while
Caltech's Learning from Data on edX is a great course if you are interested in
machine learning theory.

Explore statistics with R is a 5-week introductory level course covering the
basics of R, statistics and using R for statistical analysis. The course
covers 3 main topic areas in 4 weeks--R basics, getting and manipulating data
in R and statistical tests in R. The 5th week consists of a final graded
assignment where you follow along with a research project conducted in R. Each
week consists of a few short lecture videos followed by a series of graded
quiz questions. The class awards a certificate if you achieve a total score of at
least 60% on the quizzes and the final graded assignment.
Explore stats in R offers some quality content, but it is too short to constitute
a complete intro to R or stats. The professor speaks clearly, explains
concepts well and seems genuinely excited to be teaching the course. I noticed
he seemed active on the course's discussion boards, which is nice to see. The
quizzes were too few and a bit too easy: they generally tested conceptual
knowledge and did not require the student to do much in R besides copy, paste
and run code. As a course focused on statistical operations, it didn't teach
basic programming concepts in R like control flow and functions. This course
could benefit from having a bit more content each week and beefing up the
homework exercises to force students to do a little bit more in R on their
own. Extending the course by a couple of weeks would also give the professor
time to cover some neglected topics like programming basics in R and more on
data visualization. With expanded content this could be a great course, so
hopefully they'll make some tweaks and additions and offer it again.

Udacity's Data Visualization and D3.js is one of two new intermediate data
science courses Udacity released this month (Nov 2014), the other being an
introduction to machine learning. This course consists of 4 lessons covering
visualization and D3.js basics, design principles and dimple.js, narrative
structures and interaction/animation. Each lesson spends some time discussing
general visualization design principles and considerations followed by
technical information such as combing through D3.js code. Since D3.js is a
Javascript library, it is useful to have some exposure to Javascript, HTML and
CSS before taking this course.
Data Visualization with D3.js has the same polished and streamlined content
structure as Udacity's other courses, with each lesson taking the form of a
series of short videos interspersed with quizzes. The content focused on
visualization design and principles is well done. On the other hand, the meat
of the course--the sections focused on coding and creating visualizations--
was not as engaging as I'd hoped. D3.js is a low level Javascript library, so
it takes a lot of code to generate graphs and a lot of time to explain the
code and learn what it is doing. Too many videos consist of talking students
through large chunks of somewhat cryptic code without much interactivity and
it takes too long to get to the point where you make visualizations. I didn't
feel like I was really learning how to make visualizations myself so much as
understanding bits and pieces of the instructor's code. The course doesn't
give students enough opportunity and direction for writing D3.js code
themselves: lessons 3 and 4 don't have problem sets. I think it was probably a
mistake to use D3.js for such a short course. It might have been better to use
a higher level visualization package that gets students making their own
visualizations faster with less code.
Data Visualization with D3.js is not a bad course and I could see other
students liking it more than I did, but after taking Udacity's excellent
Exploratory Data Analysis course, it was a disappointment. In the EDA course,
you jump in and start generating tons of plots in R and actually get to the
point where you are reasonably comfortable using ggplot2 to make plots by the
end. If you are looking to learn D3.js specifically, this course could be a great
starting point, but for learning data visualization in general, D3.js seemed
to be more of a barrier to learning than an asset. I'd liken this course to an
introductory programming course that uses C. Starting with a lower level
language like C can be a bit painful and it takes longer to get to the point
where you are doing interesting things--time you don't have in a 4 lesson
course.

MIT's Introduction to Game Design on edX is a course about crafting games with
a focus on video games. I don't normally write course reviews until all the
course content is available, but I feel compelled to write this review early
(after the 3rd week of 6) because the course is diverging from my expectations
so I may not finish it.
I love to play board games and have designed a few board/card games in the
past that never really got off the ground. I signed up for this course with
the hope that it might help me improve a game I'm working on. The course
description and promo video suggest that it would mainly focus on general game
mechanics, prototyping and play testing, with little to no mention of video
games outside of some digital prototyping. The first 3 weeks give some
attention to both board games and video games, but the course is focusing more
and more on video games with each successive week. The majority of guest
speaker time and homework project time is devoted to digital games. The final
3 weeks cover digital prototyping, user interfaces and the business of games,
which are likely to be heavily skewed toward video games over board games.
Intro to game design has good content and for someone interested in making
digital games, it would probably be a great course. Its main failing is that
it didn't make it clear ahead of time that the focus of the course was going
to be digital games. As it stands now, the course tries to split time and
assignments between board and video games, which is not ideal. There are
sections that are useful for board game designers here and there but it is
getting difficult to weed through all the video-game oriented content. The
course would be better split into 2 courses, one for board/card game design
and one for digital game design.

Khan Academy's linear algebra lecture series provides a thorough introduction
to linear algebra from the basics of vectors and matrices to projections into
lower dimensional spaces, eigenvectors and eigenvalues. Khan has an impressive
amount of knowledge and manages to be engaging with his voice-overs despite no
actual face time. Khan's lecture videos all follow the same basic formula: he
writes equations and works through problems with colored pens on a black
background while talking students through everything he is doing. This format
works remarkably well and I find it much more engaging than courses that use
narrations over static slides. Seeing someone work through problems in real
time is much more helpful and interesting than showing slide after slide full
of information and equations prepared in advance. Such walls of text and
equations can be intimidating and make it hard for students to know where to
focus their attention. On the down side, real-time exercises can be time-
consuming so it's not uncommon for videos to reach the 15-20 minute range. I
went through parts of this course to supplement learning in edX's Linear Algebra:
Foundations to Frontiers and often found Khan's videos and explanations more
engaging and easier to understand than the videos in the edX course.

How to use Git and GitHub is a 3-week introductory course on the basics of the
Git version control system. As a short course with only 3 lessons, it focuses
on giving students a solid grounding in the basics of Git and doesn't
stray too far into any advanced topics. Lesson 1 covers version control in
general, checking differences between files, commits, cloning, git log and
getting Git set up on your computer. Lesson 2 covers basics of repositories,
branches, merging and merge conflicts. Lesson 3 introduces GitHub and related
commands and considerations including remotes, pushing, pulling, forking and
issues that may arise due to collaboration.
How to use Git and GitHub does exactly what a short intro level course should
do: stay focused on covering the basics in detail without taking diversions
into esoteric features that are likely to confuse students and distract them
from forming foundation knowledge. Sarah and Caroline do a good job explaining
things at a level and pace appropriate for an intro course. The course has a
bit more reading embedded in the video playlist than most Udacity courses.
Also, many of the quizzes require you to run commands on your computer and
copy and paste output back into Udacity, which can be a bit troublesome. It
would be nice if they had an interactive Git environment similar to Code
School's allowing you to do everything you need to do right in the browser.
Still, How to use Git and GitHub is a great place to start if you are learning
about Git for the first time.

Intro to HTML and CSS is a 3-lesson primer on front-end web development and
design. Although the name of the course suggests you'll learn HTML and CSS
basics, the content is actually focused on higher level web development and
design concepts like web page and project structure, responsive design and web
frameworks. The course spends very little time talking about nuts and bolts
like different HTML tags and CSS properties. The content itself is well done,
it just strays a bit from what you'd expect given the title of the course. It
should be entitled "Fundamentals of front-end web development/design" or
something similar.
This course is hard to rate given that the content is good, but it doesn't
quite fulfill the expectations set by the title. As such, I'm subtracting 1
star from what is otherwise a nice, short intro to front end web development.
If you want to learn HTML nuts and bolts, the first week of Udacity's "web
development" spends a lot more time going over HTML tags. The HTML and CSS
courses on Codecademy and Code School are also good options.

Age of Globalization is a broad overview of globalization that covers a wide
range of topics including transportation systems, capitalism, regulation,
social justice, the labor market, pop culture and sports. The course starts
with a brief overview of the subject of globalization, world exploration,
trade and imperialism. After the first two weeks, the remaining 10 weeks of
content are released all at once, giving students the ability to skip around
and sample topics of interest. Each week of content has an overarching theme
and several sections that touch on various aspects of the theme or cite
specific cases relevant to the theme. Each section typically starts with one
or more short readings followed by a 5-20 minute lecture video, a discussion
question and a short comprehension quiz.
The content of Age of Globalization is well organized and the video and
presentation quality are good. The professor is easy to understand and no
background knowledge is needed to understand any of the content. The lecturer
does speak a bit slowly, and given the non-technical nature of the content,
you may want to turn up the video playback speed. The course has a very
"liberal arts" feel to it and some of the content may be review for the
educated viewer. Sampling the topics that interest you and skipping others may
be a better use of your time than trying to complete all the content in order.

Introduction to Data Science is a misleading title for this course because it is
not introductory level and it does not have a sensible flow that builds from
one week to the next as you would expect from an intro course. Instead, the
course acts as more of a data science sampler that introduces new topics each
week that often have little to do with material covered in previous weeks.
Lecture topics include relational databases, relational algebra, SQL,
MapReduce, NoSQL, miscellaneous topics in statistics, machine learning,
visualization and graph analytics. If that sounds like a disjointed
smorgasbord of topics, it is. To make matters even more complicated, the
programming assignments use three different languages: Python, R and SQL. This
course is best suited for those who have some exposure to Python, R, SQL and
statistics.
If you have the appropriate background knowledge, this course touches on many
interesting topics and while the lecturer's delivery is not great, he is quite
knowledgeable and the material usually isn't too hard to grasp. Although the
homework assignments require different languages and may take you a while to
complete, they are rewarding. For instance, you'll work with real Twitter data
you capture from the net, implement MapReduce operations in Python and
participate in a machine learning competition on Kaggle.com.
Introduction to data science is likely to be frustrating to those expecting a
general intro to data science. The course jumps around too much and uses too
many different tools to be a good first course in data science, but the
breadth of topics covered and programming assignments make this course worth a
look if you already have some exposure to data science or the tools the course
uses. If nothing else, you can skip through the lectures and watch sections
that are of particular interest to you.

I took this course on MIT OCW before it launched on the edX platform, so the
material I worked with is probably a bit different from the most up-to-date
version. This course is a comprehensive introduction to computer science and
the Python programming language. Topics covered include: basic operations,
control flow, loops, functions, recursion, algorithm complexity, divide and
conquer, basic sorting algorithms, dynamic programming, the knapsack problem,
object oriented programming, simulation, random walks and Monte Carlo
simulation. The OCW course was fairly high level and spent more time talking
about computer science concepts than Python syntax, which made the programming
more difficult than it could have been had it been tailored for online
learners. The newest edX version has better resources for learning the nuts
and bolts necessary to complete the course. This course is a lot of work and
more comprehensive and difficult than many other intro CS courses.

Introduction to Linux is a self-paced course offered by the Linux Foundation
geared toward students interested in using Linux for the first time. The
course covers a wide range of topics including installing Linux, differences
between Linux distributions, the GUI, the command line, various system
operations and Bash scripting. You can use any of three different Linux
distributions--Ubuntu, CentOS and OpenSUSE--to follow along with the material.
The content consists mainly of static slides with occasional video tutorials,
exercises and comprehension questions; 100% of the course grade is based on an
easy 30 question final exam. The slides are quite informative and the
exercises are generally well done, but it would be nice to see more video
content. The entire course probably has around an hour to an hour and a half
of video. Overall, Introduction to Linux is an informative yet impersonal
offering that is more akin to reading an operating manual than taking an open
course with human teachers and students. The content will likely prove useful
as reference material, but the lack of human connection makes getting through
it a bit of a slog.

Intro to sabermetrics is a beginner course in baseball analytics published by
Boston University on the edX platform. The course is organized into 4
different content tracks: a statistics track, a sabermetrics track, a tech
track, and a baseball history track. The statistics track covers basic
statistical concepts like mean, median, measures of spread, regression to the
mean and correlation. The sabermetrics track introduces a variety of concepts
and computed statistics in baseball analytics like on base percentage,
slugging, other hitting metrics and converting runs to wins. The tech track
focuses on teaching SQL database queries using an interactive MySQL
environment as well as R basics. Each of the course's 6 weeks of content starts
with a brief overview of the material to be covered in each track. SABR101x is
a good intro to sabermetrics, but it suffers from several issues common to
first run MOOCs that held it back from being a great course. The course has
good instruction and the organization of the materials into different tracks
was nice to let people focus on areas of interest. On the down side,
information in the videos was sometimes hard to make out due to small text size
and poor color choices with backgrounds and pens. The difficulty level also
seemed a bit unpredictable: the statistics track was very basic while the tech
track gets into SQL and R at a rate that is probably a bit too fast for people
with no background knowledge. In addition, tech exercises sometimes suffered
from ambiguous wording and automated graders initially expected too much
accuracy on rounded answers. Many of these kinks could be straightened out for
a second offering of the course. If you love baseball and have any interest in
baseball analytics, you will probably enjoy this course. If you're mainly
interested in analytics and picking up new technical skills, the SQL tech
sections and SQL sandbox are the highlights of the course.

This is another course that had dozens of reviews before the course started
because people were freaking out that materials weren't available at the
stroke of midnight the day the course started. The course is just wrapping up
today, so don't put too much stock into reviews made more than a few weeks
before now.
I'll start by saying I don't normally take courses that aren't focused on
technical topics like computing, math or science because I don't find many of
them to be of practical use. When I saw a course entitled "Becoming a
resilient person" I was quite skeptical because it sounded like the sort of
fluff course you'd take to pad your schedule in college to raise your GPA. I
decided to sign up for it anyway because the time commitment was minimal and I
wanted to check out the first week's material to confirm my suspicions.
Eight weeks later, I must say I'm glad I took this course. It discusses many
important life topics that affect happiness and wellbeing such as values,
goals, mindfulness, gratitude, managing emotions, making therapeutic lifestyle
choices and making meaningful social connections and takes time to give
actionable advice for making positive changes in your life. On the technical
side, the lecture videos are well organized, the instructor is always on
screen and he delivers information clearly. The material is very accessible,
so almost anyone could take this course. There are short, easy quizzes each
week and homework that usually involves using what you've learned in your life
or explaining concepts discussed in lecture to a willing ear like a close friend
or family member.
Given the low time commitment required, this MOOC provides great value. The
forums were littered with posts about how the course helped people make
positive changes in their lives. There's no guarantee you'll see any positive
changes, but for an investment of 8-15 hours, you have a lot to gain and not
much to lose.

Practical machine learning is the 8th course in the 9-part data science
specialization offered by Johns Hopkins on Coursera. This course introduces
machine learning in R, including the basics of prediction, splitting data into
training and testing sets, regression, trees, random forests and boosting all
in the span of 4 weeks. The course focuses on using the caret package in R to
apply machine learning algorithms. Similar to other courses in the data
science specialization, the course content is mainly static slides with voice-
overs, but thankfully the slides are generally not overly cluttered and the
voice-overs are of decent quality. The course has a lot of good information on
how to use R to apply common machine learning techniques to data, but you
aren't going to gain a deep understanding of how the machine learning methods
work. "Practical" in this case means "learn how to use the tool, not how it
works." I suspect students coming into this course with no prior knowledge of
machine learning will find that the lectures jump from one topic to another
too quickly as the course goes on. Taking a course that covers machine
learning theory, like the 3 part machine learning series from Udacity, will
give you a deeper understanding of the methods introduced in this course.
Practical machine learning does a pretty good job introducing machine learning
topics in a limited amount of time, but the coverage is too brief to gain a
solid understanding of many of the methods presented. This course would have
been much better if it was 8 weeks and had at least 1 hour of solid lecture
content per week with interactive exercises or homework. If you’re looking for
an excellent practical machine learning course that spends enough time on each
topic and has enough homework to really help students learn, check out MIT's
Analytics Edge on edX.

Introduction to databases is a self-paced course offered by Stanford through
Coursera that provides a thorough overview of databases focusing on relational
databases and SQL. This course was created in late 2011, so the presentation
and some of the content is slightly dated, but it is still a very good course.
The videos stand up pretty well when compared with modern MOOCs and you'd be
hard-pressed to find a more accessible intro to databases elsewhere. The main
shortcoming of this course is that it only briefly touches on topics like
NoSQL systems, big data and MapReduce, which are areas that have advanced quite
a bit in the last few years; these topics would warrant more coverage in an
updated course. The content can get a bit slow at times, but that's pretty
much unavoidable given the subject. You might not want to go through all the
lecture videos depending on your background knowledge and goals; the content
is well-organized by subject, making it easy to stick to the topics you want
to learn.

Developing data products is the final course in the 9-part data science
specialization offered by Johns Hopkins on Coursera. This course introduces
several tools you can use to put R code on the web, into slideshows and into R
packages, including Shiny, rCharts, googleVis, slidify and RStudio
Presenter. Although the course is listed as 4 weeks, it only has 3 weeks of
lecture content, with one week devoted to giving students time to work on the
course project. Unlike previous courses in the data science specialization, this
course is not taught by a single professor: each of the 3 professors involved
in the data science specialization leads a few lectures.
This course provides a decent overview of some useful tools for integrating R
with the web and in presentations, but it covers too many different tools in
too short a time without any exercises to help students practice using the
tools presented. You'll have to spend a lot of time on your own exploring the
tools discussed to really learn how to use them. It's nice to be aware of the
kinds of tools that are out there and have some basic information on each one
to get started, but in keeping with the theme of the entire data science
specialization, coverage is only skin deep.
Now that I've gone through all 9 courses in the data science specialization, I
can say that on the whole, the data science track is disappointing. On the
plus side, you will gain basic R proficiency if you complete the R
programming, getting and cleaning data, reproducible research and exploratory
data analysis courses. That said, too much of the material is poorly presented
with a lack of instructor face time and overly cluttered slides. The courses
routinely try to cover too much material too fast and skimp on content in the
later weeks. There are no in-lecture quizzes and few interactive exercises or
quality homework problem sets. A cynic might question Johns Hopkins' motivation
in offering the data science specialization: making 9 short courses that they
can rerun each month and charge $50 a pop to anyone interested in verified
certificates smells a bit like an experimental cash grab. Regardless, there
are several other MOOCs out there that cover the same topics better.

Regression Models is the 7th course in the Johns Hopkins data science
specialization track on Coursera. This course is essentially identical to the
statistical inference course in terms of structure, presentation and quality:
the entire course consists of dull, information-packed slides with mediocre
voice-overs. It seems like half of the course consists of slides with verbose
math expressions in summation notation and the instructor telling you that you
don't really need to understand them unless you are interested in the math behind
the models. As with other courses in the track, there are no in-lecture
quizzes or interactive exercises and there is no instructor face time.
Overall this is a disappointing course that probably won’t keep your interest
long enough for you to bother completing all the videos much less the quizzes
and the project.
If you’re looking for other places to learn about regression models, the last
two weeks of Duke's data analysis and statistical inference cover regression,
as do the first few weeks of MIT's Analytics Edge. I highly recommend both of
those courses. Regression Models does cover regression in a bit more detail,
but given the poor presentation you'd probably be better off reading
Wikipedia.
*Update: Johns Hopkins has recently released an interactive learning package for R called Swirl that provides a series of exercises for this course and some of their other Coursera offerings. The Swirl exercises for this course help reinforce the topics in a way that is much more engaging than the lectures. I give the Swirl exercises for this course a score of 3/5 stars. It would have been nice if the Swirl package had been available from the beginning.

Make your own 2048 is a short, 1-sitting course that introduces HTML and CSS
in the context of the "2048" game. 2048 is a simple yet addictive tile-moving
game; in this course you'll look at the CSS and HTML behind a
version of 2048 and alter the tiles to make your own custom version of the
game. This is a fun little course that lets you learn by doing. Just be
careful as you may end up spending way too much time playing the game!
Note: the course is supposed to have a second section that dives into the
JavaScript that drives the game, but the second section has been forthcoming
for well over a month now, so there's no telling when it will be available. As
it stands now the course only takes an hour or so to get through plus whatever
time you spend fiddling around with the game.

Machine Learning 3—Reinforcement Learning is the final part of a 3 part
machine learning course offered by Georgia Tech through Udacity. This course
is the shortest of the 3 parts, spanning only 4 lessons that cover Markov
processes, reinforcement learning in general and game theory. This part is not
quite as strong as the previous two parts of the machine learning course
because it is too short and spends a bit too much time covering basic game
theory. There are no homework exercises but there are a few in-lecture quizzes
here and there.
If you went through the first two parts of the machine learning course,
there's no reason not to finish it off by taking this part: the professors still
have great chemistry and it doesn't take too long to complete. Just be aware
that about half of the content is an intro to game theory, so if you've taken
a course that covers game theory before, half of this course will be review.

The science of everyday thinking is a fun, light course covering how people
think, learn and make decisions. Major topics include illusions and cognitive
biases, intuition, learning, experimentation, belief and how scientific
thinking can improve decisions. The course consists of 12 modules that each
focus on a particular theme with a series of video lectures interspersed with
interviews of experts. Each section concludes with a 10-question quiz, an
invitation to participate in the discussion forums and a video showing some of
the work of on-campus students.
The course videos are well done, both in terms of content and the quality of
the video footage itself. In most MOOCs I find guest lectures to be of little
value: they are usually tacked on as bonus content that is not always directly
relevant to the main lectures. This is the first MOOC I've seen where guest
speakers fit well into the flow of the main lecture content and enhance the
overall course experience. On the downside, I did not care for the quizzes
because they ask too many questions that require you to remember specific
lines, facts or definitions from the lecture videos. I also felt that the
course tapers off a bit at the end: the final 4 weeks were not quite as
interesting as the first 8. Still, the science of everyday thinking is a very
good course that provides interesting insights into the human thought process
without a major time commitment.

I'm not sure how there are already 20 reviews for this course when it wasn't
released until today, but I just finished going through the main lectures and
I have to say this course is very disappointing. I wasn't really sure what to
expect since the course had 1 week listed as the duration: it is a short
one-sitting type of course that is just one of several installments the lecturer
is planning to offer. The main lecture content raises many interesting
concepts about big data and social physics but there is not enough content and
each section concludes with the professor referencing his book and telling you
to go read it to learn more even though the book is "not required." Basically,
you’re given a bunch of interesting ideas and visualizations to whet your
appetite and then the lectures fade to a picture of the book. I'm not sure
whether this course is really trying to teach students or masquerading as a
course while functioning as an infomercial for the book. It's too bad because it
is a fascinating topic and I feel that the lecturer could have made a good
course if he took the time to create a 6-8 week MOOC that covered all the
material in depth.

Linear Algebra - Foundations to Frontiers is an introductory linear algebra
course that teaches linear algebra in the context of computing. If you don't
have any familiarity with programming or Python, the computing component is
going to be hard to follow. You can, however, skip all of the programming
parts and just go through the lecture videos and quizzes. Topics include
vectors, linear transformations, matrix vector operations, matrix
multiplication and inversion, vector spaces, orthogonal projection and bases
and eigenvalues and eigenvectors. LAFF requires a major time commitment.
Unless you are already familiar with some of the topics, you'll probably spend
5-8 hours a week. It is clear that a tremendous amount of effort went into
producing the materials for this course. There are multiple homework exercises
after almost every video and most weeks have one or more programming exercises
where you implement and visualize linear algebra functions using tools the
instructors have created. The instructors were also active on the forums,
which was nice to see. If I were to judge this course solely on the amount of
content and quality of exercises, it would be 5/5. That said, I didn’t find
the instructor engaging on a human level. Math can be boring; instructors that
are excited about the topics they teach can go a long way toward mitigating
the dryness. The instructor was robotic in his presentation and I often found
the lectures hard to follow. When I decided to watch some of Salman Khan’s
linear algebra videos on Khan Academy to review for the final, I found his
presentation of the same concepts more engaging and easier to understand. I
came out of this course feeling like I didn’t learn as much as I could have
because the material is not always presented in a way that is easy to follow
and my interest waned from time to time. LAFF provides everything you need to
build a solid foundation in linear algebra—if you are able to remain attentive
despite the dry presentation.

MIT’s The Analytics Edge is a course focused on using statistical tools to
gain insight about data and make predictions. The majority of the course
teaches analytic methods using the R programming language, but the final 2
weeks deal with solving optimization problems using spreadsheet software
(LibreOffice or MS Excel). The course runs 11 weeks and covers R basics,
linear regression, logistic regression, decision trees, text analytics,
clustering, visualizations and both linear and integer optimizations. The
Analytics Edge is a meaty course. It has a lot of content each week and it’s
not easy to breeze through things like it is with many other MOOCs. There are
graded quizzes after each video lecture and each week of new material has 4
fairly lengthy case studies to complete. One week is devoted to an analytics
competition while the final week is reserved for a 4 part final exam. Some
students on the forums claimed they were spending 10 to 15 hours a week on
this course. Coming into the course with basic knowledge of statistics and R
helps a lot. It should be noted, however, that this course is not too math
intensive. It doesn't spend a lot of time talking about formulas or nitty-
gritty mathematical details; it mostly teaches you how to apply statistical
functions and methods and interpret the results. Although this course requires
a serious time commitment, it is time well spent. The Analytics Edge is an
excellent course that teaches a bunch of practical statistical tools and
actually gives you enough practice using them through the lengthy homework
exercises to gain some confidence with them and remember how to use them. Too
many courses info dump syntax and concepts, but don’t back them up with
practical problems to let you use what you've learned. The homework problems
for this course are very well crafted and look at a variety of interesting
data sets from basketball stats to tweets about Apple. I can’t even imagine
the amount of time that went into putting all the homework exercises together;
kudos to the team at MIT for their hard work. If you’re interested in learning
some practical analytic methods that don’t require a ton of math background to
understand, this is the course for you.

Reproducible research is the 5th course in the Johns Hopkins data science track
on Coursera. As the title states, this course is all about making research and
data analysis reproducible using the R programming language. The first 2.5
weeks of lecture material in this course is great. It provides a well-
organized overview of how to create reproducible research in R using R
markdown and the knitr package, taking plenty of time to talk about best
practices. Thankfully, Roger Peng has added in a little box with his face in
it as he talks over his slides for many of his videos, which makes the content
a lot more engaging than it is in some of the other Johns Hopkins courses that
only have voiceovers. The final 1.5 weeks of lecture video material is not as
useful or engaging and seems a bit lazy in that week 4 takes the form of
recordings of lectures given sometime in the past. The videos in the second half
of week 3 only have voiceovers and they have an echo to them that makes them
hard to listen to. All in all, the first 2.5 weeks of this course are
definitely worth checking out if you have any interest in learning about
reproducible research but you might want to skip through some of the content
at the end of the course.

Intro to Object Oriented Programming is a short overview of object oriented
programming using the Python programming language. This is a beginner level
course but it assumes you have a basic grasp of programming in Python. It
would be a good course to take after completing the first few weeks of an
introductory Python course like Udacity's CS 101, Rice's "An Introduction to
Interactive Programming in Python" on Coursera or MIT's "Introduction to
Computer Science and Programming Using Python" on edX. Intro to OOP provides a
gentle introduction to using classes in Python that starts by building up your
confidence with creating programs with simple, yet interesting examples like
drawing lines, sending text messages and filtering messages for profanity. The
instructor uses built-in Python class objects to introduce the concept of
classes before having students create their own classes. In the final section,
you'll use classes to make a basic movie website that plays trailers for your
favorite movies. The course touches briefly on some advanced topics in object
oriented programming like inheritance and method overriding. This course is
very well organized and the instructor explains OOP concepts in a way that
makes them easy to understand. You'll also learn about the structure of Python
programs so that you better understand where functions and classes reside in
Python and its modules. The instructor frequently refers to the Python docs,
Stack Overflow and Google to figure out how to do new things, which are good
skills to learn. Overall this is a great little course that could take
anywhere from 5 to 15+ hours depending on your experience level and how much
time you want to spend working on projects.

This class provides an overview of Salesforce, a software platform that lets
you create applications in the cloud. Salesforce lets different groups of
people like managers, employees, contractors and customers interact through
web applications and gives the user a variety of tools for aggregating,
crunching and displaying data. The course is presented in a novel format:
Udacity's representative Andy assumes the role of the student and learns all
the material at the same time as online learners. A representative from
Salesforce, Samantha, teaches Andy step by step, which makes it easy for an
online learner to follow along and do everything that Andy is doing. I found
that the format worked quite well. My main gripe with the course is that they
could do a better job explaining how you would actually deploy and use a
Salesforce App in real life. It would be nice if the course was a bit longer
or if there was a follow up class that goes into more detail about using
Salesforce and advanced features. I feel like they breezed through all the
basics and just when things started getting really interesting, the class
ended.

I signed up for this course just to see what it was like, not expecting to
actually complete it. The only reason I completed it is that there's only about
30 minutes to an hour of content per week. While the course does introduce
some interesting combinatorial games and concepts, the content is thin and the
professor is not always easy to follow. It's tempting to give the course a
higher rating simply because I find the subject enjoyable and the professor is
amiable, but that doesn't make up for the lack of content and lackluster
presentation.

Exploratory Data Analysis is the 4th course in Johns Hopkins’s data science
specialization track. I'm writing this review after completing all the
lectures and quizzes; I'm not planning to complete the projects. The first 2
weeks of this course provide a thorough overview of plotting in R using the
base graphical package, the lattice package and the ggplot2 package. Week 3
takes a sudden detour into data clustering and the fairly advanced topics of
principal components analysis and singular value decomposition, only to jump back to
plotting with a section on color. The clustering section seems a little out
of place since there is not any introduction explaining the purpose of
clustering. What's worse, the SVD and PCA sections require a fairly high level
of linear algebra knowledge to understand, which is not a prerequisite for
this course. I suspect that section will leave many students scratching their
heads. Week 4 consists of 2 case studies where the professor shows you how to
perform an exploratory analysis on a couple different data sets. If this
course only consisted of the plotting lectures I’d give it a 4 out of 5. The
plotting lectures that make up the bulk of the course are well done and this
course provides more instructor face time and live examples in R than any of
the 3 courses in the first wave of the data science track. Unfortunately,
there are no interactive exercises or in-lecture quizzes and the principal
components analysis and singular value decomposition sections are too advanced
for this course. It would have been better if they left the SVD and PCA
functions as black boxes in R and simply explained in general terms what they
do and how to interpret their output. Still, the quality overview of R
plotting makes this course worth a look.

This course is a perfect example that having smart instructors who are
passionate about what they are doing is not enough to make for good
instruction or a good class. Udacity's course offerings are generally top
notch in quality, but this one seems to be the lemon of the lot. The course is
structured around an HTML5 game that the profs created and quizzes are
centered around having you fill in bits of code into a skeleton of hundreds of
lines of their game code. The video lectures are too brief and don't discuss
commands at a pace that allows students to learn what they are doing before
taking quizzes expecting them to use those commands. Using an already-made
game is a poor instructional decision. Building something from the ground up,
piece by piece, over the course of a class is a much better system for
learning that doesn't confuse students with tons of lines of unfamiliar code.
The profs seem to assume that students should know much more than they
actually would having watched the video lectures. Picture a bunch of
scientists who are so wrapped up in their own world that they are unable to
explain things in terms that a novice can understand. I love Udacity, but this
is one to avoid.

R Programming is a remake of Computing for Data Analysis, another course
offered on Coursera by the same instructor. This course covers R basics such
as R data types and objects, reading and writing data, control flow,
functions, scoping, dates, loops, debugging tools, simulation and code
profiling. The slides and lectures are a bit smoother than Computing for Data
Analysis but the content is mostly the same. This course has good information
but suffers from a lack of instructor face time and heavy use of static slides
with voiceovers, which are less engaging than videos of instructors actually
running the commands they are talking about. Additionally, there are no in-
lecture quizzes or interactive exercises to help you absorb the material as
you go along. If you want to get as much out of the course as you can, I
recommend that you follow along with RStudio open on a second screen or
window and try out commands discussed as you watch the videos. Overall, this
is a decent intro to R, but it is not particularly engaging. Try R from Code
School is a much more engaging, albeit brief, intro. If you take this course
and want to apply what you've learned or want to learn R somewhere else,
consider MIT's Analytics Edge on edX, Duke’s Data Analysis and Statistical
Inference on Coursera and Exploratory Data Analysis on Udacity. Each of these
courses teaches R basics in the context of learning other things like predictive
modeling, statistics and data analysis.

Duke’s Data Analysis and Statistical Inference is an introduction to
statistics with an optional computational component using the R programming
language. The course runs about 8 weeks and covers a considerable amount of
ground in that time. It starts with the basics of data and data collection
methods but quickly moves on to cover probability, the normal distribution,
the binomial distribution, hypothesis testing, confidence intervals, Z and T
statistics, ANOVA and chi-squared tests and linear regression. The course is a
bit of a whirlwind tour that packs a lot into each lecture. The PDF slides
that go along with the videos are a great resource to review the information
dumped in each lecture. Many students complained that the course requires more
time than the original estimated amount of around 6-8 hours per week. The
course was later updated with an estimate of 8-10 hours per week, which is on
the conservative side. If you come in with some prior knowledge of stats and R
you can get through in 3-5 hours per week. The professor is engaging and does
a good job going through the material while providing adequate face time. The
slides are very informative and the video quality is excellent. There are
periodic in-lecture quizzes that help test your understanding of the material
as you go along. I felt that the frequency of in-lecture quizzes was just
about right in this course. Grading is based on performance on weekly quizzes,
one midterm and one final exam. You need a cumulative grade of 80 percent or
more to get a certificate and you only have 1 attempt on the exams, so it is a
bit harder to earn a certificate in this course than it is for most MOOCs. If
you choose to go the computational route, a portion of your grade is based on
8 programming labs using the R programming language. You can do the labs on
your own or use a convenient web-based programming environment provided by the
instructor. The labs provide a basic introduction to R and each one explores
some of the concepts introduced in the lectures. The labs take about 30
minutes to an hour and a half depending on your level of experience with
programming and R. In the computational track you’ll also complete a final
project involving a statistical analysis of two variables, either from a data
set provided by the instructor or a data set you find on your own. The project
lets you use the concepts you’ve learned both in class and in lecture on your
own. I suspect the project is a bit intimidating to those who are new to R
because it involves more computation than the labs and you don’t have the
training wheels that the labs provide. The project grade is based on the
median score of 3 or more peer assessments. This is a great course for anyone
looking to learn statistics that moves fast enough not to bore those who know
a bit of statistics coming into the course.

Introduction to Mathematical Thinking is a great course that covers several
topics that are often not covered in high school math including proofs, logic,
quantifiers and beginning real analysis. The professor does a good job
engaging students with material that is quite dense, with a lot of face time,
encouragement and walkthroughs of solutions and proofs. I didn't anticipate
actually completing the entire course when I signed up; I did mainly because
the professor is so good. The course also includes some interesting
supplementary material about the pros and cons of MOOCs.

This course begins with an introductory week about social goods and then goes
on to cover 4 major problems the world faces over 4 weeks: poverty, climate
change, disease, and gender inequality. The course is basically 2-3 hours of
lecture per week with some writing assignments. The class does a good job
covering some of the major issues the world faces and outlining some of the
things people can do to try to solve them.

Gamification provides an overview of using game elements in non-game contexts.
The primary purpose of gamification is to give people extra encouragement to
do something that they may not be adequately motivated to do on their own. For
instance, a business that wants to improve the overall health of its employees
to reduce health care costs might introduce a gamified system to encourage
workers to exercise. The course does a good job summarizing the basics of
gamification, such as how gamification can be useful, how gamification differs
from games, elements of gamified systems, motivation and psychology of players
and limits to gamification. It also lays out a basic design framework for
creating gamified systems, covers basic design choices, risks and possible
legal concerns. Gamification is an easy, fun course that benefits from
quality instruction and an interesting topic. You’ll come away from this
course with a solid understanding of what gamification is, how it can be
employed and how to think about designing gamified systems. Afterward, you’ll
probably start recognizing gamification all around you. On the downside, the
material does not get particularly deep. Oftentimes the class feels like it is
providing structure to the ideas that you probably already have about games
and gamification, rather than presenting new insights. The course focuses a
lot on the organization and formalization of the knowledge and intuitions you
probably already have. This is not necessarily a bad thing; there just aren’t
any “Aha” moments where you learn something particularly insightful that
changes your way of thinking. Your experience may vary. Another quibble I had
with the course is that 35% of the grade is based on peer-assessed writing
assignments. Students seemed to stray from using the rubric and instead
assigned grades based on their own subjective opinions of whether they liked
your ideas or not. Overall, this is a good course, but if you want a
certificate you should plan to score 90% or more on all the quizzes and the
exam just to be safe.

Programming Languages uses the goal of writing a web browser as a platform to
teach topics related to writing programming languages. The class covers the
process of lexing strings of HTML to transform them into sequences of tokens
and then parsing those tokens into syntax trees that can be passed to an
interpreter to display the web page represented by the HTML. Wes Weimer is a
good teacher and brings a fun attitude and some cringe-worthy jokes and
drawings to the table. He has a habit of throwing in random historical and
other educational tidbits to lectures, which can be good or bad depending on
your mood. His wit helps to mask the dryness of the material and the fact that
it may not be especially useful to you unless you plan to build a language, a
browser, a parser, etc. yourself. It is good, however, to have a basic
understanding of how computers process language, and certain topics like
regular expressions and list comprehensions are very useful outside of the
context of this course.

A short intro to Hadoop and MapReduce. Similar to the design of everyday
things course, the course was so short that I did not feel motivated to do the
fairly involved final project.

I took the fall 2011 version of this course on the Harvard cs50.tv platform
before it came to EdX. More recently I signed up for the EdX version out of
interest to see what they had changed. This course is a general introduction
to computer science focused primarily on the C language. Topics covered: bits,
binary, ASCII, Scratch, C, compilers, functions, types, scope, linear search,
binary search, big O notation, sorting, pointers, data structures, HTTP, HTML,
CSS, PHP, SQL, JavaScript, Ajax and APIs. Unlike most other introductory
courses, this class starts with C, a low level language, rather than a high
level language like Python. It also covers several different languages instead
of sticking with one language the entire time. The amount of content on the
new EdX course is amazing and extremely helpful. They have short videos and
tutorials summarizing all the most important topics of the class. This course
covers a lot of ground and it takes a long time to get through everything, but
it provides a comprehensive overview of computer science that gets a bit more
low level than most other courses, which is a good thing.

Computing for Data Analysis is an introduction to the R programming language
for people who already know how to program. The course description makes it
seem like the class is intended for everyone, even those who do not know how
to program at all, but this course is not designed for people with zero
programming knowledge. The lectures move through material at a rate and level
of sophistication that assumes prior programming experience. You might be able
to get through this course without prior programming knowledge with a lot of
extra work, but doing so would be an inefficient use of your time. If you have
no prior experience, a true introductory class would be a better idea. The
course provides a decent overview of R, but the format is not ideal. The
lectures are generally 10-25 minutes with no interactive programming exercises
to do as you go along. It’s a good idea to follow along and do the commands he
talks about on your own so that you at least get some practice. There’s some
good material in the lectures, but they leave a lot out as well. You’ll
probably end up spending a substantial amount of time Googling about basic R
functions to complete the programming assignments.

This is a great little course on developing for the mobile web with fluid,
adaptive and responsive design. It will take around 1.5 to 3 hours to complete
depending on your experience level and how much you go back and look through
the material to answer the challenge questions. You should have basic
knowledge of CSS and HTML before taking this course. I must say I continue to
be impressed by the quality of the material on Code School. I've done courses
on just about every online learning site there is--Codecademy, LearnStreet,
Khan Academy, Udacity, Coursera, EdX, etc.--and I find Code School's materials
to be among the most accessible and polished as an overall web experience.
Everything from the videos to the slides and exercises is well done and
relevant and, most importantly, explained and laid out in a way that
makes it very easy to understand. They also have a good hint system that keeps you
from getting stuck too long. A+

The best introductory jQuery course I've seen on the web. The quality of
materials on Code School has been top notch from what I've seen so far.
Excellent, easy to understand videos and interactive challenges that flow
right out of the videos with a great hint system and very few bugs. I would
recommend that you know at least a little HTML, CSS and JavaScript before
taking this course.

I believe the first running of the class was in late 2012, so the content is
still quite current. The course lasts 12 weeks and walks you through a wide
range of major topics related to how computer networks work, from the physical layer
of sending signals on wires or through the air to network security and quality
of service. The class provides insight into how many things you likely use
every day actually work, like Ethernet, Wi-Fi, routers, switches, hubs, virtual
private networks, content distribution networks, peer-to-peer services, and of
course, the domain name system and the Internet itself. The lectures go into a
fair bit of technical detail about how different aspects of computer networks
function. In some cases the extra detail is enlightening; in others it can get a bit
tedious. Overall, the class was definitely worth taking, even though it does
not require any programming. I'd recommend this course to anyone that wants to
learn how computer networks work in more depth than you'd gain in your
everyday life as a web user.

Design of Computer Programs is an awesome class for a novice to intermediate
Python programmer to learn some new tools and techniques. The biggest problem
with this course is that Udacity originally promoted it as being the next step
after CS 101. As a result, the initial offering of the class had a ton of
newbies who quickly got lost in the new topics that Prof Norvig introduces in
relatively rapid succession. Udacity has since recategorized the class as
“advanced.” I’m not sure I’d necessarily call it advanced, but you should
probably have more than just an intro course under your belt before attempting
it unless you are willing to work hard and slowly. You will do a lot of
programming in this course, mostly in the context of creating and solving
games like Scrabble, Boggle and poker. Topics and techniques covered include
Python list comprehensions, generators, decorators, tuple unpacking, lambda
expressions, regular expressions, testing, profiling and optimization. If that
sounds like a lot to cover in a 7-week course, you’re right. This course is a
lot of work and covers a lot of ground, but it will also teach you a lot.

Udacity's intro CS class is one of the best CS intros on the web. I've taken
MIT 6.00, Harvard CS50, gone through Coursera and LearnStreet intro courses
and I'd say this one is the best in terms of actually learning how to program.
The format of short instructional videos and quizzes on Udacity is the best
format for learning CS on the web, when executed well (other than
building/researching things yourself). It should be noted that this course
focuses mainly on learning Python and not on the theory of CS. I think for an
intro CS class it’s okay to focus more on gaining confidence with the basic
nuts and bolts of a language than actually getting into the nitty-gritty of CS
itself. Intro courses offered by universities get more into CS theory, but
spend less time on teaching you how to actually program, which can make them a
bit frustrating and leave students feeling like they have to educate
themselves on the programming side of things. This course is entirely self-
contained: you don't need to go anywhere else or learn on your own to get
through it. It also doesn't take too long to complete, so it is a perfect
precursor to more theory-heavy classes that don't spend enough time on the
nuts and bolts.

Neural Networks and Deep Learning is the first course in a new deep learning specialization offered on Coursera and taught by Coursera co-founder Andrew Ng. The 4-week course covers the basics of neural networks and how to implement them in code using Python and numpy. The course page states that it only requires basic Python programming knowledge, although any experience you have with machine learning, linear algebra and calculus will be helpful with gaining a deeper understanding of the material. You can access the quizzes and programming assignments without paying for the full course, but if you want to submit them for grading and get credit as having completed the course, you have to pay for the certificate.
Neural Networks and Deep Learning starts with a short introduction to deep learning in week 1, followed by 3 full weeks that build your understanding of neural networks by starting with logistic regression implemented with the same structure as a neural net in week 2, shallow nets in week 3 and deep nets in week 4. Key topics include computational graphs and derivatives on graphs, gradient descent, vectorizing code, neural network representations, activation functions, backpropagation and deep nets. The course touches on high level concepts and considerations to frame learning, but the majority of the content focuses on the low-level nuts and bolts of neural network structure and how to translate it into code.
Each week after the first has roughly 1-2 hours of lecture split up into 5 to 15 minute video segments. In each segment, Andrew Ng appears on screen and gives a brief overview of what the video is going to cover and then he discusses the topic with voice-overs while writing on white slides, followed by a brief outro where he reappears and summarizes key takeaways. There is a lot of handwritten information and notation in the lectures, which means some students may find certain lectures difficult (or boring) to follow, but he explains things very well and the notation is there to help you gain a concrete understanding of the structure of neural nets and prepare you for working with them in the programming assignments. The production value of the videos is fairly low as the intros and outros seem to be recorded with a non-widescreen SD camera and the vast majority of content is simply Ng writing on mostly blank slides. The production style is reminiscent of his original machine learning MOOC which was released back in 2012. Still, the logical organization of the content combined with Ng's masterful knowledge and lucid explanations means the relatively rudimentary production doesn't detract from the course's value. Weeks 1-3 also include optional guest lectures with different "heroes of deep learning."
The programming assignments in Neural Networks and Deep Learning are very well done, providing great instructions, explanations and examples. You can access all of the assignments as a freeware student, so even though the course won't be listed as completed when you finish, you can still work through them and learn all the same things as paying students. The assignments are heavily structured, giving students complete code skeletons of all required functions and only requiring students to implement specific key lines of code which are described in detail. In other words, most of the difficulty in implementing neural nets--such as the logic and structure of the code and aligning matrix dimensions--is taken care of for you so you don't need to be a strong programmer to complete the assignments. This keeps the assignments moving along at a nice pace and should help keep students from getting stuck for too long. While you may struggle to implement neural nets from scratch yourself after completing this course, it shows you the tools you would need to do it. And perhaps more importantly, it gives you insight into how neural nets are working under the hood, which is good to know even if you end up using a package to build them.
Neural Networks and Deep Learning is the best introductory course on neural networks on any of the main MOOC platforms, and it is accessible to about as broad a group of students as possible given the nature of the material. The course isn't perfect: notation-heavy videos can get tedious and it sometimes eschews mathematical details. It also makes a few questionable decisions such as putting a 40-minute interview of Geoffrey Hinton at the end of the first week, most of which you will not understand unless you've seen neural networks before and have familiarity with his work. That said, if you want to learn about neural networks and how to make them in code, this is the right place to start.
I give Neural Networks and Deep Learning 5 stars out of 5: Excellent.

Machine Learning: Clustering & Retrieval is the fourth course in the University of Washington's 6-part machine learning specialization on Coursera. The 6-week course covers several popular techniques for grouping unlabeled data and retrieving items similar to items of interest. After a short intro in week 1, the course covers k-nearest neighbor search, k-means clustering, Gaussian mixture models, latent Dirichlet allocation and hierarchical clustering. It is recommended that you complete the first 3 courses in the specialization track before taking this course, but you could take it as a standalone course as long as you know a bit of Python and probability. Grading is based on a series of comprehension quizzes and labs, but you must pay for a verified certificate to gain access to graded assignments. Thankfully you can still download and complete the labs without doing the associated quizzes, so you won't miss too much as a freeware student.
Clustering and Retrieval has a good balance of lecture content and labs that illustrate concepts covered in lecture. The professor is easy to understand and the lecture slides are well done. The course generally has good pacing and devotes plenty of time to each of the main weekly topics, taking care to explain important considerations like different algorithmic approaches to each method and similarities between different techniques. It does, however, go off on a couple of tangents, introducing MapReduce and hidden Markov models, neither of which is covered in much detail or addressed in the labs.
The labs use a data set of Wikipedia articles about famous people as an example to illustrate clustering and retrieval. Using the same data set for multiple labs is always a good idea because it lets students focus on the techniques themselves instead of having to familiarize themselves with new data. The amount of actual coding you have to do in the labs is minimal. The labs are more like interactive explorations of machine learning techniques with occasional one-line fill in the blanks than full-on coding assignments. You'll spend more time reading text, running provided code and analyzing results than writing code yourself. You can look at and answer the lab quiz questions as you go along but you can't actually submit them and get graded feedback without joining the verified track.
Machine Learning: Clustering & Retrieval is a great course that covers many of the most common clustering techniques with adequate depth while remaining accessible. Although the coding required is minimal, it is not an easy course: some of the concepts may take a couple of watch-throughs to sink in and you may struggle with certain concepts if you don't have prior knowledge of probability. Aside from the need to pay to gain access to graded quizzes and a few topics that felt tacked on, there's not much to dislike about this course.
I give Machine Learning: Clustering & Retrieval 4.5 out of 5 stars: Great.

Machine Learning: Classification is the third course in the 6-part machine learning specialization offered by the University of Washington on the Coursera MOOC platform. The first two weeks of the 7-week course discuss classification in general, logistic regression and controlling overfitting with regularization. Weeks 3 and 4 cover decision trees, methods to control overfitting in tree models and handling missing data. Week 5 discusses boosting as an ensemble learning method in the context of decision trees. Weeks 6 and 7 cover precision and recall as alternatives to accuracy for assessing model performance and stochastic gradient ascent to make models scalable.
The course builds on the concepts covered in Machine Learning: Regression, so it is highly recommended that you take it first. Assignments use GraphLab, a Python package that requires the 64-bit version of Python 2.7. You can technically complete the course with whatever language and tools you like, but using Python and GraphLab will make your life much easier because the assignments are designed around it. Like the previous course, basic knowledge of Python, derivatives and matrices is recommended, but the course doesn't get too deep into math. Grading is based on weekly quizzes and programming assignments.
Machine Learning: Classification follows in the footsteps of the regression course, offering a good mix of high quality instructional videos and illustrative programming assignments. Carlos Guestrin takes the reins in the course (Emily Fox, the professor for the regression course, does not make an appearance) but the presentation format and style are mostly unchanged: videos break topics down into well-organized and digestible 1 to 7 minute chunks. The slides are crisp and generally uncluttered. Some of the most complicated sections are optional, so you can skip them without it affecting your performance on the programming assignments and quizzes.
The programming assignments are provided in Jupyter notebooks--interactive text and code documents that run in your browser. They do a good job illustrating the concepts and walking you through the process of implementing machine learning algorithms. Although the course claims that you'll be implementing algorithms yourself from scratch, they provide a ton of setup, support and skeleton code: you don't need to define a single function yourself. Instead, you follow along with instructions and fill in key pieces of code in the bodies of certain pre-defined functions to get things working. Essentially every line of code you need to write has a comment giving you the gist of what you are supposed to do. Some may not appreciate this degree of hand-holding, but it keeps the assignments moving along steadily and puts the focus on learning and understanding concepts rather than coding details and debugging.
My only major gripe with this course is with some of the decisions concerning which topics to cover. The course mentions random forest models briefly at the end of the section on boosting, but the topic warrants a little more detail. A single 5-8 minute video would have been enough. The course does not mention support vector machines at all. The professor stated in the forums that he may release some videos on SVMs in the future but they were not included at launch since they are more complicated than other models and do not scale well to large datasets. The section on decision trees only discusses misclassification error as a metric for splitting, failing to mention information gain or Gini impurity, which are often preferred in practice. Similarly, the boosting section focuses on AdaBoost, while stochastic gradient boosting and XGBoost in particular are often more successful in practice. The final week's title "scaling to huge data sets and online learning" is a little misleading because it only really covers stochastic gradient ascent and mini-batch gradient ascent.
Machine Learning: Classification is a great first course for learning about classification that benefits from good organization and illustrative programming assignments. The course does, however, eschew some important topics in favor of simplicity; including a few more optional videos covering these topics would give the course the breadth and depth advanced learners desire without harming its accessibility.
I give Machine Learning: Classification 4.5 out of 5 stars: Great.

Udacity's "Deep Learning" is a 4-lesson data science course built by Google that covers artificial neural networks. The first lesson builds up some machine learning background on classification problems, while lesson 2 discusses the basic machinery of neural networks and deep learning (neural networks with multiple layers). Lesson 3 covers convolutional networks for image recognition and lesson 4 covers recurrent networks and issues dealing with text data. This course assumes you have intermediate Python programming experience and basic knowledge of machine learning, statistics, linear algebra and calculus.
Each lesson in the course consists of a series of short video lecture segments with occasional comprehension questions and breaks to apply topics discussed in programming assignments. The video quality itself is good and the lecture quality is adequate, but the lecture segments are very brief, with most lasting around a minute or less. The sum total of the video content in the third lesson on convnets is less than 15 minutes. The programming assignments, which use a popular neural network library called TensorFlow, are lacking in instruction and involve either running large chunks of provided code or working on open-ended questions. You likely won't be able to make much progress on the assignments without prior knowledge of machine learning and TensorFlow or doing a lot of extra research outside of the course materials. The programming problems also require significant computing resources; my laptop with 8GB of RAM ran out of memory when running the provided code in the first assignment.
Deep Learning is a shallow course that is akin to reading CliffsNotes instead of a textbook: you'll learn some terminology and be exposed to some interesting concepts but its abbreviated coverage is likely to confuse students who are new to neural networks while leaving more experienced students unsatisfied. This course seems like a rushed attempt to capitalize on the hottest buzzword in the hottest tech industry, which is a shame because it could have been a good course if it took the time to cover the topics in adequate detail.
I give Deep Learning 2 out of 5 stars: Disappointing.
*If you're interested in learning about the topics this course introduces in much more depth, check out the video lectures and course materials for CS231n, a deep learning course focused on image recognition offered by Stanford.

Managing Big Data with MySQL is the fourth and final course in Duke University's Excel to MySQL: Analytic Techniques for Business specialization offered through Coursera. The 5-week course focuses on teaching students how to make relational database queries. Unlike some database courses that delve into details concerning database construction and theory, this course is all about the practical use of databases from the perspective of a business analyst. The first week introduces the concept of relational databases, entity relationship diagrams and schema, while the remainder of the course covers querying from simple select statements to summary functions, grouping, joins and subqueries. You don't need any particular background to take this course and it could be taken in isolation from the rest of the specialization. Grading is based on 4 week-end multiple-choice quizzes.
Weekly course content is divided into several lessons that typically involve watching a short video segment and then working through an exercise set in MySQL or Teradata, two relational databases used in the course. The lecture content is high quality but after the first week, you'll be spending most of your time working on exercises rather than watching videos. In fact, some lessons don't have video lectures at all: the written exercises are really the core of the course. The MySQL exercises are contained in Jupyter notebooks--interactive text and code documents--that let you read instructions and play around with code in the same place. The exercises provide plenty of opportunity to drill SQL queries and build SQL vocabulary. The answers to exercise questions are provided in PDFs (they are ungraded), which means you can skip ahead if you don't need more practice. Considering each week after the first has at least 3 exercise sets plus a quiz, each of which could take a few hours to complete in its entirety, consulting the answer keys frequently is recommended to keep things moving along at a reasonable pace.
At the end of each week after the first you'll do a final exercise set using Teradata and answer multiple choice quiz questions based on your results. You use the same real-world data set for each quiz--product information from Dillard's department stores--helping you build some familiarity with the data by the end of the course. The final week of the course doesn't cover any new material: it just contains the final quiz.
Managing Big Data with MySQL is a great course for learning practical relational database querying skills with plenty of exercises that let you interact with real-life data sets. The focus on drilling ungraded exercises combined with sparing use of lectures after the first week does, however, make the course feel impersonal. It plays out more like a collection of training materials than the sort of university-style course you may expect from Coursera.
I give Managing Big Data with MySQL 4.5 out of 5 stars: Great.

Managing Data Analysis is the third course in the “Executive Data Science” specialization offered by Johns Hopkins University on Coursera. The one-week course discusses the process of data analysis at a high level from formulating questions to exploratory analysis, inference, modeling and communicating results. Grading is based on several short comprehension quizzes.
The lectures in Managing Data Analysis are of good quality and the instructor is generally easy to understand. The lectures do, however, use some jargon and concepts that aren’t always adequately explained. Unlike the first two courses of the specialization, which are geared toward managers, this course is more geared toward people who are actually going to be conducting data analysis. The concepts in this course are definitely important for data science managers to understand, but non-technical students may find this to be a jarring change of pace. In addition, certain parts may be confusing if you have had no prior exposure to statistics or machine learning other than the first two courses of this specialization.
Managing Data Analysis provides a useful overview of the process of data analysis, but it is taught at a level appropriate for data analysts. “The Data Analysis Process” would be a more appropriate name for this course.
I give Managing Data Analysis 3.5 out of 5 stars: Good.

A Crash Course in Data Science is a succinct, one-week overview of the field of data science produced by the same team from Johns Hopkins University that produced Coursera’s data science specialization. It is the first course in the “Executive Data Science” specialization, a data science track aimed at non-technical people like business managers. The course defines data science and then discusses different aspects of data science like statistics, machine learning and the structure, output and success metrics for data science projects. Grading is based on a handful of short multiple-choice comprehension quizzes.
A Crash Course in Data Science is good for what it is: a brief overview of a field taught at a high level so that anyone can follow along. The professors have plenty of face time, explain concepts well and the video quality is good. The content quality is a definite step up from the original Johns Hopkins data science track.
The only real knock against this course is its brevity and the fact that it costs the full $49 to get a verified certificate if you want to complete the specialization. A course that you can complete in an hour or two should not cost the same as a month-long course. Students looking to dig their teeth into something substantial for the first month of the Executive Data Science specialization may be disappointed.
A Crash Course in Data Science is a well-made primer on the data science field, but its brevity may leave paying students wanting.
For freeware students I give this course 4 out of 5 stars: Very Good.

Data Science in Real Life is the fourth and final course in the “Executive Data Science” specialization offered by Johns Hopkins University on Coursera. The one-week course examines various steps in the data analysis process and contrasts ideal outcomes against the outcomes you are likely to experience in reality. Grading is based upon a few short multiple-choice quizzes.
The lecture videos are crisp and the professor does a good job explaining the topics without being overly technical. The course does discuss some topics that you won’t fully appreciate without having hands-on experience doing data science projects, but it will help prepare you for some of the problems you might encounter. Like other courses in the Executive Data Science track, there’s not too much to dislike about this course other than its brevity and the limited depth at which topics can be covered in a one-week course.
Data Science in Real Life is a nice, succinct overview of many of the challenges you are likely to face in data projects and suggestions for overcoming them. It raises considerations that could be useful for both data analysts and managers.
I give Data Science in Real Life 4 out of 5 stars: Very Good.

Building a Data Science Team is the second course in the “Executive Data Science” specialization offered by Johns Hopkins University on Coursera. It is a one-week course that defines the different data science roles in an organization, what to look for in data scientists and strategies for managing and communicating with data scientists. The course has no prerequisites and grading is based on a handful of multiple-choice quizzes.
The content in Building a Data Science Team is similar to the first course in the specialization: it is geared toward non-technical people who have to manage data scientists. The video quality is good and the instructor is personable, easy to understand and knowledgeable. There’s not too much to dislike about this course apart from its brevity. All of the courses in the Executive Data Science track are only a week long, so they can be completed in one or two learning sessions. This is not necessarily a bad thing: I find it refreshing to get a high-level overview of a topic in a short course, but it may not deliver the amount of content that paying students expect.
Building a Data Science Team is a good course for what it is: a succinct primer on how to assemble and manage a data science team.
I give this course 4 out of 5 stars: Very Good.

Foundations of Strategic Business Analytics is the first course in the “Strategic Business Analytics Specialization” offered by ESSEC business school on Coursera. The 4-week course covers data analysis topics including clustering, exploring relationships between variables, forecasting and communicating results. All discussion is geared toward a business context, so the focus is on producing clear, actionable insight instead of looking at low-level details. The course uses the R programming language for analysis; basic familiarity with R is assumed. Grading is based on 3 quizzes and a peer-graded assignment.
Each week consists of two main content sections: a lecture section that introduces concepts and data analysis techniques and then a recital section that teaches you how to use the methods discussed in lecture in R. The lecture videos themselves are polished with nice text graphics. The lecturer’s English takes a little time to get used to, but he speaks clearly and he does a good job framing each topic in the context of business. The programming recitals are easy to follow and let you get some hands-on experience with lecture topics right away.
Foundations of Strategic Business Analytics is a nice introduction to thinking about data analytics in a business setting, but it is too short. Follow-up courses will hopefully let you dig your teeth deeper into the material. Also note that the specialization is listed as “Advanced”, but this course is not very technical and only really requires basic R knowledge as a prerequisite.
I give Foundations of Strategic Business Analytics 3 out of 5 stars: Okay.

Machine Learning Foundations: A Case Study Approach is a 6-week introductory machine learning course offered by the University of Washington on Coursera. It is the first course in a 5-part Machine Learning specialization. The course provides a broad overview of key areas in machine learning, including regression, classification, clustering, recommender systems and deep learning, using short programming case studies as examples. The course assumes basic Python programming skills and it uses a software package called GraphLab that requires a 64-bit operating system running Python 2.7. Grades are based on periodic comprehension quizzes and short programming assignments.
The course covers a broad range of machine learning topics at a high level with the promise of drilling down into the details in future courses in the specialization. The lecturers have good chemistry, but they tend to get distracted when they are on screen together. The video and slide quality are very good and although the delivery is a little rough around the edges at times, the lectures are informative. The machine learning methods covered aren’t necessarily treated as complete black boxes, but the course intentionally avoids getting too deep into the details, putting the emphasis on conceptual understanding.
The weekly labs are contained in short IPython Notebooks (interactive text and code documents rendered in a web browser) that illustrate some simple models in GraphLab. The labs themselves are easy and don’t require much coding other than calling various built-in GraphLab functions. The hardest part about the class is getting your programming environment set up in the first place. If you don’t have a new version of 64-bit Python 2.7, you can’t run GraphLab. It is relatively easy to get set up if you can use the recommended Anaconda Python distribution, but getting things set up manually on an existing Python installation may prove troublesome. The instructors provided some workarounds for doing the course without GraphLab or using GraphLab on Amazon’s cloud computing service; I wouldn’t take the course without getting GraphLab working in some form. Many students decried the use of a non-open source package for an open class; I think it is useful to be exposed to new tools and GraphLab seems cleaner than Python’s popular scikit-learn package. In this sort of course, the focus should be on concepts rather than syntax.
Machine Learning Foundations: A Case Study Approach achieves its goal of introducing machine learning at a high level without rushing or trying to cram too much into any particular week. What the professors lack in terms of polish they make up for with enthusiasm. Compatibility and setup issues will be a roadblock for some, but overcoming them is worth it.
I give Machine Learning Foundations: A Case Study Approach 4.5 out of 5 stars: Great.

Data Visualization and Communication with Tableau is the third course in Duke University's "Excel to MySQL: Analytic Techniques for Business" specialization offered on Coursera. The 5-week course is essentially an introduction to Tableau (weeks 2 and 3) book-ended by some lectures on considerations and best practices for communicating data insights in a business setting (weeks 1 and 4). The final week is devoted to a peer-reviewed assignment and has no new lecture content. The course provides you with a free temporary license for the desktop version of Tableau. You can get through this course without any background knowledge, although some knowledge of MS Excel will help you appreciate some of the comparisons it makes. Grading is based on 4 weekly quizzes and a peer-graded assignment.
Data Visualization has quality lectures that do a good job introducing Tableau in the context of creating visualizations for a business setting. The Tableau walkthroughs are easy to follow and give you an appreciation for how much easier it is to make nice visualizations in Tableau than it is in Excel. You use the same data sets for the entire course, one data set for walkthroughs and one for homework assignments, which provides a nice sense of consistency. Weeks 1 and 4 raise some useful considerations to keep in mind when preparing for and presenting a data analysis, but the Tableau sections in weeks 2 and 3 are the heart of the course. I would have preferred more content covering the ins and outs of Tableau instead of the 2 weeks spent on communication topics, but the mix is probably about right for business-oriented students.
I give Data Visualization and Communication with Tableau 4 out of 5 stars: very good.

Data Science and Machine Learning Essentials is a 5-week introductory data science course offered by Microsoft through edX that focuses on teaching students how to use Microsoft's cloud-based machine learning platform, Azure ML. The course divides content into two tracks, an R track and a Python track, so you can complete the course with either language, but you'll need to know the basics of at least one of the two. Grading is based on 5 weekly reviews and a single 20 question exam.
The course title "Data Science and Machine Learning Essentials" is misleading because this course is not really about data science or machine learning per se. The first week attempts to cram an entire machine learning course or two worth of concepts into a handful of mediocre lectures, while the remainder of the course is all about Azure ML. Weeks 2-5 provide a nice overview of Azure ML and the fact that it has full lectures for both R and Python is a great feature that surely took a lot of extra time and effort to produce. The main lecturer's presentation skills aren't the best, but the videos are still easy to follow. Azure ML offers a lot of interesting functionality, like the ability to use Python and R scripts in the same project and publish projects as web services, but some of the exercises were tedious and ran slowly.
If "Data Science and Machine Learning Essentials" were renamed "Intro to Azure ML" and only included the content in weeks 2-5, it would be a good course. Weeks 2-5 are definitely worth checking out if you are interested in Azure ML. As it stands now, however, the first week bombards students with far too many concepts explained too quickly to foster real understanding and sets the wrong expectations for the remainder of the course.
I give Data Science and Machine Learning Essentials 2.75 out of 5: mediocre.

Excel for Data Analysis and Visualization is an intermediate-level course offered by Microsoft through the edX platform that covers cutting-edge techniques for gathering, transforming and viewing data in Excel. The course focuses on getting students up to speed with new features and techniques offered in Excel 2016, such as the Excel data model, queries, DAX (a syntax for defining functions) and Power BI, an online productivity service that integrates with Excel. This course assumes you have some familiarity with MS Excel, particularly pivot tables and slicers. You can complete the course with Excel 2010 or 2013, but if you don't have Excel 2016 you'll have to download add-ins and work slightly harder to complete the assignments. Grading is based on 7 weekly labs and 12 comprehension quizzes.
Weekly content in DAT206x consists of one to three short video lectures describing new Excel features followed by a comprehension quiz. The amount of video content per week is usually under 30 minutes, so you shouldn't need to commit more than an hour or two a week to complete the course. The lecture videos have adequate resolution to see cell values and the lecturer's presentation is easy to follow. Weeks 1-7 have lab assignments that let you apply the techniques presented in lecture. You only get a couple of submissions for most lab and quiz questions, but most questions are not too difficult.
Excel for Data Analysis and Visualization is a succinct, informative course on new Excel features that is worth checking out for those interested in going beyond the basics. Using Excel 2016 for this course when it launched only a few months before the course debuted may partially be a ploy to convince Excel users to upgrade, but I can't fault Microsoft for teaching with the latest version of their own product, and I completed the course with Excel 2010 without much difficulty.
I give Excel for Data Analysis and Visualization 4 out of 5 stars: very good.

Data Visualization is the fifth and final course in the data mining specialization offered by the University of Illinois at Urbana-Champaign on Coursera. The 4-week course provides a high-level overview of data visualization, covering topics like human visual perception, basic plotting constructs and design principles, visualizing networks and visualizing databases. The course doesn’t have any particular prerequisites, but knowing how to make plots with some software package or programming language will be helpful for the assignments. Grading is based on two quizzes and two peer-graded visualization projects.
The lecture content in Data Visualization is better than the lectures of the previous courses in the data mining specialization. The instructor is easy to understand and there isn’t as much dense technical content to absorb. On the downside, since the course focuses on high-level concepts, you won’t learn how to actually construct your own visualizations. It’s up to you to pick out software and figure out how to make visualizations with it. It would have been preferable for the entire data mining specialization to pick a programming language and stick with it throughout to pair concepts with specific implementations and exercises.
Data Visualization is a nice introduction to visualization at a high level, but the lack of low-level technical instruction and exercises limits its practical usefulness, especially for students who don’t already know how to create their own visualizations. The course is a relatively smooth end to what is otherwise a rocky specialization, but since the content has no real connection to the other courses in the data mining track, you could take it as a standalone course.
I give Data Visualization 3 out of 5 stars: Fair.

Statistics for Business I is a spreadsheet-focused statistics course offered by the Indian Institute of Management, Bangalore through the edX platform. The course spans 5 weeks including 4 weekly lessons and one week for a final exam. Course topics include descriptive statistics, variable summaries, the shape of distributions and probability. The course has no prerequisites other than having access to Microsoft Excel. You may be able to get by with a free alternative like LibreOffice, but the course lectures use Excel. Grading is based on lecture comprehension questions, exercises, caselets and a final exam.
Weekly content in Statistics for Business I consists of a series of relatively short lectures interspersed with comprehension questions, followed by several exercises and caselets to let students apply what they’ve learned. The lectures themselves are well-made and strike a good balance between instructor face time and showing spreadsheet operations. The lead instructor, Shankar, is easy to understand and has some lighthearted yet instructive interactions with his brainy assistant Lysa (she’s a plastic brain that sits on his desk). Each week has a ton of comprehension questions and exercises to let students get practice with the spreadsheet operations and concepts presented in lecture. Hands-on practice is essential for skill building, so having plenty of exercises is a good thing.
Statistics for Business I starts out slow, but the pace picks up toward the final lessons. Some students might feel that the last couple of lessons cover too many concepts in one week. Although having plenty of exercises is generally a good thing, the large number of easy, repetitive exercises grew tiresome. The course might benefit from making some of the exercises optional so that students who need more practice can get it, while those who don’t can skip ahead.
Statistics for Business I is a good course for learning how to deal with numbers in Excel, but the large number of graded exercises can make things tedious at times. This course is best suited for beginners in statistics with basic knowledge of spreadsheets and those who know some statistics and want more experience using Excel. Statistics for Business II is set to launch in October 2015.
I give Statistics for Business I 4 out of 5 stars: Very good.

Scalable Machine Learning is a 5-week distributed machine learning course offered by UC Berkeley through the edX platform. It is a follow-up to another UC Berkeley course: Introduction to Big Data with Apache Spark. Although the first course is not a strict prerequisite, Scalable Machine Learning uses the same virtual machine and even has some overlap with the homework labs, so it is beneficial to take Introduction to Big Data first. Scalable Machine Learning teaches distributed machine learning basics using PySpark, Apache Spark’s Python API. Basic proficiency with Python is necessary to pass the course and some exposure to algorithms and machine learning concepts is helpful. Course evaluation is based primarily on 5 labs distributed as IPython notebooks.
The first two weeks of the course cover machine learning basics and introduce Apache Spark. For students already familiar with machine learning basics who took Introduction to Big Data, there’s not much new to learn during the first two weeks. Week 2 is essentially an exact clone of week 2 of the intro to big data course, including the lab assignment. The final 3 weeks have meatier lecture content and longer labs, each covering a different machine learning technique--linear regression, logistic regression and principal component analysis.
The lecture content is clean and the lecturer speaks clearly. His delivery isn’t perfect, but the only real purpose of the lectures is to serve as background information for the meat of the course: the labs. Each lab is a lengthy IPython notebook with several sections leading you through the process of creating a pipeline for running a machine learning algorithm with PySpark. Much of the code you need is provided for you, but writing the key functions and data transformations necessary to complete the labs can still be time consuming. Little things like an ambiguous instruction or an uncaught error you made earlier in the assignment can result in bugs that take a while to squash. Despite occasional frustrations, the labs do a good job interspersing instruction with practical, hands-on learning.
Scalable Machine Learning is a quality introduction to machine learning with PySpark that focuses on labs over lectures. The lectures could be better and some of the instructions and error checks in the labs could be more comprehensive, but this is a great course for those looking to learn by doing.
I give Scalable Machine Learning 4 out of 5 stars: Very Good.

CS100.1x Introduction to Big Data with Apache Spark is a 5-week intro to distributed computing offered by UC Berkeley through the edX MOOC platform focused on teaching students how to perform large-scale computation using Apache Spark. The assignments use PySpark, Spark’s Python API, so some familiarity with Python programming is necessary. You don’t need prior exposure to big data or distributed computing to take the course. Grades are based on four programming labs (80%), easy comprehension questions that allow unlimited attempts (12%) and setup of the course virtual machine used to complete the labs (8%).
Course lectures in Introduction to Big Data with Apache Spark are relatively brief and tend to stay at a high level, discussing general big data concepts rather than the details of Apache Spark. The instructor does a fine job in the few lectures the course offers, but there were not enough of them and they often felt disconnected from the assignments. The fifth week had no lectures.
The labs are the core of this course. While you can breeze through weekly lectures in half an hour or less, each of the four labs is a lengthy reading and programming assignment packaged in an IPython notebook. Expect to spend 2 to 4 hours on labs 1, 2 and 4 and 3 to 6 hours on lab 3. The labs start by teaching basic Apache Spark manipulations and move on to some text analysis and machine learning. Using the IPython notebook to deliver labs is a convenient way to intermingle text and instructions with code. On the other hand, each exercise tends to depend on code executed somewhere above it, so a mistake made on an earlier exercise can lead to some odd errors later on and Spark’s error traces aren’t particularly helpful. The course does provide some basic tests for each exercise, but it is easy to arrive at solutions that pass the checks but cause errors later on. The course forums on Piazza are a vital resource for troubleshooting and disambiguation; I imagine some of the snags will be resolved in future offerings. Despite the occasional hiccups, the labs do a good job familiarizing students with Apache Spark’s Resilient Distributed Dataset objects and the various transformations and actions you can perform with them.
Introduction to Big Data with Apache Spark is a great place to start learning about distributed computing if you know some Python. Although the lectures don’t add much technical depth to the course, they provide some big picture background that will be useful for students who have little prior exposure to big data concepts. The labs give you adequate opportunity to get your hands dirty with Apache Spark and gain basic familiarity with the data manipulations it offers. UC Berkeley is offering a follow-up course, “Scalable Machine Learning”, that builds on the foundation laid in CS100.1x.
I give this course 4 out of 5 stars: Very Good.

6.041x: Introduction to Probability - The Science of Uncertainty is a comprehensive 16-week introduction to probability offered by MIT through the edX MOOC platform. Although this course is dubbed an “introduction” it is not easy. You need familiarity with differential and integral calculus to understand some of the material, and the course can easily take 10-15 hours per week. Given its 16-week duration, the time commitment required to get through everything is much higher than the average MOOC. The course touches on all the major topics you need to gain a solid understanding of probability including basic axioms of probability, conditional probability and independence, discrete and continuous random variables, Bayesian inference and the probabilistic underpinnings of classical statistics. The course grade is based on lecture comprehension questions, weekly homework assignments, 2 midterms and 1 final exam. The midterms are worth 15% apiece and the final is worth 30% so good performance on the exams is paramount to getting a good score. You need a total of 60% to pass and it isn't quite as easy to achieve that mark as it is in most MOOCs.
Weekly content consists of 2-4 lecture sequences covering different aspects of a particular topic in probability. Each lecture sequence contains about an hour of video in 5 to 15 minute segments and most video segments are followed by graded comprehension questions. The lecture videos themselves are crisp and the professor is good at explaining the material at a pace that doesn't overload you with too much information too quickly. There can be quite a bit of mathematical notation on the screen at times, but it is well-organized. Each week also has a series of solved problem videos where TAs walk you through applying the material in lecture to problems that are similar to those you will see in the homework. The solved problems sections add another 1 to 2 hours of video content per week.
Pure math courses usually aren't that fun because they spend a lot of time dealing with proofs and theory and not so much time dealing with the real world. This course can be a slog at times because it is long and there is a lot to absorb and remember, but after building up the basic tools of probability in the first few weeks, later weeks focus on more interesting extensions and applications. You won’t find another intro to probability with greater depth and breadth. This course is best suited for technical and math-minded people who will have to work with and apply probability in future coursework or in their professional lives. If you're looking for an intro that just gets you up to speed on the rudiments of everyday probability like coin flipping and dice rolling, this course is overkill.
6.041x: Introduction to Probability is a great course for those serious about forming a solid foundation in probability. As professor Tsitsiklis states early on, "the first step in fighting an enemy like randomness is to study and understand your enemy." At the end of this course you will be armed with the tools necessary to wage a well-reasoned war against uncertainty.
I give 6.041x: Introduction to Probability 5 out of 5 stars: Excellent.

Applications of Linear Algebra Part 2 is the second part of an introductory linear algebra course offered by Davidson College through the edX MOOC platform. The course spans 6 units and runs for 6 weeks, but all the lecture content and activities are available as soon as the course opens. The topics presented in part 2 build on the foundation laid in part 1 and include: least squares, correlation, eigenvectors, singular value decomposition, Markov chains, principal component analysis and sports prediction.
Applications of Linear Algebra Part 2 follows the same pattern as part 1: each week consists of 2 to 3 short lectures, each with a corresponding activity that illustrates an application of the topic covered in lecture. This formula worked well in part 1 because the topics were relatively simple and the activities were provided via basic web apps. In part 2, the concepts are more complicated--too complicated for students to develop a solid understanding of them after one short lecture video. In addition, most of the activities in part 2 require running code in MATLAB. The course provides a free MATLAB license and tutorial videos, but it takes more effort to jump into activities. On the plus side, once you get them up and running, the applications in part 2 are even more interesting and fun to play with than the activities in part 1.
Professor Chartier is personable and engaging in the lectures despite following a prompter/script. Although his voice is clear, he spends a bit too much time reading off the numeric contents of matrices, when it would be more instructive to have the matrices and other information on screen in persistent slides. Given the complexity of the material and brevity of the lectures, students aren't likely to fully understand the math unless they have taken a course in linear algebra before. I suspect the lectures are going to leave a lot of students scratching their heads. It might have been wiser for the course not to purport to teach all the math behind the applications, but instead give a general overview of concepts before each activity and provide resources/references for students to learn about the math in greater detail. I don't normally advocate hand-waving, but as a course prioritizing applications over mathematical understanding, there are some instances where it may have been warranted.
Overall, Applications of Linear Algebra Part 2 is another solid course that has a lot of interesting activities, but it is not as approachable as part 1 and tends to rush through complicated topics to get to interesting applications.
I give Applications of Linear Algebra Part 2 a score of 4 out of 5 stars: Very Good.

CS188.1x: Artificial Intelligence is an introductory AI course offered by UC Berkeley through the edX MOOC platform. CS188.1x covers roughly the first half of the material in the full on-campus AI course in the span of 12 weeks. Major course topics include search algorithms and heuristics, constraint satisfaction problems, Markov decision processes and reinforcement learning. The course assumes you have taken a first course in algorithms, are familiar with basic data structures, have basic Python programming skills and are comfortable with mathematical notation. There isn't any particularly hairy math, but there are a lot of variables and symbols flying around at times. Grading is based on weekly homework assignments that allow unlimited attempts, 3 programming projects and a final exam that allows 1 or 2 attempts per question.
CS188.1x is a direct adaptation of the on-campus AI course. The lecture videos are edited versions of lectures delivered on-campus but instead of seeing the professor, we mostly see the presentation slides themselves with a voice-over from the professor. Direct adaptations of on-campus courses don't always work so well with MOOCs, but this course pulls it off perfectly. The professor speaks clearly and explains topics well. The lecture slides are extremely well-made, with clean text and even a bunch of cute robot and pacman art to go along with the content. The videos are cut down into digestible 5 to 15 minute segments and there are practice comprehension questions following most of the videos that allow you to take a second to reflect and digest the content.
Many courses that have great presentation fall flat when it comes to assignments. This is not one of those courses. The three pacman-themed programming projects are among the best programming assignments I've encountered in any online course. Each project consists of several parts that involve implementing AI algorithms you study in class in the context of a pacman game. The course provides you with all the code you need to run the game, a variety of convenience functions and skeleton code that you have to fill in with algorithms that accomplish the prescribed tasks. The assignments can be frustrating at times, but seeing your code in action with a little pacman racing around gobbling food pellets and ghosts is surprisingly gratifying. It also helps you gain a better understanding of how the algorithms work.
Berkeley CS188.1x: Artificial Intelligence is one of the best MOOCs on the web. It is so good that many students on the forums were eager to take part 2. Unfortunately the professors haven't gotten around to adapting the second half of the full AI course into a MOOC (they did express the desire to do so in the future) but they will give you access to an archived version of the full course upon request.
I give Berkeley CS188.1x 5 out of 5 stars: Excellent.

Discrete Optimization is a quasi-self-paced programming course offered by the University of Melbourne through Coursera that is all about solving hard problems. Hard problems in the context of this course means NP-hard problems--problems for which no polynomial-time algorithms are known, so exact solutions can take exponential time in the worst case. The course differs from most classes on Coursera and elsewhere on the web in that all the materials are available as soon as the course opens, but there is a final deadline for the programming assignments, so it is not a self-paced course in the truest sense. The entire course grade is based on 5 programming assignments: the knapsack problem, graph coloring, traveling salesman, warehouse location and vehicle routing. An average score of 7 (out of 10) on each part of each programming assignment is required to earn a certificate.
Discrete Optimization opens with an introductory lecture series on the knapsack problem that lasts a couple of hours followed by three longer lecture series, covering constraint programming, local search and mixed integer programming. The lectures do not need to be viewed in any particular order. Similarly, students can work on the homework projects in any order they choose. This level of freedom is great for students who want to work ahead but it may make it difficult to complete the course if you don't plan ahead because the programming assignments can be very time consuming. The assignment skeleton and submission code are written in Python 2.7, but you can use other languages if you want.
The professor, Pascal Van Hentenryck, is extremely energetic and passionate about the subject. He makes the lecture videos surprisingly fun for such a dense subject. The lecture videos themselves are well-made and the professor does a good job explaining the material, although I sometimes felt like the course was trying to cover too many different topics and it wasn't always clear how one would go about applying the methods in lecture to the assignments or using them without using some external package or solver. A little more instruction and direction in that regard would be helpful.
Discrete Optimization is a challenging course with great programming assignments that introduces many different tools and leaves them on the table for you to play with. The tools don't always come with full instruction manuals, so you'll have to figure out many of the details yourself. You won't have time to apply every tool to every problem, but if you focus on one and budget your time well, you'll have a good shot at making it through.
I give Discrete Optimization 4 out of 5 stars: Very Good.

Text Retrieval and Search Engines is the second course in Coursera's new data mining specialization offered by the University of Illinois at Urbana-Champaign. The course covers a variety of topics in text data mining and natural language processing including text retrieval, query ranking and evaluation methods and the basics of recommender systems. Grading is based entirely on 4 weekly quizzes comprised of 10 multiple choice questions. You only get 1 attempt on the quizzes.
The weekly content in Text Retrieval and Search Engines consists of around 10 video lectures that range from 5 to 20 minutes followed by a short 10 question quiz. If that sounds like a lot of lecture per question, it is, and there are no in-lecture quizzes to reinforce concepts as you go along. The lectures themselves are definitely a step up from the first course in the specialization, Pattern Discovery in Data Mining. The professor isn't hard to understand this time around and he explains concepts well enough to grasp them without having to re-watch videos. As with many of Coursera's other 4-week specializations, however, lectures sometimes turn into information dumps where the professor ends up reading off slides. The course does have a C++ programming assignment which was nice to see.
Text Retrieval and Search Engines is a decent course that is worth a look if you are interested in text data mining and search engines. Although the lectures are lackluster, they have some good information. If you're planning on getting a verified certificate, it is a good idea to try the practice quizzes before submitting the real one.
I give this course 2.75 out of 5 stars: Fair.

Discrete Time Signals and Systems, Part 1: Time Domain is a 4-week
introduction to discrete time signals offered by Rice University through the
edX platform. This course was originally 8 weeks, but edX split it up into two
parts, one covering the time domain and one addressing the frequency domain.
Major course topics include signal properties, signals as vectors, linear
time-invariant systems and convolution. The course requires some linear
algebra and calculus (it has a pre-course assessment) as well as some basic
programming in MATLAB. You don't need to know any MATLAB going in, but if you
do you can skip the tutorial. Grading is based on a combination of
comprehension questions, homework quizzes, peer graded free responses and a
final exam. All of the course content other than assignments is available
immediately so you can work ahead if you want to. Discrete Time Signals and
Systems started around the same time as a similar signal processing course on
Coursera called "Digital Signal Processing." I found Discrete Time Signals to
be much more approachable than the Coursera course; it introduces concepts at
a steady but manageable pace and doesn't overload you with math right out of
the gate. The course isn't easy, but it isn't too difficult. The lecture
videos are well-done and the instruction is very good, although some videos
could stand to be broken up into multiple parts. Professor Baraniuk tends to
stutter, but it didn't really bother me or detract from the quality of the
instruction. The MATLAB programming questions are baked right into the edX
website and let you get some hands-on experience with the concepts. The final
exam is "closed book" which I think is a mistake as it promotes guessing over
learning. All in all, Discrete Time Signals and Systems Part 1 is an excellent
introduction to signal processing that is likely to be more accessible than
other courses on the same subject you may find elsewhere. The stage is set for
a deeper dive into signal processing in Part 2.

Applications of Linear Algebra Part 1 is a light, activity-focused
introduction to linear algebra. This course is suitable for anyone who is
curious about what linear algebra is and how it can be used in the real world,
including high school students and advanced junior high students. The course
doesn't go deep into the math, but rather focuses on thinking about data in
terms of matrices and illustrating linear algebra operations with activities.
The materials span 7 units that include activities ranging from image
manipulation and animation to cryptography and sports prediction. Grading is
very relaxed as you have unlimited attempts on comprehension quizzes and the
remainder of the points are based on the activities. If you've taken a linear
algebra course before, this class will be very easy, but you can still get
some entertainment out of the activities and learn a bit about sports
prediction. One of the biggest failings of math education is a heavy focus on
rote repetition, which disconnects math from the real world and makes it
boring. Applications of Linear Algebra is the type of course that is needed to
raise interest in math. It introduces concepts at a digestible pace suitable
for beginners and almost every lecture video that teaches a new concept is
followed by an activity devoted to seeing that concept in action. Professor
Chartier is clear and personable even though he seems to be working off a
script--something that is not easy to do. The video quality is good and the
activities, while simple, are illustrative. Applications of Linear Algebra
Part 1 is a great course to get beginners interested in linear algebra by
getting their hands on fun activities as quickly as possible. I hope to see
Professor Chartier carry the same formula into Applications of Linear Algebra
Part 2 and build upon the foundation laid in part 1.

Pattern discovery in data mining is the first course in a new 5-part data
mining specialization offered by the University of Illinois at Urbana-
Champaign through Coursera. Keeping with the trend of other specialization
courses, pattern discovery in data mining spans 4 weeks and will likely be
offered again each month or two after the first offering. The course covers a
range of methods for finding different types of patterns in data, such as
association rules and patterns in graphs. Grading is based exclusively on 4
weekly quizzes. I was excited to see the new data mining specialization come
up on Coursera to kick off 2015, but unfortunately, pattern discovery in data
mining is a dull, poorly executed information dump. Besides an interesting
topic, there’s not much going for this course. In the lectures, the professor
reads information off dense slides and his delivery is more confusing than
instructive. The slides, video and sound are of decent quality, but the
explanations are not clear and while I normally don't have an issue with
foreign accents, the professor's English made things harder to understand. To
make things worse, there are few instructive in-lecture quizzes and no
activities or programming assignments. A course about data mining should have
programming assignments or activities that let students interact with the
concepts to reinforce learning. Pattern discovery in data mining is a
disappointing start to the data mining specialization that suffers from poor
instruction quality and lack of illustrative assignments. Taking this course
is like a data mining problem in and of itself: you have to spend a lot of
time deciphering the lectures to uncover useful information.

Learning How to Learn is a 4 lesson self-paced course that summarizes key
findings in neuroscience about how we learn. The course touches on brain
function, working and long-term memory and various methods for improving
learning as well as overcoming hurdles like procrastination. The lecture
content in learning how to learn is very good. Videos aren't too long, the
lecturer is clear and personable and everything is easy to understand. There
are more bonus/guest lectures than you'd see with a typical MOOC and I find
engaging, memorable guest lectures are rare. Also, you can't fully complete
the course unless you verify your identity before submitting quizzes, even if
you don't want a verified certificate. One of the main pitfalls with MOOCs is
that you can get into the habit of watching hours of lecture content without
taking time out to practice, recall and commit ideas into long-term memory.
Good courses help students learn with quizzes and homework; this course
teaches students other things they can do, such as making flash cards, taking
breaks and getting adequate sleep, to maximize learning. Considering the main
lecture content only takes a few hours to complete, this course offers a good
amount of value for your time.

Intro to relational databases is a short 4 lesson course that covers the
basics of SQL databases. Lessons 1 and 2 cover basic SQL querying, including
grouping, ordering and inner joins; lesson 3 addresses inserts and concerns
when using a database backend for a webapp; and lesson 4 covers database design
principles and a few more advanced features like outer joins and subqueries. I
won't get into the final project as Udacity's projects tend to be geared
toward students with subscriptions. Each lesson consists of several short
videos with quizzes that involve multiple choice questions and coding
exercises that revolve around altering and submitting SQL queries. The
instructor is easy to understand and explains things well. The content is
polished and I didn't notice any bugs, which is rare for a brand new course.
On the other hand, the course is a bit too short and doesn't give beginners
enough practice with newly introduced syntax before moving on. It would be
helpful to give students a few short drills writing queries related to each
newly introduced keyword from scratch. Also, to follow along with lesson 3,
you have to download, install and interact with a virtual machine. The time
necessary to download, install and figure out how to use the VM is probably
more than is warranted with such a short course, although the VM may be used
for other Udacity courses. Intro to relational databases is a succinct
overview of SQL basics that serves as a nice refresher for someone who has
seen SQL before, but making it a little longer and providing more simple
drills would probably be helpful for beginners.

Model Building and Validation is an advanced data science course provided by
AT&T through Udacity. The course is listed as "advanced" because it assumes
prior knowledge of machine learning, statistics, linear algebra and calculus.
Despite the stated prerequisites, math doesn't play a large role, so you will
still be able to understand most of the content even if your only preparation
is Udacity's intro to machine learning. The course spans 4 lessons that detail
the process of extracting value from data through questioning, modeling and
validation. Lesson 1 is a general introduction to the QMV process with each of
the following lessons digging into each component of QMV in more detail. The
course somewhat oversells its length as none of the lessons take more than a
few hours despite the course being listed at an estimated 8 weeks with 6 hours
of study per week.
Model Building and Validation follows the same formula as other Udacity
courses, with each lesson taking the form of a series of short lecture videos
interspersed with quizzes. The lecturers are easy to understand and the video
quality is generally good, although the videos and course materials have some
glitches that need to be ironed out. I won't grade the course too harshly on
bugs, since all courses are buggy at the very beginning, and they will likely
be fixed in the near future.
As for the content itself, the simple idea of framing a data analysis as a
tree to track and organize the decisions you make along the way is probably
the most useful thing you'll take away from this course. The course also does
a good job getting students to think about some of the high-level decisions
that must be made when conducting a data analysis. The content gets rockier
when it delves into specifics after lesson 1, particularly in the models
lesson. The lectures occasionally dive too quickly into the low level details
of machine learning techniques that students may not have seen before.
Additionally, the validation section focuses much more on model evaluation
metrics, like ROC curves, the confusion matrix and the metrics derived from
it, than on validation itself.
Model Building and Validation is a good course that provides a nice framework
for approaching data analysis, but it gets bogged down in some machine
learning specifics that don't add much to the overarching theme.

Social and economic networks is an introductory network theory and analysis
course geared toward learners who are comfortable with basic statistics,
probability and linear algebra. You don't need to know anything about social
networks ahead of time to take this course, but having basic familiarity with
networks will help things go a bit smoother. The course has 7 weeks of lecture
content covering network basics, measures of centrality, network formation
models and diffusion, learning and games on networks. You'll also be
introduced to Gephi, a software tool for network visualization and analysis.
The 8th week is reserved for a final exam.
Social and economic networks provides all the raw information you need to get
a solid grounding in network theory and analysis, but the presentation style
is impersonal so the content is not particularly engaging. The professor is
knowledgeable and appears on screen while explaining lecture slides, but he
shows little emotion. While the lectures can get a bit intimidating with
equation after equation, the homework exercises and final exam are easier than
the lectures might suggest. You get 2 attempts on each chapter quiz and 1
attempt on the final; a score of 70% or more is required for a certificate
and 90% or more will earn you a certificate with distinction.
All in all, social and economic networks is a worthwhile course if you are
interested in social networks and aren't intimidated by a bit of math, but I
wouldn't take it for fun. If you want to take a course on the same subject
that is less mathy, consider Coursera's Networked Life from UPenn.

Networked life is a gentle introduction to network/graph theory that covers
the basics of network structure, network formation models and networked games.
The course consists of 7 weeks of lecture content--typically three 8-20 minute
videos per week--with an 8-10 question quiz for each video. The quizzes aren't
too difficult and you get 2 attempts, but since there is one quiz for every
lecture video, you'll be spending a significant proportion of your total class
time answering quiz questions. The course doesn't get into network algorithms
or computing: it focuses on basic network structure, formation and games, so
you can take this course without any programming or math background. Networked
life debuted about 2 years ago, making it among the first courses available on
Coursera, so the presentation and slide quality are a bit dated. The lecturer
mainly reads directly off slides and you spend the majority of lecture time
looking at static slides written in Comic Sans as the lecturer explains them
in greater detail. The information is solid and generally interesting but the
presentation is often a bit dull when there are no illustrations on the
screen. The quizzes are probably the best part of the course; even though they
are easy they help reinforce the content and break what might otherwise become
a tedious slog through lecture video after lecture video. The course is self-
paced, so despite it having "7 weeks" of content, you can finish it faster if
you want to. Networked life is an accessible introduction to networks and
while the presentation isn't great, the topics are interesting and the
frequent quizzes help keep you engaged.

Machine Learning is one of the first programming MOOCs put online by Coursera,
taught by Coursera co-founder Andrew Ng. Although Machine learning has run several times
since its first offering and it doesn’t seem to have been changed or updated
much since then, it holds up quite well. This course assumes that you have
basic programming skills. Assignments also require many vector and matrix
operations and slides include some long formulas expressed in summation
notation so it is recommended to have some familiarity with linear algebra.
You don't need to know calculus or statistics to take this course, but you may
gain deeper insight into some of the material if you do. The course uses the
Octave programming language, a free to use clone of MATLAB. The course runs 10
weeks and covers a variety of topics and algorithms in machine learning
including gradient descent, linear and logistic regression, neural networks,
support vector machines, clustering, anomaly detection, recommender systems
and general advice for applying machine learning techniques. Lectures are
split into 3 to 15 minute segments with periodic quizzes and each topic
section has a corresponding quiz. Section quizzes are worth 1/3 of the total
grade but you get unlimited attempts (with a 10-minute retry timer). Andrew
Ng does a good job explaining dense material and slides although the audio
levels are often too low. If you don't have good speakers you might need
headphones to hear him talk. The other 2/3 of the course grade is based on 8
multi-part programming assignments that typically involve filling in code for
key functions to implement machine learning algorithms covered in lecture. The
course gives you a lot of structure and direction for each homework, so it is
generally pretty clear what you are supposed to do and how you are supposed to
do it even if you don't understand 100% of the material covered in lecture.
Machine learning is a great course if you can get past quiet audio. If you've
never used Octave or MATLAB before, don't let that stop you from taking this
course; learning the basics necessary to do the assignments only takes a
couple of hours and it will help you think of things in terms of vectorized
operations.

The hardware software interface covers computing from the level of the CPU to
a low level programming language: C. Course content includes binary logic, C
basics, C structs and arrays, x86 assembly, the stack and heap, caches,
processes, virtual memory, memory allocation and differences between Java and
C. The course consists of lecture videos with periodic in-lecture questions
and several programming exercises. The presentation of material is good and
the professors are easy to understand. On the other hand, the lectures didn't
always cover everything you needed to know to tackle the homework; if you
don't come into this course with any C experience, you'll probably need to do
a bit of outside reading to tackle some of the homework. I also found myself
getting a bit bored with this course due to some long puzzle-like programming
assignments and the low-level nature of the course. Overall, this is a quality
MOOC focused on low level computing--a topic that is not covered in many
online courses--but it takes a lot of time and attentiveness to complete all
the content.

Machine Learning 2—Unsupervised Learning is the second part of a 3 part
machine learning course offered by Georgia Tech through Udacity. It is
recommended that you take the first part before this course as the lecturers
reference material from the first part from time to time. This course is much
shorter than part 1, spanning only 4 lessons: methods for optimization,
clustering, feature selection and feature transformation. There is also a
supplementary section on information theory. The course format and quality
mirror part 1: the lecturers alternate taking on the role of teacher and
student and introduce new material at a quick clip. I'm not sure if "teacher
as student" works too well here, because the lecturers "catch on" almost
instantly while real students are likely to need a little more time. The
lecturers also have a tendency to dive into quizzes without adequate
explanation of the problem. There is a single homework problem set at the end
of the course that takes about 10 minutes and a final project about building a
recommendation system. I would have liked to have seen short homework sets and
programming exercises for each topic section, but the chemistry and wit of the
lecturers help keep you engaged. If you enjoyed part 1, you’ll enjoy part 2.

Statistical Inference is the 6th course in the Johns Hopkins data science
specialization track, which is basically an introduction to statistics in R.
The course covers many different topics in the span of 4 weeks from basic
probability and distributions to t-tests, p-values and statistical power. The
lectures take the form of slideshows with a lot of dense mathematical
notation, small text and mediocre voiceovers. The course tries to cover too
much ground too fast and the material isn’t presented in a way that is easy to
understand or engaging. I don’t think the lecturer’s face was shown once in
the entire course. That’s not to say there isn’t good information in the
lecture slides, but the presentation and execution are poor. If you’re looking
for a good introduction to statistics that uses R, try Duke’s Data Analysis
and Statistical Inference. Udacity’s “Statistics” is another solid option that
is self-paced, moves a bit slower and does not require programming.

Effective thinking through mathematics is a course about increasing your
ability to tackle new problems and understanding things you already know
better. The course focuses on 4 main elements of effective thinking:
understanding simple things deeply, making mistakes, raising questions and
following the flow of ideas. Although this course has “mathematics” in the
title, it is really about the process of thinking—math is just a convenient
arena to teach these methods. You don’t need any particular math background to
take this course and get a lot out of it, although be aware that most of the 9
weekly lessons deal, at least in part, with mathematical concepts like
numbers, infinity, dimensionality and geometry. The course format is a little
different from most MOOCs: each week consists of a series of videos where the
professor gives problems to students and the students attempt to work through
them. The professor helps the students reason through the problems by making
suggestions and asking questions and he periodically addresses the viewer,
explaining how the effective thinking methods were or should have been applied
by the students. The nontraditional course format may be off-putting to some
viewers, since the students spend quite a bit of time struggling with the
problems, making little progress. I found it to be an interesting approach,
although increasing the video speed is useful for times when things get too
slow. This lighthearted course introduces some intriguing concepts and lays
the foundation for approaching problems in a way that lets you gain new
insights and deeper understanding. The main issue I had with the course is that
it did not provide enough challenging puzzles and homework problems for MOOC
students to work on to apply the methods discussed in lecture. Everything felt
a bit too easy. Despite that, this is a fun course that teaches methods that
could be useful in almost any sphere of life and doesn't require a big time
commitment.

Calculus One is a comprehensive introductory calculus course that covers
everything you'd expect in a first year university calc class: limits,
derivatives, integrals and applications for both. The instructors have a lot
of passion for the subject and provide plenty of examples to help students
learn the material. They also have a nice interactive quiz platform called
MOOCulus that lets you go through practice problems online to your heart's
content. This is a great course for anyone seeking to learn calculus for the
first time or relearn later in life. The only downside to this course is that
it is longer than most MOOCs--16 weeks--so it can be hard to keep up with the
weekly schedule. If you take a lot of MOOCs, you may find that you get too
busy with other newer ones to stick to the schedule.

Intro to statistics is one of Udacity's older courses and while it was one of
the few free stats courses on the web when it was released, it has a lot
more competition today. Intro to stats is a decent course that covers some of
the most basic topics in statistics. The course moves fairly slowly with periodic
spurts of difficulty. While most MOOCs underuse interactive elements, I found
that this course had too many in-lecture quizzes, which just end up becoming
tedious. If you're looking for a basic intro to stats, Udacity's other stats
course "Statistics" is a better option.

Udacity's "Statistics" is provided by San Jose State University and offers a
comprehensive introduction to statistics. This course should not be confused
with Udacity's "Intro to Statistics" taught by the founder of Udacity,
Sebastian Thrun. Topics covered in this course include research methods,
visualizing data, measures of center and spread, z-tests, t-tests, ANOVA, chi-
squared test, correlation and regression. This course has a ton of content
that is well presented and covers each topic in great detail with many
quizzes and homework exercises after each lesson to reinforce learning. The
pacing of this course is fairly slow, so it is perfect for someone who has
never taken a statistics course before or someone who isn't super confident in
their math skills. Just be aware that completing all the content will take a
significant time commitment, likely 60-100 hours. I would recommend this
course over Udacity's “Intro to Statistics.” If you want a course that moves
faster and gives you the chance to do some computation, I recommend Data
Analysis and Statistical Inference offered by Duke through Coursera.

Getting and cleaning data is the third course in the first wave of Johns
Hopkins's data science specialization track on Coursera. It is recommended
that you take this course after taking the Data Scientist's Toolbox and R
programming courses. The title of the course pretty well sums up the content:
the entire class is about loading data into R and cleaning it up so that it
can be used for data analysis. You'll learn how to load various data formats
into R, such as JSON, XML, CSV and Excel files, and get data from other sources
like MySQL and web APIs. The course also discusses subsetting data, adding
variables, merging data, regular expressions and working with dates. This
course is a good summary of many of the things that are useful to know when
trying to access and prepare data for analysis. Similar to R programming, it
suffers from overuse of static slides with voice-overs, a lack of instructor
face time and a lack of interactive content or in-lecture quizzes to help you
learn and retain as you go along. You'll be introduced to many R packages and
syntax that you probably won't remember after a week or two, but you'll be
exposed to many common data formats so that you can refer back to the course
materials or other web resources to deal with them in the future.

A beginner’s guide to irrational behavior provides a nice overview of key
topics in behavioral economics, including money, dishonesty, motivation, self-
control and emotion. I only watched the video lectures for this course, so my
review won’t touch on the readings, quizzes or assignments. The lecture
content is engaging and raises many interesting ideas and questions about the
way people think and act. Just be sure to maintain a healthy degree of
skepticism and realize that the professor is only sharing his views based on
his research and experience.

Exploratory data analysis is the third course released as a part of Udacity's
new Data science focus area that launched at the beginning of 2014. The course
provides an overview of using R to explore data and focuses heavily on the use
of the ggplot2 package in R to create data visualizations. Although the course
touches briefly on high-level theory and concepts like summary statistics,
transforming data, correlation and linear regression, almost all of the
quizzes and homework questions have to do with creating plots and making
observations based on plots. This is not necessarily a bad thing--learning to
plot in R is a valuable skill and an important part of exploratory data
analysis--but it seems like the course should have spent a bit more time
covering high-level concepts and numeric methods for exploring data like using
tables and summaries. Despite that quibble, this is a good course with a lot of
high quality and practical content. It moves slowly enough for you to get
comfortable with basic plotting syntax before building up to more complex
visualizations, but fast enough to keep you engaged. Be aware that the course
mainly uses two data sets to teach the material: a data set of diamond prices
and characteristics and a set of pseudo Facebook data created by the instructors
meant to mirror real Facebook data, such as friend counts, tenure on the site,
user age and gender. Your enjoyment of the class will depend, in part, on your
interest in the data.

The Data Scientist’s Toolbox is essentially just an overview of the data
science specialization track offered by Johns Hopkins University through
Coursera. The track consists of 9 courses that each last about 4 weeks which
are released in batches of 3 courses each month. This course introduces the
very basics of R and RStudio, Git and GitHub and a few other things that will
be used in the data science specialization. It is basically a bunch of
introductory and supplementary material that shouldn't be a standalone course.
You can complete all the lecture videos in the entire course in about 2 hours.
It's almost embarrassing that Johns Hopkins has a paid verified certificate
option for this course; what's worse, it is required to complete their data
science specialization track. I suspect this will be a major turnoff for
students interested in the track.

Intro to data science is an intermediate level course that assumes basic
Python programming skills and knowledge of statistics. The course focuses on
gathering, manipulating, analyzing and visualizing data using Python and
various Python packages such as numpy, scipy and pandas. One of the best parts
about this course was getting some exposure to some Python packages in the
scipy stack, although I wish more time was devoted to explaining what the
various modules in the scipy stack do, how to set them up at home and when to
use them. The first lesson is a fairly gentle introduction with an interesting
homework project dealing with data from the Titanic disaster. Lesson 2 goes
into more detail about gathering and cleaning data using Pandas and an
additional module that lets you make SQLite queries to extract data from
Pandas data frames. Lesson 3 jumps into data analysis with a t-test and linear
regression using gradient descent. Going from basic data manipulation into
these topics was a bit jarring in terms of difficulty and more time could have
been spent explaining how the functions worked. I left without a great
appreciation of what gradient descent is really doing. Lesson 4 is focused on
making visualizations using a module that attempts to port the functionality of
the R language’s ggplot2 plotting package. Finally, lesson 5 introduces the concept
of big data and MapReduce as a solution to deal with large data sets. Each
homework assignment after the first has students dealing with New York subway
turnstile data, which allows you to get some level of familiarity with the
data throughout the course. This was a very good decision, since it lets you
focus on learning new concepts rather than spending time familiarizing
yourself with new data sets over and over again. Intro to data science
introduces some major topics in data science and does a pretty good job given
the amount of content it offers, but coverage of the topics is too brief.
Hopefully the forthcoming Udacity courses Exploratory Data Analysis and Data
Wrangling with MongoDB will build on the foundation provided by this course
and give students a bit more depth.

A fun, short introduction to the design of objects. My main complaint is that the
course is quite short and yet they want you to do a fairly involved final
project. The size of final projects should be proportional to the amount of
material and effort put into the class before the final project. A course with
only 8-9 hours of material shouldn't have a 7+ hour final project.

I took this course through Yale's open courseware back in 2010 before most of
today's big MOOC platforms existed. It is a 26 lecture philosophical
discussion of death. A very interesting class for anyone interested in death.
There's no work to complete other than watching the lectures, which takes
around 20 hours.

A short and sweet introduction to jQuery. It's recommended to take the
JavaScript and HTML & CSS courses first. This is one of the better Codecademy
offerings, but as usual it only covers the very basics and you can complete
the whole thing in one sitting.

A basic introduction to Python. Codecademy has improved its materials a bit
since they first launched; this is a decent course for learning basic syntax,
functions and data structures. It's a good place to start to get a little bit
of familiarity with Python before taking a full-length intro to CS course that
uses Python.

I took this course back in 2012 when Codecademy first came out. The course
consists purely of text explanations and an interactive programming
environment to write and run JavaScript. It's not a bad intro to JavaScript,
but exercises can get a bit tedious and the lack of video lectures makes it a
bit impersonal, and it doesn't go beyond the basics. The updated course has a
better interactive environment and better exercises than the original; it's
good to see Codecademy improving their content.

A fun and informative course on the design side of the web, including font
styles and sizes, colors and page layout. Another great offering from Code
School with highly polished materials and exercises.

A brief introduction to JavaScript. The course only covers basic syntax,
values, variables and files. This course should probably have been combined
with part 2 to provide a more comprehensive introduction in one course. Still,
the materials themselves are high quality and it only takes an hour or two to
complete.

Great interactive overview of Git. Superb quality level in materials and
exercises. They set you up in a sandbox environment where you are actually
interacting with Git repositories. Highly recommended for anyone who wants to
learn how to use Git.

A decent interactive introduction to Ruby. This seems like it must have been
one of Code School's earlier creations because it doesn't have the same
structure or amount of polish as most of their other classes.
There are no videos: it is an entirely text-based course, much like
Codecademy courses.

A polished class that builds on the basics of part 1 by introducing JavaScript
flow control, functions and arrays. The class has well-made interactive
programming exercises that let you use what you have learned immediately.
Parts 1 and 2 could have been combined into 1 course, since both are so short
and basic.

Another high quality course from Code School. This one covers some
intermediate JavaScript topics such as functions, objects, prototypes,
closures, scope and inheritance. The amount of design work and polish Code
School puts into their courses really sets them apart from other MOOC
platforms.

A nice introduction to Chrome DevTools. You should be familiar with HTML, CSS
and JavaScript to get the most out of it. It provides a good mix of video
instruction followed by interactive exercises that let you put what you see in
the videos to use. It takes 2-3 hours to complete.

A nice, quick, interactive introduction to R. High quality instruction and
examples. I went through this as some extra background for Coursera's
Computing for Data Analysis class.

A nice quick intro to Git. I'm not sure why other reviewers rated it so low.
Sure, it is very basic, but it is a lot more fun than reading a static text
page.

Udacity's Web Development course provides a high quality introduction to back-
end web development with Python using Google App Engine. The course is taught
by Steve Huffman, a co-founder of Reddit, which gives him many unique insights
about web development and scaling websites. If I were to give this course a
grade just based on the video lectures and quizzes themselves, it would be 5
out of 5, hands down. The video lectures are very well made and quizzes help
reinforce the material without being too difficult. The class covers a wide
range of topics including HTTP requests, basic HTML, getting user input,
databases, user authentication, cookies, caching, scaling and APIs. The
homeworks in this course all have to do with creating and deploying web
applications using Google App Engine, primarily building and adding features
to a blog. The homework assignments, especially once you start building the blog, are a
bit open-ended and probably more complex than the average student would be
able to complete on their own. The lectures don't always provide all the
things you need to know about Google App Engine to complete the assignments.
Another annoying aspect of the homeworks is that Steve uses the Jinja2
templating engine in all his solutions, but he doesn't teach students how to
use it. If you're willing to spend a lot of time doing outside reading (App
Engine docs, Jinja2, etc.), you might get through the homework on your own,
but in the end I found it more effective to look at Steve's solutions and
study how and why they worked.

This course is an overview of what software testing is and of different
testing methods. It focuses mainly on test coverage, random testing and the
theory of testing in general. It doesn't provide much Python-specific
information outside of using assert statements to catch problems early. The
material is a bit dry, and it would have been nice if it covered Python testing
tools like unittest in detail in addition to the language-neutral techniques.

Algorithms: Design and Analysis, Part 2 picks up where part 1 left off.
Several of the algorithms and discussions in Part 2 refer back to concepts
discussed in the first part, so it is highly recommended to complete part 1
first. A few of the major topics covered include minimum spanning tree
algorithms, the knapsack problem, dynamic programming, shortest path problems,
the traveling salesman problem, P vs. NP and NP completeness and heuristics
for hard problems. Part 2 is considerably harder than part 1 and the
algorithms you write for homework need to be implemented well to get answers
in a reasonable amount of time and without exceeding your system's memory. It
is possible to complete the class using a high-level language (I used Python)
but you'll probably have to spend a bit more time tweaking your code to get
solutions in a reasonable amount of time. Like part 1, the instruction quality
and assignments are top notch. My biggest gripe with the class is that the
coverage of the P vs. NP question and NP completeness is brief, so students
don’t gain a deep understanding of what P vs. NP and NP completeness really mean.
Introduction to Theoretical Computer Science by Udacity provides a much more
thorough overview of that particular topic. That said, Algorithms: Design and
Analysis, Part 2 is another outstanding offering by Stanford and Coursera.

Algorithms Part 1 is an excellent introduction to the study of algorithm
analysis and design. The course teaches some fundamental principles of
algorithm analysis like big O notation and other important topics in algorithm
design like data structures to represent graphs, the divide and conquer
paradigm, heaps and hash tables. Algorithms discussed include quicksort,
breadth-first search, depth-first search, finding strongly connected
components of a graph and Dijkstra’s shortest path algorithm. The course
requires the ability to program, but it is language neutral, meaning you can
use whatever language you are most comfortable with to complete the
assignments. The material is fairly dense and the quizzes and programming
assignments are difficult if you haven’t taken a course on algorithms before.
I’d highly recommend this course to anyone who wants to get serious about
going beyond basic programming/scripting and learning some real computer
science.

This Codecademy offering provides a series of text exercises that guide you
through learning the very basics of HTML and CSS. This course is short and sweet
and I'd argue of higher overall quality than the JavaScript and Python tracks
they offer, which can get long and tedious. It is a good place to start to get
a basic grasp of what HTML and CSS do before taking a beginning course on web
development or trying to make some web pages yourself.

Introduction to Theoretical Computer Science is all about identifying and
tackling hard problems. The quality of the material and instruction is
excellent. Sebastian Wernicke breaks down complex topics in a way that is easy
to understand. Central topics include the P vs. NP question, NP completeness
and strategies for dealing with NP-complete problems. The class uses a few
related graph problems--vertex cover, independent set and clique--to
introduce and discuss the central topics. It also covers a few other
interesting problems like traveling salesman and 3-SAT. As the name implies,
this course is heavy on theory. As such, there is not a lot of actual
programming you have to do to complete the course. There are a few programming
problems, but quizzes and homework mostly revolve around multiple choice
questions that get you to think about and master the concepts presented in
lecture. Since it is light on programming, the course goes quickly if you
don’t have to re-watch the lectures too many times to understand the material.
Even though this class is about theory, you will learn practical things like
preprocessing data to speed up algorithms. I highly recommend this course to
anyone with curiosity about the P vs. NP question and solving hard problems.

Model Thinking looks at the world through many different lenses, which can lend
insight into why the world and people work the way they do. This course can be
likened to a college elective: it is fun, the workload isn't too high, the
difficulty is relatively low and the material is interesting.