- 2 reviews
- 2 completed
I recently completed the updated version. It is a good introduction to the Hadoop distributed computing system, which is useful for big data processing. The course is quite short but concentrated. There are only four lessons, but the installation, implementation and programming required for the assignments take many more hours. One of the most valuable features of the course, I found, was the opportunity to download and install the Cloudera Distribution of Hadoop on a virtual machine (also included) on my computer and see Hadoop in action. All the programming is done in Python and on the command line.
As an intro to Data Science, it was a good course. During the course we were given a general overview of the emerging discipline of data science. What I liked most about the course was that the Python packages introduced and used for the assignments are really some of the most progressive ones available, especially Pandas for data manipulation and ggplot for visualization. Overall, the course provides good guidelines for further exploration of data science.