Introduction to Apache Spark

9/10 stars
based on  24 reviews

Course Details

Cost

FREE

Upcoming Schedule

  • TBA

Course Provider

edX online courses
Harvard University, the Massachusetts Institute of Technology, and the University of California, Berkeley, are just some of the schools that you have at your fingertips with edX. Through massive open online courses (MOOCs) from the world's best universities, you can develop your knowledge in literature, math, history, food and nutrition, and more. These online classes are taught by highly regarded experts in the field. If you take a class on computer science through Harvard, you may be taught by David J. Malan, a senior lecturer on computer science at Harvard University for the School of Engineering and Applied Sciences. But there's not just one professor: you have access to the entire teaching staff, allowing you to receive feedback on assignments straight from the experts. Pursue a Verified Certificate to document your achievements and use your coursework for job and school applications, promotions, and more. edX also works with top universities to conduct research, allowing them to learn more about learning. Using their findings, edX is able to provide students with the best and most effective courses, constantly enhancing the student experience.

Provider Subject Specialization
Sciences & Technology
Business & Management
18760 reviews

Course Description

Spark is rapidly becoming the compute engine of choice for big data. Spark programs are more concise and often run 10-100 times faster than Hadoop MapReduce jobs. As companies realize this, Spark developers are becoming increasingly valued.

This statistics and data analysis course will teach you the basics of working with Spark and will provide you with the necessary foundation for diving deeper into Spark. You’ll learn about Spark’s architecture and programming model, including commonly used APIs. After completing this course, you’ll be able to write and debug basic Spark applications. This course will also explain how to use Spark’s web user interface (UI), how to recognize common coding errors, and how to proactively prevent errors. The focus of this course will be Spark Core and Spark SQL.

This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises will use PySpark (the Python API for Spark), but previous experience with Spark or distributed computing is NOT required. Students should take this Python mini-quiz before the course and take this Python mini-course if they need to learn Python or refresh their Python knowledge.

Reviews 9/10 stars
24 Reviews for Introduction to Apache Spark

Ratings details

  • 5 stars
  • 4 stars
  • 3 stars
  • 2 stars
  • 1 stars

Rankings are based on a provider's overall CourseTalk score, which takes into account both average rating and number of ratings. Stars round to the nearest half.


Student

10/10 stars, Completed
1 year, 3 months ago
I would just like to thank the creators from the bottom of my heart. This is a great course. I see some comments that it should also teach us how to deploy our own clusters etc. I do not agree. I think it is much more difficult to learn how to properly analyze data than learning technical stuff. Actually, Spark is very easy to set up and run, and if you ever get paid good money for something, it will not be setting up the system, but programming and analyzing. This is similar to setting up Linux: it will not earn you money, but using it will. Once again, GREAT COURSE! Looking forward to the next installment(s?).

Rishabh khurana

10/10 stars, Taking Now
1 month, 4 weeks ago
When is this course starting next? Is there a way I can access the material otherwise, since I can't log in to it?

Raphael Radowitz

10/10 stars, Completed
8 months, 3 weeks ago
Really great introduction to Spark and a helpful community. Great, understandable slides and really good labs that build upon each other. Only the AutoGrader and the different "Logins" were confusing at the start. Nonetheless, absolutely recommendable!

Erik Kringen

10/10 stars, Completed
1 year, 2 months ago
The labs were great! The notebook format was great since the code was broken up into sections and allowed explanations/tests/visualizations as we molded our datasets.

Kunal Ghosh

10/10 stars, Completed
1 year, 2 months ago
I loved the Databricks course presented by Joseph. I scored 99%, hence I cannot complain. I am a bit upset with edX, as I am not able to get my verified certificate: they have a deadline for verified certificate requests that is 10 days earlier than the course completion deadline. This is absurd, hence a 3-star rating for the provider, edX.

Waj Nas

10/10 stars, Completed
1 year, 2 months ago
Very impressed! Certainly didn't feel like beginner level doing lab2 exercises. The Databricks IDE is very helpful as well

Daniel D

10/10 stars, Taking Now
1 year, 2 months ago
This course, as well as the other courses belonging to the XSeries in Data Science and Engineering, is among the best MOOCs in this field. Each of these courses has a very clear focus, the didactic concept is great, and the explanations and examples in the lectures are well chosen and understandable. The programming homework assignments are extremely well designed. Each lab exercise covers an interesting, up-to-date topic of practical relevance. I want to express my gratitude to the instructors and the edX team for providing this great course.

student

10/10 stars, Completed
1 year, 2 months ago
Thank you for giving me the opportunity to learn PySpark. The course was good and I enjoyed the lab exercises.

Sudip Chahal

10/10 stars, Completed
1 year, 2 months ago
I'd like to thank and congratulate professor Joseph and team for a simply outstanding class - all the major points are made very clearly and the labs are beautifully designed. The care and attention to detail that has gone into this class is simply outstanding and sets the standards for everyone else. Thanks, again.

student

6/10 stars, Completed
1 year, 3 months ago
This course gives you a basic introduction to Spark. It is a good place to get started; once you finish it, move on to the more advanced courses.

灵 金

8/10 stars, Completed
1 year, 4 months ago
This course is very interesting; I have learned a lot of fundamental concepts of Spark. The assignments are also very interesting. This course is worth enrolling in.

Student

4/10 stars, Completed
1 year, 4 months ago
Introduction to Apache Spark provided an overview of applying SQL-like functions and manipulations to datasets using PySpark. Distributed execution of the examples happened under the hood, leaving little appreciation of the distributed architecture. It would be more worthwhile to have examples that demonstrate the gain in execution speed from using multiple cores, which is essentially the driving force behind using Apache Spark. Looking forward to these in further courses of the XSeries.

Ravan Nannapaneni

10/10 stars, Completed
1 year, 4 months ago
The course is simple and gives a perfect introduction to Apache Spark. I understood the workings of Apache Spark and this will help me efficiently design Spark programs. Recommend for anyone starting the Spark journey.

student

2/10 stars, Completed
1 year, 4 months ago
The content of this course is very poorly designed. The whole course consists of only three lectures, and each lecture has at most 30 minutes of video content, which is well below the average time one finds in most rigorous courses. Even worse, the video lectures are very disconnected from each other, and in the end they do not really provide a clear vision of the Spark architecture and how the user interacts with it via Python. The course is also highly priced for the verified version. One can simply compare this course with other courses offered, for instance, on Coursera to see that the topics are covered at much greater length while the price is lower. The only relatively positive side of this course is the assignments, although even the assignments could be improved significantly.

Darrell Ulm

9/10 stars, Completed
1 year, 4 months ago
Very nice introduction to Apache Spark using Python (PySpark), for those with some background in computer science or software development. Lectures are interesting, relevant and increase interest in learning more about Spark and Big Data processing. The labs are a good introduction to programming with the Spark API, although a larger project could help further learning if the class length is extended in the future, or for other courses in Apache Spark.

Ope Okesola

8/10 stars, Taking Now
1 year, 5 months ago
It should provide the insight required for the other courses in the series; I know it will be very educational. Anthony D. Joseph has proved to be a great teacher.

Syed Hussain

10/10 stars, Taking Now
1 year, 5 months ago
Good to see the course content; waiting for the course to start. Thanks, Syed.

Anjaneya Vadlamani

8/10 stars, Taking Now
1 year, 5 months ago
Good timing to pursue the course. Anthony is a good professor. I got to listen to a couple of his other courses on edX. Thanks a ton and good luck to everyone.

Aakash Moghariya

10/10 stars, Completed
1 year, 4 months ago
This is a very good introductory course on Spark. Apart from that, this course can also be classified as a good exploratory data analysis course. Both labs involved data cleaning, filtering, transforming, and then exploring the data. It provides good documentation and tutorials to get started with Spark. It pumped me up to take the upcoming courses, and really for data science as a whole.
Pros:
- Amazing labs
- Extremely well-organized lectures
Cons:
- Try to make the Autograder process easier, if possible. The first submission and the setup process are tedious. But rest assured, it is very good.
Suggestions:
- Please add more projects. We need more hands-on experience. After the two labs, a project should not guide us through the entire process; just give us a sample dataset to explore, then let us create a notebook and submit it. Let other students grade their fellow students' work. This would be a pretty good addition to the course.

student

10/10 stars, Completed
1 year, 4 months ago
Good course. Easy to understand lectures. High quality exercises. Probably would like to see more exercises even though there are a lot of them present :) Thanks for the course !

asif uddin ahmad

8/10 stars, Taking Now
1 year, 5 months ago
Yes, the course was really helpful, but it would be more helpful if there were a video recording of the class.

Naveen Raju Arvaraju

9/10 stars, Taking Now
1 year, 5 months ago
Thanks a bunch! Waiting for the course; I really appreciate all your efforts. You are the best! Thanks a lot!

andrew warholl

10/10 stars, Dropped
1 year, 7 months ago
They are very good! I am very glad I completed the MOOC. I recommend the course to all the people who are interested in the subject.

Pradeep Kumar Kuruva Burujula

9/10 stars, Taking Now
1 year, 8 months ago
This is one of the latest emerging technologies. I think this is a good course. It may help me a lot in establishing my career. Thanks.