Analyzing Big Data with Microsoft R Server

8/10 stars, based on 16 reviews
Provided by: edX
Cost: FREE, Add a Verified Certificate for $49
Start Date: In Session

Course Details

Cost

FREE, Add a Verified Certificate for $49

Upcoming Schedule

  • In Session

Course Provider

edX online courses
Harvard University, the Massachusetts Institute of Technology, and the University of California, Berkeley, are just some of the schools that you have at your fingertips with edX. Through massive open online courses (MOOCs) from the world's best universities, you can develop your knowledge in literature, math, history, food and nutrition, and more. These online classes are taught by highly regarded experts in the field. If you take a class on computer science through Harvard, you may be taught by David J. Malan, a senior lecturer on computer science at Harvard University's School of Engineering and Applied Sciences. But there's not just one professor: you have access to the entire teaching staff, allowing you to receive feedback on assignments straight from the experts. Pursue a Verified Certificate to document your achievements and use your coursework for job and school applications, promotions, and more. edX also works with top universities to conduct research, allowing them to learn more about learning. Using their findings, edX is able to provide students with the best and most effective courses, constantly enhancing the student experience.

Provider Subject Specialization
Sciences & Technology
Business & Management
23470 reviews

Course Description

This course is part of the Microsoft Professional Program Certificate in Big Data and the Microsoft Professional Program Certificate in Data Science.

The open-source programming language R has long been popular (particularly in academia) for data processing and statistical analysis. Among R's strengths are that it is a succinct programming language and that it has an extensive repository of third-party libraries for performing all kinds of analyses. Together, these two features make it possible for a data scientist to go very quickly from raw data to summaries, charts, and even full-blown reports. However, one deficiency of R is that it traditionally uses a lot of memory, both because it needs to load a copy of the data in its entirety as a data.frame object, and because processing the data often involves making further copies (behavior sometimes referred to as copy-on-modify). This is one of the reasons R has been received more reluctantly by industry than by academia.

The main component of Microsoft R Server (MRS) is the RevoScaleR package, an R library that offers a set of functions for processing large datasets without having to load them into memory all at once. RevoScaleR also offers a rich set of distributed statistical and machine learning algorithms, which is extended over time. Finally, RevoScaleR offers a mechanism by which we can take code that we developed on our laptop and deploy it on a remote server such as SQL Server or Spark (where the infrastructure is very different under the hood) with minimal effort.
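
To make the idea of processing data "without having to load it all at once" more concrete, here is a minimal sketch of chunk-wise summarization. It is written in Python purely for illustration and is not RevoScaleR code or its API; the function names (`read_in_chunks`, `summarize_chunk`, `merge_summaries`, `chunked_mean_and_variance`) are hypothetical. Each chunk is reduced to a small summary (count, mean, sum of squared deviations), and the summaries are merged so that the final mean and sample variance match what a single pass over the whole dataset would give.

```python
from typing import Iterable, Iterator, List, Tuple

# A chunk summary: (count, mean, sum of squared deviations from the mean).
Summary = Tuple[int, float, float]


def read_in_chunks(values: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield successive chunks so that only one chunk is held in memory at a time."""
    chunk: List[float] = []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # final, possibly shorter, chunk
        yield chunk


def summarize_chunk(chunk: List[float]) -> Summary:
    """Reduce one non-empty chunk to (count, mean, sum of squared deviations)."""
    n = len(chunk)
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return n, mean, m2


def merge_summaries(a: Summary, b: Summary) -> Summary:
    """Combine two summaries using the standard pairwise update for mean and variance."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    if n_a == 0:
        return b
    if n_b == 0:
        return a
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def chunked_mean_and_variance(values: Iterable[float], chunk_size: int = 1000) -> Tuple[float, float]:
    """Return (mean, sample variance) computed one chunk at a time."""
    total: Summary = (0, 0.0, 0.0)
    for chunk in read_in_chunks(values, chunk_size):
        total = merge_summaries(total, summarize_chunk(chunk))
    n, mean, m2 = total
    if n < 2:
        raise ValueError("need at least two values to compute a sample variance")
    return mean, m2 / (n - 1)


if __name__ == "__main__":
    # A generator stands in for a dataset too large to hold in memory.
    stream = (float(i % 97) for i in range(1_000_000))
    print(chunked_mean_and_variance(stream, chunk_size=50_000))
```

The design point is that only small per-chunk summaries need to be kept and combined, which is also what would allow the chunks to be processed on different machines.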

In this course, we will show you how to use MRS to run an analysis on a large dataset and provide some examples of how to deploy it on a Spark cluster or a SQL Server database. Upon completion, you will know how to use R for big-data problems.

Since RevoScaleR is an R package, we assume that the course participants are familiar with R. A solid understanding of R data structures (vectors, matrices, lists, data frames, environments) is required. For example, students should be able to confidently tell the difference between a list and a data frame, and should know what each object is generally a good representation for and how to subset it. Students should be familiar with basic programming concepts such as control flow, loops, functions and scope. Students should have a good understanding of how to write and debug R functions. Finally, students are expected to have a good understanding of data manipulation and data processing in R (e.g. functions such as merge, transform, subset, cbind, rbind, lapply, apply). Familiarity with third-party packages such as dplyr is also helpful.

Reviews 8/10 stars
16 Reviews for Analyzing Big Data with Microsoft R Server

Ratings details

  • 5 stars
  • 4 stars
  • 3 stars
  • 2 stars
  • 1 stars

Rankings are based on a provider's overall CourseTalk score, which takes into account both average rating and number of ratings. Stars round to the nearest half.


T N

2/10 stars, Completed
9 months, 1 week ago
I agree with some of the commenters that this course is an advertisement for the RevoScaleR package. The instructor was extremely boring. I only got through half of the class before I gave up on the lectures and started reading Microsoft's documentation for the RevoScaleR library. I was able to do all the quizzes and lab exercises with an overall grade of 92, so I am glad I didn't waste my time listening to the lectures.

John K

2/10 stars, Completed
1 year, 3 months ago
The course on edX was not very good.
1. The video player would not work properly and I needed to watch the videos on YouTube; otherwise the pace was mind-numbingly slow. This is a small issue, and the least of the problems.
2. The organization of the course material was out of alignment with the quizzes, as if there had been a previous version of this course and the updates were not coordinated.
3. The quiz questions were often vague and depended heavily on finding answers through Google and reading the material extremely carefully. This wouldn't be TOO big a deal except you only get a single chance to answer the question correctly. Read the questions and material VERY carefully.
4. The material is out of date with new updates to software/releases. On a number of occasions I had to spend the whole night just updating and finding workarounds so I could complete the examples shown and follow along.
5. RevoScaleR has a LOT of idiosyncrasies in how it functions. There is a learning curve, and interacting with the objects is not intuitive.
However, after having completed the course, it is not without its benefits, which they cover. Mostly my problem centers around the course itself and not so much the package. Beware of taking the course.

Mohammad Hossein

10/10 stars, Completed
1 year, 4 months ago
Great instructor (Seth)! Many of the questions that came to my mind during the course were answered by him a few minutes later. Great course to take if you would like to understand Microsoft R. Do not expect to learn "analyzing" big data with this course; instead, expect to learn analyzing big data with "Microsoft R Server". So the focus of the course is the tool, not teaching you how to analyze data, ML algorithms, etc. I wish he had been the only instructor; the last section, on deployment, got a little lost with the two other instructors.

Ashok

8/10 stars, Completed
1 year, 11 months ago
The course could have been more interesting if simple examples had been used. Also, Visual Studio is a real pain; it took me almost one week to install it and learn the interface. The instructor is good but needs more enthusiasm while delivering the content to maintain learners' interest.

Prem Kumae

4/10 stars, Completed
2 years ago
Useless. Poor quality. Microsoft trying to boast about their MS R package. Totally disorganized. Nothing to do with data analysis. Horrible. Waste of time and money.

6/10 stars, Completed
  • 1 review
  • 1 completed
2 years ago
Most videos consist of watching the instructor work through an example in R Server. It would have been nice to do more hands-on work rather than only watch somebody else do it. Also, the cost listed here for the certificate is wrong: you pay $99, not $49.

Fawaz Ahmed

10/10 stars, Completed
2 years, 2 months ago
Really enjoyed the course; the content, the instructor and everything else is perfect. The only thing I hate is Microsoft trying to promote its products through this course.

Dag Tveit

10/10 stars, Completed
  • 0 reviews
  • 0 completed
2 years, 5 months ago
Very nice course and a very nice way of earning skills. I love the new certification model, and I hope it will be more popular in the future. The questions could use some clarification, though it's my fault for not reading ahead.

C Shah

6/10 stars, Completed
2 years, 8 months ago
I had very high expectations for this course, especially as it is being offered by Microsoft. It was very difficult to set up the GUI environment using Microsoft Visual Studio. I ended up using RStudio. I had to really force myself to listen to the instructor; he lacked passion and I found it difficult to pay attention. The fact is that data science is a very tough subject and the instructor must be engaging. For example, have a look at how Andrew Ng delivers his courses: they are enjoyable.

Semyon Semyonov

8/10 stars, Completed
2 years, 8 months ago
The instructor and the presentations are really nice, but the course itself is quite sketchy. I expected many more practical assessments (i.e. labs), wider topic coverage (the course almost didn't cover the most interesting parts of interaction with Hadoop/SQL) and deeper insights into distributed R practices.

Srikanth Potukuchi

8/10 stars, Completed
  • 1 review
  • 1 completed
2 years, 10 months ago
I can't really comment on the instructor, as I mostly just read the content, which was clear and concise. I relied on the documentation page for more clarity on syntax. A very good introduction to the RevoScaleR package. Most of the course focused on importing the data to .xdf and then working with it. I would have loved more examples working with Hadoop (when I hear "big data", that's what comes to my mind!) and, more importantly, access to a virtual machine for using a compute context (Hadoop, Spark, etc.).

David Spriggs

10/10 stars, Completed
2 years, 10 months ago
Great presentation and pacing. Familiarity with R is crucial to success in this course. However, the instructor did a great job in presenting RevoScaleR and its benefits.

Kevin Queen

10/10 stars, Completed
2 years, 11 months ago
Good course for learning how to use R Server. RevoScaleR is very different from the tidyverse, for example, and the course does a good job of helping you understand it. Be prepared to try and test things out, and to research, mainly with the built-in R help function, to learn more details. Test questions are not always clear and could use some tweaking, but most are fine.

Kay Apperson

10/10 stars, Completed
3 years ago
Full disclosure: I'm a Microsoft employee. Another disclosure: I've spent a lot of my adult life with open source technologies. It's a good, fairly difficult class for someone who's used open source R for years. I learned a lot from this class. I agree that one would have to have the software to achieve what the course is about, but I don't think it's trying to sell the software. Instead, I think people who are already sold on the idea of using RevoR/Microsoft R Server are hungry to learn how to fully utilize it, and this course delivers that. If you haven't been hitting the limit of open source R yet, I get that it may be harder for you to understand why this course is right for you. I'd used R for several years before I joined Microsoft. I used R in predictive analytics (to improve some lives and to save some company some money), and I used R for straight statistical analysis/hypothesis testing, too. Most of my time was spent "managing" my data so that R would run within the compute and memory limits.
I could never do a parameter-grid sweep because it was never possible in the amount of memory my server had. Revolution Analytics R (now Microsoft R Server) provided a free doMC package that allowed me to utilize more than one core of my machine, given that my organization at the time wasn't a Linux shop. Still, it was quite a pity, because doMC wasn't close to enough. All the while I wished I could use RevoR. The true RevoScaleR (not free) has the ability to go beyond the physical memory limit. It allows datasets to be virtually as large as they are because of the chunk-by-chunk processing. And how wonderful it is that we data scientists do not have to divide and conquer the problem ourselves but can just call a function! Think of how you would subdivide your data into chunks, build a random forest for each chunk (and want to sweep the grid, too!), and combine all of the results to achieve the final solution as if you had run the algorithm on the entire data in the first place. I, for one, do not want to do that "manually".

John Sharp

10/10 stars, Taking Now
3 years, 4 months ago
As long as you understand the requirements and expectations, this is a very good course. Sure, it requires a lot of self-study as well, but that is the best way to learn. Much better than being spoon-fed all the time.

Jason Kevin

1/10 stars, Taking Now
3 years, 4 months ago
Totally disorganized. Nothing to do with data analysis; an advertisement for the MS R package. Horrible. Waste of time and money.