Managing Big Data with R and Hadoop

Cost: FREE (add a Verified Certificate for $94)
Start Date: Upcoming

Course Details

Course Provider

FutureLearn online courses
At FutureLearn, we want to inspire learning for life. We offer a diverse selection of free, high quality online courses from some of the world’s leading universities and other outstanding cultural institutions. Our aim is to connect learners from all over the globe with high quality educators, and with each other. We believe learning should be an enjoyable, social experience, with plenty of opportunities to discuss what you’ve studied, in order to make fresh discoveries and form new ideas. Courses are delivered one step at a time, and are accessible on mobile, tablet and desktop, so you can fit learning around your life, rather than your life around learning. We are a private company wholly owned by The Open University, with the benefit of over 40 years of their experience in distance learning and online education. Our partners include over 20 of the best UK and international universities, as well as institutions with a huge archive of cultural and educational material, including the British Council, the British Library, and the British Museum.

Provider Subject Specialization
Humanities
Sciences & Technology
126 reviews

Course Description

This online course introduces several high-performance computing (HPC) tools for big data analysis, including:

  • R – a programming language renowned for its simplicity, elegance and community support, enriched with packages for working with Hadoop. The RStudio IDE will be used to prepare and run R scripts;
  • Hadoop – an open-source, Java-based programming framework for processing large data sets.

A basic knowledge of Bash and AWK helps in understanding Hadoop, so we also introduce them briefly.

Through a range of materials, including hands-on exercises, you will learn how to use these tools while avoiding common pitfalls, saving you time and money.

What topics will you cover?

  • First steps in R and RStudio
  • Working with Apache Hadoop 1 – Fundamentals
  • Working with Apache Hadoop 2 – RHadoop
  • Statistical learning using RHadoop

What will you achieve?

By the end of the course, you will:

  • understand how the performance of modern supercomputing is achieved;
  • be able to perform basic tasks in the Bash terminal window;
  • be able to use AWK for basic text processing tasks;
  • understand the basic functionality of Apache Hadoop for scalable, distributed computing;
  • be able to perform data operations of medium difficulty using R and RHadoop;
  • understand the basic problems of supervised and unsupervised learning;
  • be able to apply clustering, regression and classification methods using RHadoop.

This course is designed for people interested in data science, computational statistics and machine learning. It will also be useful for advanced undergraduate students and first year PhD students in data analysis, statistics or bioinformatics, who wish to understand HPC.

We expect participants to have basic experience with Linux, Bash and R, and to be able to download and run a virtual machine.

All software needed to participate actively in the course is provided within the virtual machine, which participants are expected to download and run on their local machine. No extra software is needed. You will need a modest local machine with 15GB of free disk space and 2GB of RAM.

Reviews 0/10 stars
0 Reviews for Managing Big Data with R and Hadoop

Rankings are based on a provider's overall CourseTalk score, which takes into account both average rating and number of ratings. Stars round to the nearest half.
