Big Data & Hadoop
About Big Data & Hadoop
Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analysed for insights that lead to better decisions and strategic business moves.
What will you learn?
● What Big Data is
● The problems Big Data poses
● Hadoop as a solution
● The main tools used with Hadoop:
1. MapReduce
2. Pig
3. Hive
4. Sqoop
5. Spark
Course Content
Session 1: Theory & Software Distribution
● Brief Explanation of Big Data
● Problems of Big Data and Hadoop as the Solution
● What We Can Do with Hadoop
● Cloudera and VMware Software Distribution
● Installation of Cloudera Software and VMware
Session 2: Performing Tasks with Cloudera and Introduction to Pig
● Introduction to Cloudera
● MapReduce with Eclipse
● Writing Java Code in Eclipse IDE
● Performing a Word Count
● Introduction to Pig
● Word Count with Pig
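To give a feel for the word-count exercise before the session, here is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It is not Hadoop or Pig code and does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample sentences are assumptions made up for illustration only.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group the emitted values by word (the step between map and reduce)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped values for each word, as a reducer would."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for this sketch.
    sample = ["big data needs hadoop", "hadoop handles big data"]
    print(word_count(sample))
```

In the course itself the same idea is carried out on a cluster, where the framework distributes the map and reduce steps across machines; this sketch runs everything in one process.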
Session 3: Introduction to Hive and Sqoop
● Introduction to Hive
● Creating a table in a Hive database and uploading data
● What Sqoop is and where to use it
● Uploading data quickly with Sqoop
Session 4: Introduction to Spark
● Introduction to Spark
● Brief explanation of how Spark is used in data processing
● Performing some actions in Spark
● Final output and more examples