About Hadoop Training
The Big Data and Hadoop training courses offered at Newgen Infotech are in huge demand in the job market today. As one of the leading Big Data Hadoop training institutes, Newgen Infotech lets you learn new skills through a professional, hands-on training approach. Its well-designed, industry-aligned Big Data courses train students thoroughly for a highly competitive industry. Training in Big Data and Analytics can lead to better job opportunities. To learn the latest in this technology, join Newgen Infotech today.
Program Benefits & Highlights
- Learn from & interact with renowned industry experts
- Receive an unparalleled education in Big Data and Hadoop with personal one-on-one attention
- A Hadoop e-book will also be provided to students during the workshop
- Practical demonstrations of Hadoop will be covered during the workshop
- Hands-on practical exercises will be included in the workshop
Hadoop Training Objective
The current wave of “Big Data” brings tremendous opportunities, and the deluge of big data is likely to persist. Tools for handling big data will eventually become mainstream and commonplace, and almost everyone will be working with it; this workshop focuses on preparing participants to work with Hadoop.
Hadoop Training Duration
- Regular classroom-based training is available for this course.
- Fast track (one-on-one): can be arranged at the participant's convenience.
- 45 working days, one and a half hours daily
1. Big Data and Hadoop
1.1. Introduction to Big Data and Hadoop
1.2. Hadoop Architecture
1.3. Installing Ubuntu with Java 1.8 on VMware Workstation 11
1.4. Hadoop Versioning and Configuration
1.5. Single Node Hadoop 1.2.1 installation on Ubuntu 14.04.1
1.6. Multi Node Hadoop 1.2.1 installation on Ubuntu 14.04.1
1.7. Linux commands and Hadoop commands
1.8. Cluster architecture and block placement
1.9. Modes in Hadoop
1.9.1. Local Mode
1.9.2. Pseudo Distributed Mode
1.9.3. Fully Distributed Mode
1.10. Hadoop Daemon
1.10.1. Master Daemons (NameNode, Secondary NameNode, JobTracker)
1.10.2. Slave Daemons (DataNode, TaskTracker)
1.11. Task Instance
1.12. Hadoop HDFS Commands
1.13. Accessing HDFS
1.13.1. CLI Approach
1.13.2. Java Approach
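As a taste of the Java approach in 1.13.2, here is a minimal sketch that writes and re-checks a file on HDFS through the Hadoop 1.x FileSystem API. The NameNode URI, path, and message are illustrative assumptions, not course material.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsWriteExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hadoop 1.x property; assumes a local single-node cluster on port 9000
            conf.set("fs.default.name", "hdfs://localhost:9000");
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path("/user/demo/hello.txt"); // hypothetical path
            try (FSDataOutputStream out = fs.create(path)) {
                out.writeUTF("Hello HDFS"); // write a small test file
            }
            System.out.println("File exists: " + fs.exists(path));
            fs.close();
        }
    }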
2. Map-Reduce
2.1. Understanding Map Reduce Framework
2.2. Introduction to the Word-Count Example (implemented in the sketch after this section)
2.3. Developing Map-Reduce Program using Eclipse Luna
2.4. HDFS Read-Write Process
2.5. Map-Reduce Life Cycle Method
2.6. Serialization (Java)
2.7. Datatypes
2.8. Comparator and Comparable (Java)
2.9. Custom Output File
2.10. Analysing Temperature dataset using Map-Reduce
2.11. Custom Partitioner & Combiner
2.12. Running Map-Reduce in Local and Pseudo Distributed Mode.
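The word-count program referenced in 2.2 and 2.3 can be written against the new (org.apache.hadoop.mapreduce) API roughly as below; a minimal sketch, with input and output paths taken from the command line.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Emits (word, 1) for every whitespace-separated token in a line
        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        ctx.write(word, ONE);
                    }
                }
            }
        }

        // Sums the counts for each word; also reused as the combiner
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "word count"); // Hadoop 1.x constructor
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }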
3. Advanced Map-Reduce
3.1. Enum (Java)
3.2. Custom and Dynamic Counters (see the sketch after this section)
3.3. Running Map-Reduce in Multi-node Hadoop Cluster
3.4. Custom Writable
3.5. Side Data Distribution
3.5.1. Using Configuration
3.5.2. Using DistributedCache
3.5.3. Using Stringifier
3.6. Input Formatters
3.6.1. NLine Input Formatter
3.6.2. XML Input Formatter
3.7. Sorting
3.7.1. Primary Reverse Sorting
3.7.2. Secondary Sorting
3.8. Compression Technique
3.9. Working with Sequence File Format
3.10. Working with AVRO File Format
3.11. Testing MapReduce with MRUnit
3.12. Working with NYSE DataSets
3.13. Working with Million Song DataSets
3.14. Running Map-Reduce in Cloudera Box
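A Java enum doubles as a group of custom counters (items 3.1 and 3.2). This sketch tallies valid and malformed records while parsing a comma-separated temperature dataset; the station,temperature input layout is an assumption for illustration.

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class TemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        // Each enum constant becomes a counter the framework aggregates per job
        public enum Records { VALID, MALFORMED }

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split(",");
            if (parts.length != 2) {
                ctx.getCounter(Records.MALFORMED).increment(1);
                return;
            }
            try {
                int temperature = Integer.parseInt(parts[1].trim());
                ctx.getCounter(Records.VALID).increment(1);
                ctx.write(new Text(parts[0]), new IntWritable(temperature));
            } catch (NumberFormatException e) {
                ctx.getCounter(Records.MALFORMED).increment(1); // non-numeric reading
            }
        }
    }

The final counter totals are printed with the job summary and can also be read back from the Job object after completion.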
4. HIVE
4.1. Hive Introduction & Installation
4.2. Data Types in Hive
4.3. Commands in Hive
4.4. Exploring Internal and External Table
4.5. Partitions
4.6. Complex data types
4.7. UDF in Hive
4.7.1. Built-in UDF
4.7.2. Custom UDF
4.8. Thrift Server
4.9. Java to Hive Connection (see the sketch after this section)
4.10. Joins in Hive
4.11. Working with HWI
4.12. Bucket Map-side Join
4.13. More commands
4.13.1. View
4.13.2. Sort By
4.13.3. Distribute By
4.13.4. Lateral View
4.14. Running Hive in Cloudera
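For 4.9 (Java to Hive Connection), a minimal JDBC sketch could look like the following, assuming a HiveServer2 instance on its default port 10000 and a hypothetical stocks table; setups based on the older Thrift server would use the jdbc:hive:// URL and its matching driver instead.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveJdbcExample {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver"); // HiveServer2 driver
            try (Connection con = DriverManager.getConnection(
                    "jdbc:hive2://localhost:10000/default", "", "");
                 Statement stmt = con.createStatement()) {
                // Create a simple table if it is not already there
                stmt.execute("CREATE TABLE IF NOT EXISTS stocks (symbol STRING, price DOUBLE) "
                        + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");
                // Run an aggregate query; Hive compiles it to Map-Reduce
                try (ResultSet rs = stmt.executeQuery(
                        "SELECT symbol, MAX(price) FROM stocks GROUP BY symbol")) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + "\t" + rs.getDouble(2));
                    }
                }
            }
        }
    }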
5. SQOOP
5.1. Sqoop Installations and Basics
5.2. Importing Data from Oracle to HDFS (see the example after this section)
5.3. Advanced Imports
5.4. Real-Time Use Case
5.5. Exporting Data from HDFS to Oracle
5.6. Running Sqoop in Cloudera
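A typical Sqoop import from Oracle into HDFS (item 5.2) looks roughly like the command below; the connection string, credentials file, and table name are placeholders, not course data.

    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott \
      --password-file /user/demo/sqoop.pw \
      --table EMPLOYEES \
      --target-dir /user/demo/employees \
      --num-mappers 4

Sqoop turns this into a map-only job, with --num-mappers controlling how many parallel connections split the table.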
6. PIG
6.1. Installation and Introduction
6.2. WordCount in Pig (see the script after this section)
6.3. NYSE in Pig
6.4. Working With Complex Datatypes
6.5. Pig Schema
6.6. Miscellaneous Commands
6.6.1. Group
6.6.2. Filter
6.6.3. Order
6.6.4. Distinct
6.6.5. Join
6.6.6. Flatten
6.6.7. Co-group
6.6.8. Union
6.6.9. Illustrate
6.6.10. Explain
6.7. UDFs in Pig
6.8. Parameter Substitution and Dry Run
6.9. Pig Macros
6.10. Running Pig in Cloudera
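Word count in Pig (item 6.2) takes only a few lines of Pig Latin; this sketch assumes a plain text file at the hypothetical HDFS path shown.

    -- split each line into words, group identical words, and count them
    lines  = LOAD '/user/demo/input.txt' AS (line:chararray);
    words  = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
    grpd   = GROUP words BY word;
    counts = FOREACH grpd GENERATE group AS word, COUNT(words) AS cnt;
    STORE counts INTO '/user/demo/wordcount_out';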
7. HBase
7.1. HBase Introduction & Installation
7.2. Exploring HBase Shell
7.3. HBase Storage Technique
7.4. Working with HBase from Java
7.5. CRUD with HBase (see the sketch after this section)
7.6. Hive HBase Integration
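Items 7.4 and 7.5 reduce to a handful of client calls. The sketch below uses the classic HTable API from the HBase 0.9x line (newer releases moved to the Connection/Table interfaces); the table and column family names are assumptions.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseCrudExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
            // Assumes a table 'students' with column family 'info' already exists
            HTable table = new HTable(conf, "students");
            try {
                Put put = new Put(Bytes.toBytes("row1"));      // Create / Update
                put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Asha"));
                table.put(put);

                Get get = new Get(Bytes.toBytes("row1"));      // Read
                Result result = table.get(get);
                byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
                System.out.println("name = " + Bytes.toString(name));

                table.delete(new Delete(Bytes.toBytes("row1"))); // Delete
            } finally {
                table.close();
            }
        }
    }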
8. OOZIE
8.1. Installing Oozie
8.2. Running Map-Reduce with Oozie (see the workflow sketch below)
8.3. Running Pig and Sqoop with Oozie
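An Oozie workflow is an XML file. A minimal map-reduce action (item 8.2) looks roughly like this sketch, where ${jobTracker} and ${nameNode} come from the accompanying job.properties file and the mapper/reducer classes and paths are placeholders for the old (mapred) API.

    <workflow-app name="wordcount-wf" xmlns="uri:oozie:workflow:0.4">
      <start to="wordcount"/>
      <action name="wordcount">
        <map-reduce>
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <configuration>
            <property><name>mapred.mapper.class</name><value>demo.WordMapper</value></property>
            <property><name>mapred.reducer.class</name><value>demo.SumReducer</value></property>
            <property><name>mapred.input.dir</name><value>/user/demo/input</value></property>
            <property><name>mapred.output.dir</name><value>/user/demo/output</value></property>
          </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
      </action>
      <kill name="fail">
        <message>Map-Reduce failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
      </kill>
      <end name="end"/>
    </workflow-app>

The workflow is uploaded to HDFS and launched with oozie job -config job.properties -run, after which Pig and Sqoop actions (8.3) slot into the same structure.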