HADOOP ONLINE TRAINING COURSE IN INDIA

UNIT 1: INTRODUCTION AND OVERVIEW OF HADOOP

    • What is Hadoop?
    • History of Hadoop.
    • Building Blocks – Hadoop Ecosystem
    • Who is behind Hadoop?
    • What Hadoop is good for and what it is not

UNIT 2: HADOOP DISTRIBUTED FILESYSTEM (HDFS)

    • HDFS Overview and Architecture
    • HDFS Installation
    • HDFS Use Cases
    • Hadoop File System Shell
    • File System Java API
    • Hadoop Configuration

UNIT 3: HBASE – THE HADOOP DATABASE

  • HBase Overview and Architecture
  • HBase Installation
  • HBase Shell
  • Java Client API
  • Java Administrative API
  • Filters
  • Scan Caching and Batching
  • Key Design
  • Table Design

UNIT 4: MAP/REDUCE 2.0/YARN

  • Decomposing Problems into MapReduce Workflow
  • Using JobControl
  • Oozie Introduction and Architecture
  • Oozie Installation
  • Developing, Deploying, and Executing Oozie Workflows

UNIT 5: PIG

  • Pig Overview
  • Installation
  • Pig Latin
  • Developing Pig Scripts
  • Processing Big Data with Pig
  • Joining Data Sets with Pig

UNIT 6: HIVE

  • Hive Overview
  • Installation
  • Hive QL

UNIT 7: SQOOP

  • Introduction
  • Sqoop Tools
  • Sqoop Import
  • Sqoop Import All Tables
  • Sqoop Export
  • Sqoop Job
  • Sqoop Metastore
  • Sqoop Eval
  • Sqoop Codegen
  • Sqoop List Databases and List Tables
  • Sqoop Create Hive Table

ADVANCED COURSE CONTENT

UNIT 1: INTEGRATING HADOOP INTO THE WORKFLOW

  • Relational Database Management Systems
  • Storage Systems
  • Importing Data from RDBMSs With Sqoop
  • Hands-On Exercise
  • Importing Real-Time Data with Flume
  • Accessing HDFS Using FuseDFS and Hoop

UNIT 2: DELVING DEEPER INTO THE HADOOP API

  • More about ToolRunner
  • Testing with MRUnit
  • Reducing Intermediate Data With Combiners
  • The configure and close Methods for Map/Reduce Setup and Teardown
  • Writing Partitioners for Better Load Balancing
  • Hands-On Exercise
  • Directly Accessing HDFS
  • Using the Distributed Cache

UNIT 3: COMMON MAP REDUCE ALGORITHMS

  • Sorting and Searching
  • Indexing
  • Machine Learning With Mahout
  • Term Frequency – Inverse Document Frequency
  • Word Co-Occurrence

UNIT 4: USING HIVE AND PIG

  • Hive Basics
  • Pig Basics

UNIT 5: PRACTICAL DEVELOPMENT TIPS AND TECHNIQUES

  • Debugging MapReduce Code
  • Using LocalJobRunner Mode For Easier Debugging
  • Retrieving Job Information with Counters
  • Logging
  • Splittable File Formats
  • Determining the Optimal Number of Reducers
  • Map-Only MapReduce Jobs

UNIT 6: MORE ADVANCED MAP REDUCE PROGRAMMING

  • Custom Writables and WritableComparables
  • Saving Binary Data using SequenceFiles and Avro Files
  • Creating InputFormats and OutputFormats

UNIT 7: JOINING DATA SETS IN MAP REDUCE

  • Map-Side Joins
  • The Secondary Sort
  • Reduce-Side Joins

UNIT 8: GRAPH MANIPULATION IN HADOOP

  • Introduction to Graph Techniques
  • Representing Graphs in Hadoop
  • Implementing a Sample Algorithm: Single Source Shortest Path

UNIT 9: CREATING WORKFLOWS WITH OOZIE

  • The Motivation for Oozie
  • Oozie’s Workflow Definition Format
