Hortonworks (HDP) Hadoop Administration Overview

This Hortonworks (HDP) Hadoop Administration training program provides online training in the skills required for a successful career in Hadoop administration and data engineering. Learn to create and manage a Hadoop cluster using Ambari, the HDP management console. You will also learn how to configure high availability for the NameNode and ResourceManager, run the Balancer tool, take backups and snapshots, and create a secure cluster using Kerberos and Ranger.

Hortonworks (HDP) Hadoop Administration Key Features

  • 60 hours of blended learning
  • Includes a real industry-based project
  • Includes three assignment-based exams to test Hadoop administration skills
  • Lifetime access to self-paced learning
  • Dedicated mentoring session from industry experts

Skills Covered

  • Hadoop and HDP administration skills
  • Hadoop architecture and its ecosystem tools
  • Hive, HBase and Spark Administration

Hortonworks (HDP) Hadoop Administration Curriculum

Backup, Recovery and Maintenance

Learning Objectives - In this module, you will understand day-to-day cluster administration tasks such as adding and removing DataNodes, NameNode recovery, configuring backup and recovery in Hadoop, diagnosing node failures in the cluster, and upgrading Hadoop.

Topics - Configuring rack awareness, setting up Hadoop backup, whitelisting and blacklisting DataNodes in a cluster, setting up quotas, upgrading a Hadoop cluster, copying data across clusters using distcp, diagnostics and recovery, cluster maintenance
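To illustrate the rack awareness topic: Hadoop can call an external topology script, named by the net.topology.script.file.name property, passing one or more host names or IP addresses as arguments and reading one rack path per argument from standard output. The following is a minimal sketch of such a script, not course material. The addresses and rack names in the table are invented for illustration and would be replaced with the real inventory of the cluster.

```python
#!/usr/bin/env python3
"""Sketch of an HDFS rack-awareness topology script (hypothetical mapping)."""
import sys

# Hypothetical host-to-rack table, invented for this example.
RACK_BY_HOST = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}
DEFAULT_RACK = "/default-rack"  # fallback for hosts missing from the table


def resolve(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts to resolve as command-line arguments.
    print("\n".join(resolve(sys.argv[1:])))
```

Keeping the lookup in a separate function makes the mapping easy to test before the script is deployed to the NameNode.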

Advanced Topics: QJM, HDFS Federation and Security

Learning Objectives - In this module, you will understand the basics of Hadoop security, managing security with Kerberos, HDFS Federation setup, and log management. You will also understand HDFS High Availability using the Quorum Journal Manager (QJM).

Topics - Configuring HDFS Federation, Basics of Hadoop Platform Security, Securing the Platform, Configuring Kerberos
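As a small worked example of the QJM idea mentioned in this module: the active NameNode must write each edit to a majority of JournalNodes, so a quorum of N JournalNodes tolerates at most (N - 1) / 2 failures, rounded down. The function name below is chosen for illustration only.

```python
def tolerated_journalnode_failures(journal_nodes: int) -> int:
    """Number of JournalNodes that may fail while a majority is still reachable."""
    if journal_nodes < 1:
        raise ValueError("need at least one JournalNode")
    return (journal_nodes - 1) // 2


for n in (3, 4, 5):
    print(n, "JournalNodes tolerate", tolerated_journalnode_failures(n), "failure(s)")
```

The arithmetic shows why odd counts are usually preferred: a fourth JournalNode adds no extra fault tolerance over three.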

Kafka Administration

  • Basic Kafka Concepts
  • Kafka vs Other Messaging Systems
  • Intra-Cluster Replication
  • An Inside Look at Kafka’s Components
  • Log Administration, Retention, and Compaction
  • Hardware and Runtime Configurations
  • Monitoring and Alerting
  • Cluster Administration
  • Securing Kafka
  • Using Kafka Connect to Move Data

Data Loading

Here we will learn the different data loading options available in Hadoop and look in detail at Flume and Sqoop to demonstrate how to bring various kinds of data, such as web server logs, streaming data, RDBMS tables, and Twitter tweets, into HDFS.

YARN Queue Management

Configuration management

YARN Architecture

YARN Node Configuration

Memory Consumption parameters

Performance tuning of MR and YARN

Logging and troubleshooting YARN Jobs

YARN Capacity Scheduler

YARN Isolation
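For the Capacity Scheduler topic above, one common configuration mistake is child queue capacities that do not add up. The sketch below assumes percentage-based capacities, where the children of each parent queue are expected to sum to 100, and checks a dictionary of capacity-scheduler properties. The queue names "etl" and "adhoc" are hypothetical, and the checker itself is an illustration rather than a tool shipped with HDP.

```python
PREFIX = "yarn.scheduler.capacity."


def check_queue_capacities(props, parent="root"):
    """Return a list of problems found at or below `parent` (empty if none)."""
    raw = props.get(f"{PREFIX}{parent}.queues", "")
    children = [name.strip() for name in raw.split(",") if name.strip()]
    if not children:
        return []  # leaf queue: nothing to check

    problems = []
    total = 0.0
    complete = True
    for child in children:
        key = f"{PREFIX}{parent}.{child}.capacity"
        if key in props:
            total += float(props[key])
        else:
            complete = False
            problems.append(f"missing {key}")
    # Only report the sum when every child capacity was present.
    if complete and abs(total - 100.0) > 1e-6:
        problems.append(f"{parent}: child capacities sum to {total:g}, expected 100")

    # Repeat the same check one level down for each child queue.
    for child in children:
        problems.extend(check_queue_capacities(props, f"{parent}.{child}"))
    return problems


# Hypothetical two-queue layout that is 10 points short at the root level.
example = {
    "yarn.scheduler.capacity.root.queues": "etl,adhoc",
    "yarn.scheduler.capacity.root.etl.capacity": "60",
    "yarn.scheduler.capacity.root.adhoc.capacity": "30",
}
print(check_queue_capacities(example))
```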

Hadoop Cluster: Planning and Managing

Learning Objectives - In this module, you will understand Planning and Managing a Hadoop Cluster, Hadoop Cluster Monitoring and Troubleshooting, Analysing logs, and Auditing. You will also understand Scheduling and Executing MapReduce Jobs, and different Schedulers.

Topics - Planning the Hadoop Cluster, Cluster Size, Hardware and Software considerations, Managing and Scheduling Jobs, types of schedulers in Hadoop, Configuring the schedulers and running MapReduce jobs, Cluster Monitoring and Troubleshooting
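The cluster size topic can be made concrete with back-of-the-envelope arithmetic. The sketch below multiplies the data volume by the HDFS replication factor (3 by default), adds headroom for temporary and intermediate data, and divides by the usable disk per DataNode. The 25% headroom and 48 TB per node are illustrative assumptions, not recommendations from the course.

```python
import math


def datanodes_needed(data_tb, replication=3, temp_overhead=0.25,
                     usable_tb_per_node=48.0):
    """Estimate the DataNode count from storage needs alone.

    temp_overhead and usable_tb_per_node are assumed values for illustration;
    CPU, memory, and growth over time are deliberately ignored here.
    """
    raw_tb = data_tb * replication * (1 + temp_overhead)
    return math.ceil(raw_tb / usable_tb_per_node)


# 100 TB of data -> 375 TB raw -> 8 nodes at 48 TB usable each.
print(datanodes_needed(100))
```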

HDP 3.1 and High Availability

Learning Objectives - In this module, you will understand Secondary NameNode setup and checkpointing, HDP 3.1 new features, HDFS High Availability, the YARN framework, and MRv2.

Topics - Configuring Secondary NameNode, HDP 3.1, YARN framework, MRv2, HDP 3.1 Cluster setup, Deploying HDP 3.1 in pseudo-distributed mode, deploying a multi-node HDP 3.1 cluster

Hadoop Cluster Installation Using Hortonworks Data Platform Distribution in AWS Cloud

Hadoop Installation

Hadoop Configuration properties

Hadoop Installation and Initial Configuration

Deploying Hadoop in pseudo-distributed mode on a local machine

Multinode Cluster setup on AWS Cloud

Deploying a multi-node Hadoop cluster on AWS Cloud

Add and Remove Cluster Nodes

Zookeeper Configuration

Hadoop Architecture and Cluster setup

Learning Objectives - After this module, you will understand the multiple Hadoop server roles such as NameNode and DataNode, and MapReduce data processing. You will also understand the HDP 3.1 cluster setup and configuration, setting up Hadoop clients using HDP 3.1, and important Hadoop configuration files and parameters.

Topics - Hadoop server roles and their usage, Rack Awareness, Anatomy of Write and Read, Replication Pipeline, Data Processing, Hadoop Installation and Initial Configuration, Deploying Hadoop in pseudo-distributed mode, deploying a multi-node Hadoop cluster, Installing Hadoop Clients

HDFS Lab: Understanding how blocks are created in HDFS and the physical location of the blocks

HDFS Configuration properties walkthrough

Interfacing HDFS through Command line and Browser

Namenode UI

HDFS shell commands to write, read, and delete files and directories, and to export data from HDFS to the local file system.
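The block lab above can be previewed with a short calculation. HDFS splits a file into fixed-size blocks (128 MB by default in Hadoop 2 and later, set by dfs.blocksize), and the last block occupies only the bytes that remain. The helper below is a hypothetical illustration of that split, not an HDFS API.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=128 * MB):
    """Return the size in bytes of each HDFS block for a file of file_size bytes."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    # The final block is only as large as the leftover data.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


# A 300 MB file becomes two full 128 MB blocks plus one 44 MB block.
print([size // MB for size in split_into_blocks(300 * MB)])
```

On a running cluster, the actual blocks and the DataNodes holding them can be listed with hdfs fsck <path> -files -blocks -locations.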

Hadoop Cluster Administration

Learning Objectives - In this module, you will understand what Big Data and Apache Hadoop are, how Hadoop solves Big Data problems, Hadoop cluster architecture, the MapReduce framework, Hadoop data loading techniques, and the role of a Hadoop cluster administrator.

Topics - Introduction to Big Data, Hadoop Architecture, MapReduce Framework, A typical Hadoop Cluster, Data Loading into HDFS, Hadoop Cluster Administrator: Roles and Responsibilities

Industry Project

Hortonworks (HDP) Hadoop Administration Advisor

Mukesh Kumar

Brief Introduction

Mukesh has 15 years of industry experience overall. He started his career as a software project engineer and worked in different roles such as Project Lead, Software Architect, and Enterprise Architect for over 12 years. In the last 3 years, he has worked as a professional consultant and corporate trainer, conducting workshops and training programs in the area of Big Data analytics and helping clients migrate their data platforms and applications to Big Data platforms to leverage the scalability and cost-effectiveness of these platforms.

Hortonworks (HDP) Hadoop Administration Certification

Why Tech Eureka

Tech Eureka's Blended Learning model brings the classroom learning experience online with its world-class LMS. It combines instructor-led training, self-paced learning, and personalized mentoring to provide an immersive learning experience.

Classroom-in-Person

Self-Paced Online Video

Instructor-Led Online

Hortonworks (HDP) Hadoop Administration FAQs