Big Data Analytics - (Advanced) | Myra's Academy


Big Data Analytics – (Advanced)

Categories: Software & Technology

About Course

Course Objective

Big Data is one of the fastest-growing and most promising fields in technology today. To make the most of these opportunities, you need structured training with an up-to-date curriculum that reflects current industry requirements and best practices.

Besides a strong theoretical understanding, you need hands-on work on real-world big data projects using a range of Big Data and Hadoop tools. This Big Data Analytics course covers Big Data and Hadoop Ecosystem tools such as HDFS, YARN, MapReduce, Hive, and Pig in depth.

This course will help you gain a comprehensive understanding of the tools in the Hadoop Ecosystem, such as Pig, Hive, Sqoop, Flume, Oozie, and HBase.

Course Eligibility

There are no prerequisites for this Big Data & Hadoop course. Prior knowledge of Core Java and SQL is helpful but not mandatory.

Package Requisites

A CloudLab environment accessible from a web browser.

Course Modules

Module 1: Understanding Big Data and Hadoop

In this module, you will understand

  1. What is Big Data
  2. The limitations of the traditional solutions for Big Data problems
  3. How Hadoop solves those Big Data problems
  4. Hadoop Ecosystem
  5. Hadoop Architecture
  6. HDFS
  7. Anatomy of File Read and Write
  8. How MapReduce works.

Module 2: Hadoop Architecture and HDFS

In this module, you will learn

  • Hadoop Cluster Architecture
  • Important configuration files of Hadoop Cluster
  • Data Loading Techniques using Sqoop & Flume
  • And how to set up Single Node and Multi-Node Hadoop Cluster.

Module 3: Hadoop MapReduce Framework

In this module, you will understand

  • The Hadoop MapReduce framework in depth
  • How MapReduce works on data stored in HDFS.

You will also learn advanced MapReduce concepts such as Input Splits, Combiner, and Partitioner.
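As a rough illustration of how input splits, a combiner, and a partitioner fit together, here is a minimal single-process sketch in plain Python. It is not Hadoop code: the function names (`map_phase`, `combine`, `partition`, `reduce_phase`, `word_count`) and the sample data are invented for this example, and a real job would run these stages across a cluster.

```python
from collections import defaultdict
import zlib


def map_phase(split):
    """Mapper: emit (word, 1) for every word in one input split."""
    for line in split:
        for word in line.lower().split():
            yield word, 1


def combine(pairs):
    """Combiner: pre-aggregate counts locally before the shuffle."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partition(key, num_reducers):
    """Partitioner: route each key to one reducer using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_phase(key, values):
    """Reducer: sum all partial counts for one key."""
    return key, sum(values)


def word_count(splits, num_reducers=2):
    # One bucket per reducer; each bucket groups values by key (the shuffle).
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        for key, value in combine(map_phase(split)):
            buckets[partition(key, num_reducers)][key].append(value)

    result = {}
    for bucket in buckets:
        for key in sorted(bucket):
            word, total = reduce_phase(key, bucket[key])
            result[word] = total
    return result


# Two hypothetical input splits, each a list of lines.
splits = [["big data big"], ["data hadoop"]]
counts = word_count(splits)
print(counts)
```

The combiner reduces how many pairs cross the shuffle (the first split sends `("big", 2)` instead of two separate `("big", 1)` pairs), and the partitioner guarantees that every value for a given key reaches the same reducer.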

Module 4: Advanced Hadoop MapReduce

In this module, you will learn advanced MapReduce concepts such as

  • Counters
  • Distributed Cache
  • MRUnit
  • Reduce Join
  • Custom Input Format
  • Sequence Input Format
  • XML parsing.
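To make the reduce join idea concrete, the following is a small plain-Python sketch of a reduce-side join. It only imitates the pattern in one process; the datasets, the `"P"`/`"V"` tags, and the function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def map_patients(records):
    """Mapper for the first dataset: tag each value with its source."""
    for patient_id, name in records:
        yield patient_id, ("P", name)


def map_visits(records):
    """Mapper for the second dataset: tag each value with its source."""
    for patient_id, diagnosis in records:
        yield patient_id, ("V", diagnosis)


def reduce_join(key, tagged_values):
    """Reducer: pair every patient value with every visit value for one key."""
    names = [value for tag, value in tagged_values if tag == "P"]
    diagnoses = [value for tag, value in tagged_values if tag == "V"]
    return [(key, name, diagnosis) for name in names for diagnosis in diagnoses]


def run_join(patients, visits):
    # Shuffle step: group the tagged values from both mappers by join key.
    groups = defaultdict(list)
    for key, value in list(map_patients(patients)) + list(map_visits(visits)):
        groups[key].append(value)

    joined = []
    for key in sorted(groups):
        joined.extend(reduce_join(key, groups[key]))
    return joined


# Hypothetical sample data: (patient_id, name) and (patient_id, diagnosis).
patients = [(1, "Asha"), (2, "Ravi")]
visits = [(1, "flu"), (1, "cough"), (3, "fever")]
joined = run_join(patients, visits)
print(joined)
```

Because the join happens in the reducer, keys that appear in only one dataset (patient 2 and visit 3 here) produce no output, which corresponds to an inner join.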

Module 5: Apache Pig

In this module, you will learn

  • Apache Pig
  • Types of use cases where we can use Pig
  • Tight coupling between Pig and MapReduce
  • Pig Latin scripting
  • Pig running modes
  • Pig UDF
  • Pig Streaming
  • Testing Pig Scripts.

You will also be working on a healthcare dataset.

Module 6: Apache Hive

This module will help you understand

  • Hive concepts
  • Hive Data types
  • Loading and Querying data in Hive
  • Running Hive Scripts
  • And Hive UDF.

Module 7: Advanced Apache Hive and HBase

In this module, you will understand

  • Advanced Apache Hive concepts such as UDF
  • Dynamic Partitioning
  • Hive Indexes and Views
  • Optimizations in Hive.

You will also acquire in-depth knowledge of Apache HBase, HBase Architecture, HBase running modes, and its components.

Module 8: Advanced Apache HBase

This module covers advanced Apache HBase concepts, with demos on HBase Bulk Loading and HBase Filters. You will also learn what ZooKeeper is, how to monitor a cluster, and why HBase uses ZooKeeper.

Module 9: Processing Distributed Data with Apache Spark

In this module, you will learn what Apache Spark, SparkContext, and the Spark Ecosystem are. You will learn how to work with Resilient Distributed Datasets (RDDs) in Apache Spark. You will also run an application on a Spark cluster and compare the performance of MapReduce and Spark.
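The way RDD transformations are recorded lazily and only executed when an action is called can be sketched with a toy class in plain Python. `ToyRDD` is invented for this illustration and is not part of Spark; in PySpark the analogous calls would be `map`, `filter`, `collect`, and `reduce` on an RDD created through a SparkContext.

```python
import functools


class ToyRDD:
    """Toy stand-in for an RDD: partitioned data plus a recorded list of transformations."""

    def __init__(self, partitions, ops=()):
        self._partitions = partitions
        self._ops = ops

    def map(self, fn):
        # Transformation: nothing runs yet, the step is only recorded.
        return ToyRDD(self._partitions, self._ops + (("map", fn),))

    def filter(self, predicate):
        # Transformation: also lazy.
        return ToyRDD(self._partitions, self._ops + (("filter", predicate),))

    def _compute(self, partition):
        # Replay the recorded steps over a single partition.
        items = iter(partition)
        for kind, fn in self._ops:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)

    def collect(self):
        # Action: triggers computation on every partition.
        out = []
        for partition in self._partitions:
            out.extend(self._compute(partition))
        return out

    def reduce(self, fn):
        # Action: combine all computed elements into one value.
        return functools.reduce(fn, self.collect())


# Hypothetical data split into two partitions.
rdd = ToyRDD([[1, 2, 3], [4, 5, 6]])
squares_of_evens = rdd.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
collected = squares_of_evens.collect()
total = squares_of_evens.reduce(lambda a, b: a + b)
print(collected, total)
```

Each transformation returns a new object and leaves the original untouched. Because the recorded steps can be replayed against the source partitions, a lost partition can be recomputed, which is the idea behind the "resilient" in RDD.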

Outcomes

  • Understand MapReduce Framework
  • Implement complex business solutions using MapReduce
  • Learn data ingestion techniques using Sqoop and Flume
  • Perform ETL operations & data analytics using Pig and Hive
  • Implement Partitioning, Bucketing, and Indexing in Hive
  • Understand HBase, a NoSQL database in the Hadoop ecosystem, along with HBase architecture and mechanisms
  • Integrate HBase with Hive
  • Schedule jobs using Oozie
  • Implement best practices for Hadoop development
  • Understand Apache Spark and its Ecosystem
  • Learn how to work with RDD in Apache Spark