Big Data Analytics - (Basic)

About Course

Objective

In this course, you will learn how big data is driving organisational change and the key challenges organisations face when analysing massive data sets. The course focuses on fundamental techniques such as data mining and stream processing. You will also learn how to design and implement PageRank algorithms using MapReduce, a programming paradigm that allows for massive scalability across hundreds or thousands of servers in a Hadoop cluster. Finally, you will see how big data has improved web search and how online advertising systems work.

By the end of this course, you will have a better understanding of the various applications of big data methods in industry and research.

Eligibility

Interested candidates must have prior knowledge of at least one programming language, data structures and algorithms, and SQL. This course is best suited to freshers who seek a fundamental understanding of Big Data.

Package Requisites

Software: Apache Hadoop, Java version 1.8

Modules

Module 1: Basics and Characteristics of Big Data and Dimensions of Scalability

Understand the four V’s of Big Data (Volume, Velocity, Variety, and Veracity)
Build models for data
Understand the occurrence of rare events in random data.

Module 2: Web and social networks

Understand characteristics of the web and social networks
Model social networks
Apply algorithms for community detection in networks.

Module 3: Clustering big data

Clustering social networks
Apply hierarchical clustering
Apply k-means clustering.

Module 4: Google web search

Understand the concept of PageRank
Implement the basic PageRank algorithm for strongly connected graphs
Implement PageRank with taxation for graphs that are not strongly connected.
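
To give a flavour of the PageRank topics listed above, here is a minimal Python sketch of power iteration with taxation. It is an illustration only, not course material: the function name pagerank, the damping value beta = 0.85, the fixed iteration count, and the toy graph are all assumptions, and every link target is assumed to appear as a key in the graph.

```python
def pagerank(links, beta=0.85, iterations=50):
    """Power iteration for PageRank with taxation.

    links maps each page to the list of pages it links to.
    beta is the share of rank that follows links; 1 - beta is the "tax"
    redistributed evenly to all pages (assumed value, not from the course).
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page starts each round with its share of the tax.
        new_rank = {node: (1.0 - beta) / n for node in nodes}
        for node in nodes:
            targets = links[node]
            if targets:
                share = beta * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dead end: spread its rank evenly over all pages.
                share = beta * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph; D has no incoming links, so the basic
# (untaxed) algorithm would drive its rank to zero.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
```

With taxation, page D keeps the minimum rank (1 - beta) / n even though nothing links to it, and the ranks still sum to 1.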

Module 5: Parallel and distributed computing using MapReduce

Understand the architecture for massive distributed and parallel computing
Apply MapReduce using Hadoop
Compute PageRank using MapReduce.
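
The map, shuffle, and reduce steps behind the MapReduce topics above can be illustrated in memory with a word count. This is a single-process Python sketch of the programming model only, not Hadoop's Java API; the function names map_phase, shuffle, reduce_phase, and word_count are made up for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values collected for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "Big clusters"])
```

On a real cluster the map and reduce calls would run in parallel on different machines, and the framework would perform the shuffle over the network.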

Module 6: Computing similar documents in big data

Measure importance of words in a collection of documents
Measure similarity of sets and documents
Apply locality-sensitive hashing to compute similar documents.
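
As a small illustration of measuring the similarity of documents as sets, the Python sketch below computes the Jaccard similarity of word shingles. The helper names shingles and jaccard and the shingle length k = 2 are assumptions chosen for the example; hashing-based methods approximate this same measure when the exact comparison is too expensive.

```python
def shingles(text, k=2):
    """Return the set of k-word shingles (consecutive word tuples) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # convention assumed here for two empty sets
    return len(a & b) / len(a | b)


doc1 = shingles("the quick brown fox")
doc2 = shingles("the quick brown dog")
similarity = jaccard(doc1, doc2)
```

The two example sentences share two of four distinct shingles, so their similarity is 0.5.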

Module 7: Products frequently bought together in stores (2 Hours)

Understand the importance of frequent item sets
Design association rules
Implement the A-Priori algorithm.
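
The idea behind A-Priori, that a pair can only be frequent if both of its items are frequent, can be sketched in two passes. The Python below is a minimal illustration limited to pairs; the function name frequent_pairs, the support threshold, and the sample baskets are assumptions for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Two-pass A-Priori sketch for frequent pairs of items."""
    # Pass 1: count individual items and keep the frequent ones.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: count only pairs made up entirely of frequent items.
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    return {pair: c for pair, c in pair_counts.items() if c >= min_support}


# Hypothetical shopping baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk", "jam"],
]
result = frequent_pairs(baskets, min_support=3)
```

Here only bread and milk survive the first pass, so the second pass counts a single candidate pair instead of all six possible pairs.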

Module 8: Movie and music recommendations

Understand the differences between types of recommendation systems
Design content-based recommendation systems
Design collaborative filtering recommendation systems.

Module 9: Google’s AdWords™ System

Understand the AdWords System
Analyse online algorithms in terms of competitive ratio
Use online matching to solve the AdWords problem.

Module 10: Mining rapidly arriving data streams

Understand types of queries for data streams
Analyse sampling methods for data streams
Count distinct elements in data streams
Filter data streams.
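
One well-known sampling method for streams is reservoir sampling, which keeps a fixed-size uniform sample without knowing the stream length in advance. The Python sketch below is an illustration under that assumption; the function name reservoir_sample and its seed parameter are made up for the example, and the module may cover other sampling methods.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a random slot with probability k / (index + 1).
            j = rng.randint(0, index)  # inclusive at both ends
            if j < k:
                sample[j] = item
    return sample


picked = reservoir_sample(range(1000), k=10, seed=42)
```

Memory use stays at k items no matter how long the stream runs, which is the property that matters for rapidly arriving data.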

Outcome

Basic knowledge of Big Data
Ability to navigate through Hadoop
Ability to apply tools such as MapReduce on Hadoop

Myra's Academy