How big is BIG? Become a big data expert through an intensive training program, customised across levels and designed specifically for you. Participants solve real-world problems on large datasets. Through this intensive program we aim to prepare participants to appear for the international certifications mentioned below:
There are 4 modules in the Big Data Specialization.
Introduction: Definition - Data Science in various fields - Examples - Impact of Data Science - Major activities - Toolkit - The Data Scientist - Comparison with other roles - The Data Science team.
Introduction to R: What is R - Data Science with other languages - Features of R - Environment - R at a glance.
Basics of R (Series & Control Statements): Assignment - Modes - Operators - Special numbers - Logical values - Basic functions - Generating data sets - Control structures.
Vectors: Definition - Declaration - Generating - Indexing - Naming - Adding & removing elements - Operations on vectors - Recycling - Special operators - Functions for vectors - Missing values - NULL values - Filtering & subsetting - Exercises.
Descriptive Statistics: Introduction - Descriptive statistics - Central tendency - Variability - Mean - Median - Range - Variance - Summary - Exercises.
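The course implements these measures in R; as a language-neutral sketch, Python's standard `statistics` module computes the same quantities on a small made-up sample:

```python
# Descriptive statistics on a toy data set (sample values are
# illustrative only; the course itself uses R for this).
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)           # central tendency
median = statistics.median(data)       # middle value of the sorted data
rng = max(data) - min(data)            # range: simplest measure of spread
variance = statistics.pvariance(data)  # population variance

print(mean, median, rng, variance)
```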
Graphics: Introduction - Types - Packages - Basic graph - Histograms - Stem-and-leaf graph - Box plots - Scatter plots - Bar plots.
Arrays: Creating Arrays - Dimensions & Naming - Indexing & Naming - Functions on Arrays.
Matrices : Creating Matrices - Adding rows/columns - Removing rows/columns - Reshaping - Operations - Special functions.
Lists: Creating - Naming - Accessing elements - Adding - Removing - Special Functions - Recursive Lists.
Data frames: Creating - Naming - Accessing - Adding - Removing - Special functions - Merging - Exercises.
Functions: Creating - Functions on Function Object - Scope of Variables - Accessing Global Environment - Closures - Recursion - Creating New Binary Operator.
Linear Regression: Inferential statistics - Types of learning - Linear regression - Simple linear regression - Coefficients - Confidence interval - RSE - R² - Implementation in R - lm - Functions on lm - predict - Plotting - Fitting the regression line - Exercises.
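The module fits these models with R's lm(); purely as a sketch of the underlying least-squares formulas, the same simple-linear-regression coefficients can be computed in plain Python on toy data (values assumed for illustration):

```python
# Simple linear regression coefficients via least squares.
# Toy data chosen so y = 2x exactly (slope 2, intercept 0).
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# slope b1 = sum((xi - mx)(yi - my)) / sum((xi - mx)^2)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
     / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx  # intercept passes through the mean point

print(b0, b1)  # → 0.0 2.0
```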
Multiple Linear Regression: Introduction - Comparison with simple linear regression - Correlation matrix - F-statistic - Response vs. predictors - Deciding important variables - Model fit - Predictions.
Generating a model - Interaction terms - Non-linear transformations - ANOVA - lm with polynomial terms - Exercises.
Classification & Logistic Regression : Classification - Examples - Logistic Regression Definition - Estimating coefficients - Predictions - Multiple Logistic Regression - More than 2 response classes - Implementation in R - glm - predict Exercises.
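In the course the model is estimated with R's glm(); as an illustrative sketch of prediction only, a fitted logistic model turns a linear term into a probability via the sigmoid. The coefficients below are made up for the example:

```python
# Prediction step of logistic regression (coefficients are
# hypothetical; estimating them is what glm() does in the course).
import math

def predict_prob(b0, b1, x):
    """Logistic model: P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*x)))."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

# At x = 2 the linear term b0 + b1*x is exactly 0, so P = 0.5:
p = predict_prob(-4.0, 2.0, 2.0)
print(p)  # → 0.5
```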
Classification: Linear Discriminant Analysis - Quadratic Discriminant Analysis - K-Nearest Neighbors - Exercises.
Support Vector Machines: Maximal margin classifier - Support Vector Classifier - Support vector machine - SVM with more than 2 classes - Exercises.
Neural Networks : Introduction - Nodes & Weights - Layered Architecture - Learning Rule - Implementation in R - Normalizing data - Creating training data sets - Fitting Neural Network - neuralnet - Plotting NN - Predictions - Denormalize - MSE - Exercises.
Clustering: Unsupervised learning - Principal Component Analysis (PCA) - Clustering methods: K-means - Exercises.
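The course runs K-means through R's kmeans(); this minimal one-dimensional Python sketch (points and starting centers are made up) shows the assign-then-update loop behind it:

```python
# Minimal 1-D K-means: repeat (assign points to nearest center,
# move each center to its cluster mean).
def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[i].append(p)
        # Update step: move each center to its cluster mean
        # (keep the old center if its cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

final = kmeans_1d([1.0, 2.0, 10.0, 11.0], [0.0, 5.0])
print(final)  # → [1.5, 10.5]
```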
Introduction: Data - Storage - Big data - Distributed environments - Hadoop introduction - History - Environment - Benefits - Subprojects: HDFS, MapReduce, Pig, HBase, Hive, ZooKeeper, Sqoop, Mahout, MongoDB, HadoopDB.
Hadoop Architecture: Overall architecture - NameNode - DataNode - Fault tolerance - Read & write operations - Interfaces (command-line interface, JSP, API) - HDFS shell - FS shell commands - Java API programs.
Map-Reduce: Introduction - Map-Reduce architecture - YARN architecture - Basic Map-Reduce programs - Detailed description of Map-Reduce methods - Exercises.
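The course writes these as Java Map-Reduce programs; the map → shuffle → reduce flow itself can be sketched in a few lines of plain Python, using word count, the classic example:

```python
# Conceptual MapReduce word count: a mapper emits (word, 1) pairs,
# a shuffle groups them by key, and a reducer sums each group.
from collections import defaultdict

def mapper(line):
    for word in line.split():
        yield word, 1

def reducer(word, counts):
    return word, sum(counts)

lines = ["big data", "big big deal"]

# Shuffle/sort phase: group all mapper output by key.
groups = defaultdict(list)
for line in lines:
    for word, one in mapper(line):
        groups[word].append(one)

result = dict(reducer(w, c) for w, c in groups.items())
print(result)  # → {'big': 3, 'data': 1, 'deal': 1}
```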
Key/value pairs - Different value types from a mapper - GenericWritable - Custom values from a mapper - Writable - Custom keys from a mapper - WritableComparable - Exercises.
Input format - FileInputFormat - Steps for input - RecordReader - Custom FileInputFormat - Custom RecordReader - Exercises. Output format - FileOutputFormat - RecordWriter - Custom FileOutputFormat - Custom RecordWriter.
Combiners - Partitioners - Secondary Sorting - Exercises.
Joins- various types - Reduce Side joins - Distributed Cache - Map-Side Join - Exercises.
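The course implements reduce-side joins in Java Map-Reduce; conceptually, records from both inputs are grouped on the join key and then combined per key, as this Python sketch with hypothetical users/orders data shows:

```python
# Conceptual reduce-side join: group both data sets by the join key
# (user id), then pair each user with their orders per key.
from collections import defaultdict

users  = [(1, "alice"), (2, "bob")]          # (user_id, name)
orders = [(1, "book"), (1, "pen"), (2, "lamp")]  # (user_id, item)

# "Shuffle": everything with the same key lands in one group.
grouped = defaultdict(lambda: {"user": None, "orders": []})
for uid, name in users:
    grouped[uid]["user"] = name
for uid, item in orders:
    grouped[uid]["orders"].append(item)

# "Reduce": combine the tagged records within each group.
joined = [(rec["user"], item)
          for rec in grouped.values() for item in rec["orders"]]
print(joined)  # → [('alice', 'book'), ('alice', 'pen'), ('bob', 'lamp')]
```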
Introduction - Types of data ingestion - Ingesting batch data - Ingesting streaming data - Examples.
Introduction - Sqoop architecture - Connecting to a MySQL database - Sqoop import - Export - Eval - Joins - Exercises.
Introduction - Flume architecture - Flume master - Flume agents - Flume collectors - Creation of Flume configuration files - Examples - Exercises.
Introduction - Pig data flow engine - MapReduce vs. Pig - Data types - Basic Pig programming - Modes of execution in Pig - Miscellaneous commands: Group, Filter, Join, Order, Flatten, Cogroup, Illustrate, Explain - Parameter substitution - Creating simple UDFs in Pig - Examples - Exercises.
Introduction - Data types - Variables - Control structures - Strings - Classes - Methods - Objects.
Traits - Mixins - Packages - Lists - Sets - Maps - Tuples.
Understand packaging, traits, collections, and functional programming with Scala. Be able to write Scala scripts for a given problem.
Introduction - Motivation - Importance - Architecture - Interfaces - Basic Programs
Understand the overall concept of Spark and be able to write basic Spark commands and small programs.
Introduction - Concept - Creating RDD - Loading Data from LFS & HDFS - Operations - Transformations - Actions - Persistence
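Spark itself is driven from Scala in this module; purely as an analogy, the distinction between lazy transformations and eager actions can be mimicked with Python generators, where nothing runs until an action consumes the pipeline:

```python
# Analogy for RDD laziness: generator expressions (like Spark
# transformations) build a pipeline; only the final reduction
# (like a Spark action) actually evaluates it.
data = range(1, 6)                           # source "RDD" of 1..5

squared = (x * x for x in data)              # "transformation": lazy map
evens = (x for x in squared if x % 2 == 0)   # "transformation": lazy filter

total = sum(evens)                           # "action": triggers evaluation
print(total)                                 # → 20  (4 + 16)
```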
Participants are able to handle various types of data sets and large files, and to implement scripts for various RDD operations.
Introduction & Concept - DataFrames - Operations on DataFrames - SQL applications - Hive from Spark SQL - Reading & writing from Hive tables.
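Spark SQL runs SQL over DataFrames; as a small local stand-in for the idea of querying structured data with SQL, Python's built-in sqlite3 module works on a toy table (table and column names here are assumptions for illustration, not course material):

```python
# Illustrates the SQL-over-structured-data idea locally with sqlite3;
# in the course the same style of query runs on Spark DataFrames.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount INT)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("east", 10), ("east", 20), ("west", 5)])

rows = con.execute(
    "SELECT region, SUM(amount) FROM sales "
    "GROUP BY region ORDER BY region").fetchall()
print(rows)  # → [('east', 30), ('west', 5)]
```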
Expert-level handling of data sets with SQL commands; implement scripts for complex data-handling operations.
Basic knowledge of Java and MySQL will help.
Most learners are able to complete the Specialization in about four and a half months.
We recommend taking the courses in the order presented, as each subsequent course will build on material from previous courses.
Upon completion of the Big Data Specialization, you will be able to help a company with the following:
This specialization will unlock great career opportunities as a Hadoop developer. Become a Hadoop expert by learning concepts like Pig, Hive, Flume and Sqoop. Get industry-ready with some of the best Big Data projects and real-life use-cases.
Big data technologies like Hadoop and cloud-based analytics provide substantial cost advantages.
Analytics has always involved attempts to improve decision making, and big data doesn't change that. Large organizations are seeking both faster and better decisions with big data, and they're finding them. Driven by the speed of Hadoop and in-memory analytics, several companies are now focused on speeding up existing decisions.
Perhaps the most interesting use of big data analytics is to create new products and services for customers. Online companies have done this for a decade or so, but now predominantly offline firms are doing it too.
You can pay by Credit Card, Debit Card or Net Banking from all the leading banks. We use a Payment Gateway.
There are no prerequisites for learning R beyond basic statistical knowledge.
Basic knowledge of Java will help.
Teleuniv is associated with Keshav Memorial Institute of Technology, one of the top-performing colleges in Hyderabad, so recruitment firms contact us for our students' profiles from time to time. Since there is strong demand for this skill, we help our certified students connect with prospective employers. That said, please understand that we do not guarantee placement; however, if you go through the course diligently and complete the assignments and exercises, you will have a very good chance of getting a job.