# Data Science Master Program

Statistics Essentials for Analytics

Each topic in this section covers the basics: what the technique is, when to use it, the math behind it, how to implement it with an analytics tool, and what inferences to draw from the final result.

• Understanding the Data
• Probability and its Uses
• Statistical Inference
• Data Clustering
• Testing the Data
• Regression Modelling

Data Science with Python

Module 1: Introduction to Data Science

• What is Data Science?
• What is Machine Learning?
• What is Deep Learning?
• What is AI?
• Data Analytics & its types

Module 2: Introduction to Python

• What is Python?
• Why Python?
• Installing Python
• Python IDEs
• Jupyter Notebook Overview

Module 3: Python Basics

• Python Basic Data types
• Lists
• Slicing
• IF statements
• Loops
• Dictionaries
• Tuples
• Functions
• Array
• Selection by position & Labels

Module 4: Python Packages

• Pandas
• NumPy
• scikit-learn
• Matplotlib

Module 5: Importing data

• Reading CSV files
• Saving data in Python
• Writing data to a CSV file

Module 6: Manipulating Data

• Selecting rows/observations
• Rounding numbers
• Selecting columns/fields
• Merging data
• Data aggregation
• Data munging techniques

Module 7: Statistics Basics

• Central Tendency
  ◦ Mean
  ◦ Median
  ◦ Mode
  ◦ Skewness
  ◦ Normal Distribution
• Probability Basics
  ◦ What is meant by probability?
  ◦ Types of Probability
  ◦ Odds Ratio
• Standard Deviation
  ◦ Data deviation & distribution
  ◦ Variance
• Bias-Variance Trade-off
  ◦ Underfitting
  ◦ Overfitting
• Distance Metrics
  ◦ Euclidean Distance
  ◦ Manhattan Distance
• Outlier Analysis
  ◦ What is an Outlier?
  ◦ Interquartile Range
  ◦ Box & whisker plot
  ◦ Upper Whisker
  ◦ Lower Whisker
  ◦ Scatter plot
  ◦ Cook’s Distance
• Missing Value Treatments
  ◦ What is an NA?
  ◦ Central Imputation
  ◦ KNN Imputation
  ◦ Dummification
• Correlation
  ◦ Pearson correlation
  ◦ Positive & Negative correlation
• Error Metrics
  ◦ Classification: Confusion Matrix, Precision, Recall, Specificity, F1 Score
  ◦ Regression: MSE, RMSE, MAPE
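
As a rough illustration of the regression error metrics that close Module 7 (MSE, RMSE, MAPE), the sketch below computes them in plain Python. The function names and sample values are hypothetical and chosen only for this example; it is not taken from the course material.

```python
import math

def mse(actual, predicted):
    """Mean squared error: the average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: square root of MSE, in the units of the target."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (the ratio is undefined otherwise).
    """
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical sample values, for illustration only
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print("MSE :", mse(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("MAPE:", mape(actual, predicted))
```

MSE penalises large errors more heavily than small ones, RMSE puts that penalty back on the scale of the target, and MAPE expresses the error relative to the actual values.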

Module 8: Machine Learning

Module 9: Supervised Learning

• Linear Regression
  ◦ Linear Equation
  ◦ Slope
  ◦ Intercept
  ◦ R-squared value
• Logistic Regression
  ◦ Odds ratio
  ◦ Probability of success
  ◦ Probability of failure
  ◦ ROC curve
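
The logistic regression terms listed in this module (odds ratio, probability of success, probability of failure) are closely related. The following is a minimal sketch of those relationships in plain Python; the function names and the probabilities used are hypothetical, picked only to illustrate the idea.

```python
import math

def odds(p_success):
    """Odds = P(success) / P(failure), where P(failure) = 1 - P(success)."""
    return p_success / (1 - p_success)

def odds_ratio(p_a, p_b):
    """Ratio of the odds in group A to the odds in group B."""
    return odds(p_a) / odds(p_b)

def logistic(z):
    """Map a linear score z (intercept + slope * x) to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Hypothetical probabilities of success, for illustration only
p_a, p_b = 0.8, 0.5

print("odds (A):   ", odds(p_a))
print("odds ratio: ", odds_ratio(p_a, p_b))
print("log-odds:   ", math.log(odds(p_a)))
# Applying the logistic function to the log-odds recovers the probability
print("probability:", logistic(math.log(odds(p_a))))
```

Logistic regression models the log-odds as a linear function of the inputs, which is why the logistic function converts the fitted score back into a probability of success.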

Module 10: Unsupervised Learning

• K-Means
• K-Means++
• Hierarchical Clustering

Module 11: Other Machine Learning algorithms

• K-Nearest Neighbours
• Naïve Bayes Classifier
• Decision Tree – CART
• Decision Tree – C5.0
• Random Forest

Data Science with R Language

Module 1: Introduction to Data Science Methodologies

• Data Types: Numerical, Categorical
• Introduction to Data Science Tools: R, Python, WEKA, RapidMiner
• Statistics
• Approach to Business Problems

Module 2: Correlation / Association, Regression, Categorical Variables

• Introduction to Correlation
• Spearman Rank Correlation
• OLS Regression – Simple and Multiple
• Dummy variables
• Multiple regression
• Assumptions violation – MLE estimates
• Using UCI ML repository dataset or Built-in R dataset

Module 3: Data Preparation

• Data preparation & Variable identification
• Parameter Estimation / Interpretation
• Robust Regression
• Accuracy in Parameter Estimation
• Using UCI ML repository dataset or Built-in R dataset

Module 4: Logistic Regression

• Introduction to Logistic Regression
• Logit Function
• Training-Validation approach
• Lift charts
• Decile Analysis
• Using UCI ML repository dataset or Built-in R dataset

Module 5: Cluster Analysis, Classification Models

• Introduction to Cluster Techniques
• Distance Methodologies
• Hierarchical and Non-Hierarchical Procedure
• K-Means clustering
• Introduction to decision trees/segmentation with Case Study
• Using UCI ML repository dataset or Built-in R dataset

Module 6: Introduction to Forecasting Techniques

• Introduction to Time Series
• Data and Analysis
• Decomposition of Time Series
• Trend and Seasonality detection and forecasting
• Exponential Smoothing
• Building R Dataset
• Sales forecasting Case Study

Module 7: Advanced Time Series Modeling

• Box-Jenkins Methodology
• Introduction to Auto Regression and Moving Averages, ACF, PACF
• Detecting order of ARIMA processes
• Seasonal ARIMA Models (p,d,q)(P,D,Q)
• Introduction to Multivariate Time-series Analysis
• Using built-in R datasets

Module 8: Stock market prediction

• Live example / live project
• Using client-provided stock prices / collecting stock price data

Module 9: Pharmaceuticals

• Case Study with the Data
• Based on open datasets

Module 10: Market Research

• Case Study with the Data
• Based on open datasets

Module 11: Machine Learning

• Supervised Learning Techniques
• Conceptual Overview
• Unsupervised Learning Techniques
• Association Rule Mining, Segmentation

Module 12: Fraud Analytics

• Fraud Identification Process in Parts Procurement
• Sample data from online sources
• Text Analytics

Module 13: Text Analytics

• Sample text from online sources

Module 14: Social Media Analytics

• Social Media Analytics
• Sample text from online sources

Tableau

Module 1: Tableau Course Material

• Start Page
• Show Me
• Connecting to Excel Files
• Connecting to Text Files
• Connect to Microsoft SQL Server
• Connecting to Microsoft Analysis Services
• Creating and Removing Hierarchies
• Bins
• Joining Tables
• Data Blending

Module 2: Learn Tableau Basic Reports

• Parameters
• Grouping Example 1
• Grouping Example 2
• Edit Groups
• Sets
• Combined Sets
• Creating a First Report
• Data Labels
• Create Folders
• Sorting Data
• Add Totals, Sub Totals and Grand Totals to Report

Module 3: Learn Tableau Charts

• Area Chart
• Bar Chart
• Box Plot
• Bubble Chart
• Bump Chart
• Bullet Graph
• Circle Views
• Dual Combination Chart
• Dual Lines Chart
• Funnel Chart
• Traditional Funnel Charts
• Gantt Chart
• Grouped Bar or Side by Side Bars Chart
• Heatmap
• Highlight Table
• Histogram
• Cumulative Histogram
• Line Chart
• Lollipop Chart
• Pareto Chart
• Pie Chart
• Scatter Plot
• Stacked Bar Chart
• Text Label
• Tree Map
• Word Cloud
• Waterfall Chart

Module 4: Learn Tableau Advanced Reports

• Dual Axis Reports
• Blended Axis
• Individual Axis
• Add Reference Lines
• Reference Bands
• Reference Distributions
• Basic Maps
• Symbol Map
• Use Google Maps
• Mapbox Maps as a Background Map
• WMS Server Map as a Background Map

Module 5: Learn Tableau Calculations & Filters

• Calculated Fields
• Basic Approach to Calculate Rank
• Advanced Approach to Calculate Rank
• Calculating Running Total
• Filters Introduction
• Quick Filters
• Filters on Dimensions
• Conditional Filters
• Top and Bottom Filters
• Filters on Measures
• Context Filters
• Slicing Filters
• Data Source Filters
• Extract Filters

Module 6: Learn Tableau Dashboards

• Create a Dashboard
• Format Dashboard Layout
• Create a Device Preview of a Dashboard
• Create Filters on Dashboard
• Dashboard Objects
• Create a Story

Module 7: Server

• Tableau Online
• Overview of Tableau Server
• Publishing Tableau objects and scheduling/subscription