Data Science Online Training
Data Science Online Training Course Content
Introduction to Python Programming
- Data
- Big Data
- Data Science Deep Dive
- Intro to R Programming
- R Programming Concepts
- Data Manipulation in R
- Data Import Techniques in R
- Exploratory Data Analysis (EDA) using R
- Introduction to Data Science
- Introduction to Python
- Basic Operations in Python
- Variable Assignment & Examples
- Functions: in-built functions, user defined functions & Examples
- Condition: if, if-else, nested if-else, else-if & Examples
Data Structures
- Introduction to Data Structures
- List Operations
- Different Data Types in a List, List in a List, with Examples
- Operations on a list: Slicing, Splicing, Sub-setting
- Conditions (True / False) on a List
- Applying Functions on a List
- Dictionary: Key, Value
- Operation on a Dictionary: Slicing, Splicing, Sub-setting
- Conditions (True / False) on a Dictionary
- Applying functions on a Dictionary
- NumPy Array: Data Types in an Array, Dimensions of an Array, with Examples
- Operations on Array: Slicing, Splicing, Sub-setting
- Conditions (True / False) on an Array
- Loops: For, While with Examples
- Shorthand for For with Examples
- Conditions in Shorthand for For, with Examples
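The loop topics above can be illustrated with a minimal sketch, assuming that "shorthand for For" refers to Python list comprehensions. The list `numbers` and the variable names are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data, chosen only for illustration.
numbers = [1, 2, 3, 4, 5, 6]

# Explicit for loop: collect the squares of the even numbers.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n * n)

# Shorthand form: the same result written as a list comprehension
# with a filtering condition at the end.
even_squares = [n * n for n in numbers if n % 2 == 0]

# A conditional expression (if-else) inside the comprehension labels
# every element instead of filtering elements out.
labels = ["even" if n % 2 == 0 else "odd" for n in numbers]

print(even_squares)  # [4, 16, 36]
print(labels)
```

A trailing `if` drops elements, while an `if ... else` placed before the `for` keeps every element and only changes the value produced.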
Basics of Statistics
- Introduction to Statistics & Plotting
- Introduction to Seaborn & Matplotlib
- Univariate Analysis of Data
- Plot the Data – Histogram plot
- Find the distribution
- Find mean, median and mode of the Data
- Multiple Data Sets with the Same Mean but Different SD, and with the Same Mean & SD but Different Kurtosis: find mean, SD, plot & Examples
- Multiple data sets with different distributions
- Bootstrapping and sub-setting
- Making samples from the Data
- Making stratified samples – covered in bivariate analysis
- Find the mean of a sample
- Central limit theorem
- Plotting
- Hypothesis testing + DOE
- Bivariate analysis
- Correlation
- Scatter plots
- Making stratified samples
- Categorical variables
- Class variable
- Use of Pandas
- File I/O
- Series: Data Types in series, Index
- Data Frame
- Series to Data Frame
- Re-indexing
- Operations on Data Frame: Slicing, Splicing (also Alternate), Sub-setting
- Pandas
- Stat operations on Data Frame
- Reading from different sources
- Missing data treatment
- Merge, join
- Options for look and feel of data frame
- Writing to file
- Database operations
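The sampling, sample-mean, and central limit theorem topics listed above can be sketched with the Python standard library alone. This is an illustrative sketch, not course material: the exponential "population", the sample size of 50, and the helper `sample_means` are all assumptions made for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed "population": exponential values with mean about 1.
population = [random.expovariate(1.0) for _ in range(10_000)]


def sample_means(data, sample_size, n_samples):
    """Draw n_samples samples with replacement and return each sample's mean."""
    return [
        statistics.fmean(random.choices(data, k=sample_size))
        for _ in range(n_samples)
    ]


means = sample_means(population, sample_size=50, n_samples=1_000)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# The mean of the sample means should sit close to the population mean,
# and their spread should be close to population SD / sqrt(sample size).
print("population mean:      ", round(pop_mean, 3))
print("mean of sample means: ", round(statistics.fmean(means), 3))
print("sd of sample means:   ", round(statistics.stdev(means), 3))
print("population sd/sqrt(50):", round(pop_sd / 50 ** 0.5, 3))
```

Although the population is strongly skewed, a histogram of `means` would look roughly bell-shaped, which is the behavior the central limit theorem describes.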
Data Manipulation & Visualization
- Data Aggregation, Filtering and Transforming
- Lambda Functions
- Apply, Group-by, Map, Filter and Reduce
- Visualization
- Matplotlib, pyplot & Seaborn
- Scatter plot, histogram, density, heat-map, bar charts
- Linear Regression
- Regression – Introduction
- Linear Regression: Lasso, Ridge
- Variable Selection
- Forward & Backward Regression
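The lambda, map, filter, reduce, and group-by topics listed above can be sketched in plain Python. The `sales` records are hypothetical, and the grouping here uses `itertools.groupby` from the standard library as a stand-in; a course on this outline would more likely use the pandas equivalent.

```python
from functools import reduce
from itertools import groupby

# Hypothetical records: (city, sale amount).
sales = [("Pune", 120), ("Delhi", 80), ("Pune", 200), ("Delhi", 50), ("Goa", 30)]

# map: transform every record into its amount.
amounts = list(map(lambda row: row[1], sales))

# filter: keep only the amounts of at least 80.
large = list(filter(lambda amount: amount >= 80, amounts))

# reduce: fold all amounts into a single total.
total = reduce(lambda acc, amount: acc + amount, amounts, 0)

# group-by: itertools.groupby only groups adjacent rows,
# so the records are sorted by city first.
by_city = {
    city: sum(amount for _, amount in rows)
    for city, rows in groupby(sorted(sales, key=lambda r: r[0]), key=lambda r: r[0])
}

print(amounts)  # [120, 80, 200, 50, 30]
print(large)    # [120, 80, 200]
print(total)    # 480
print(by_city)  # {'Delhi': 130, 'Goa': 30, 'Pune': 320}
```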
Logistic Regression
- Logistic Regression: Lasso, Ridge
- Naive Bayes
- Unsupervised Learning – Introduction
- Distance Concepts, Classification, k-Nearest Neighbors
- Clustering, k-Means, Multidimensional Scaling
- PCA
- Random Forest
- Decision trees
- CART, C4.5
- Random Forest
- Boosted Trees
- Gradient Boosting
- SVM
- SVM – Introduction
- Hyper-plane
- Hyper-plane to segregate two classes
- Gamma
- Data Visualization in R
- Big Data and Hadoop Introduction
- Understand Hadoop Cluster Architecture
- Map Reduce Concepts
- Advanced Map Reduce Concepts
- Hadoop 2.0 & YARN
- PIG
- HIVE
- HBASE
- SQOOP
- Flume & Oozie
- Statistics + Machine Learning
- Python
- Machine Learning Using Python
- Projects