Prerequisite | Basic probability and statistics |
Instructor | Dr. Ghofraniha, Jahan |
Starting Date | 3/24/2018 |
Completion Date | 5/12/2018 |
Lecturing time | Tuesday 7:30 PM to 9:30 PM; Sunday 1:30 PM to 5:30 PM |
Place | 1601 McCarthy Blvd., Milpitas, CA 95035 |
Contact | info@cstu.org |
A. COURSE DESCRIPTION
This course introduces methods and techniques for using stored data to make decisions. Students will learn to explore and analyze data, identify the patterns, associations, and relationships in it, and use this information for decision making. The course introduces fundamentals of machine learning such as regression, classification, decision trees, ensemble learning, and dimensionality reduction techniques such as principal component analysis. Specific examples of engineering and business applications of machine learning techniques will be given.
Students are required to complete a course project using modern data analysis software and case studies. The course focuses on implementing ML algorithms using Python and the scikit-learn library.
B. COURSE OBJECTIVES
- To learn how computational procedures and techniques are employed in machine learning.
- To provide insights into the implementation details of machine learning strategies.
- To gain hands-on experience with machine learning tools.
Textbooks:
An Introduction to Statistical Learning with Applications in R
- Series: Springer Texts in Statistics (Book 103)
- Hardcover: 426 pages
- Publisher: Springer; 1st ed. 2013, Corr. 5th printing 2015 edition (August 12, 2013)
- Language: English
- ISBN-10: 1461471370
- ISBN-13: 978-1461471370
Hands-on Machine Learning with Scikit-Learn and TensorFlow
- Paperback: 576 pages
- Publisher: O'Reilly Media; 1st edition (April 9, 2017)
- Language: English
- ISBN-10: 1491962291
- ISBN-13: 978-1491962299
Week 1: Statistical Learning and machine learning overview
- Supervised vs. unsupervised learning
- Assessing model accuracy
- Introduction to Python and ML libraries
- Dealing with data
- Data visualization
- Data cleaning
- Selecting and training a model
- Basic exercise and homework using Python
Week 2: Classification
- Binary classification
- Multiclass classification
- Error analysis
- Cross validation
- Measuring accuracy using cross-validation
- Precision/Recall tradeoff
- In-class exercise
- Exercise/homework in Python and scikit-learn library
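The precision/recall tradeoff listed above can be illustrated with a minimal sketch in plain Python. The scores and labels are made-up values and the helper name `precision_recall` is hypothetical; course exercises would presumably rely on scikit-learn's metrics instead.

```python
def precision_recall(y_true, scores, threshold):
    """Return (precision, recall) when predicting 1 for scores >= threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Illustrative (made-up) classifier scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
y_true = [1, 1, 0, 1, 0, 1, 0, 0]

low_threshold = precision_recall(y_true, scores, threshold=0.5)
high_threshold = precision_recall(y_true, scores, threshold=0.75)
```

On this toy data, raising the threshold increases precision and lowers recall, which is the tradeoff in question.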
Week 3: Regression and other linear and quasi-linear model training
- Linear regression
- Computational complexity
- Gradient Descent
- Batch gradient descent
- Stochastic gradient descent
- Polynomial regression
- Learning curves
- Regularization
- Logistic regression
Hw: Midterm project announcement and discussion
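As a rough illustration of the batch gradient descent topic from this week, the sketch below fits a one-variable linear model by minimizing mean squared error. The data, learning rate, and iteration count are assumptions chosen so that the toy problem converges; they are not values from the course.

```python
def batch_gradient_descent(xs, ys, learning_rate=0.05, n_iterations=2000):
    """Fit y = w * x + b by minimizing mean squared error over the full batch."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        # Prediction errors on the whole training set (hence "batch").
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the MSE with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Noise-free toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = batch_gradient_descent(xs, ys)
```

Stochastic gradient descent would differ by computing the gradient from a single randomly chosen sample per step rather than from the full set.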
Week 3: Tree-based Methods
- Basics of decision trees
- Classification decision trees
- Regression decision trees
Hw: Application of decision trees in classification and regression using Python and Scikit-learn library
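To make the idea behind classification trees concrete, here is a minimal sketch of how a single split could be chosen by minimizing weighted Gini impurity, in the style of CART. The function names and the toy data are hypothetical; a full tree would apply this search recursively across all features.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best split 'value <= threshold'."""
    best = (None, float("inf"))
    n = len(values)
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2.0
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Toy single-feature data set with two well-separated classes.
values = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, score = best_threshold(values, labels)
```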
Week 4: Support Vector Machines
- Mathematical background, concept of hyperplane in n-dimension
- SVM classifier
- SVM with nonlinear decision boundaries
- SVM with more than two classes
- SVM example in Python and Scikit-learn
Hw: Using SVM for classification and regression in Python; midterm project
Week 5: Ensemble learning and Random Forests
- Voting classifier
- Bagging and Pasting in Scikit-learn
- Out-of-bag Evaluation
- Random Forests
- Feature importance
- Boosting
- AdaBoost
- Gradient Boosting
- Stacking
Hw: Comparison of Random forests vs boosting and voting classifier in Python
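The voting classifier idea from this week can be sketched without any library: each model predicts a label and the ensemble returns the majority. The three prediction lists below are made-up stand-ins for trained models, and `hard_vote` and `accuracy` are hypothetical helper names.

```python
from collections import Counter

def hard_vote(predictions_per_model):
    """Combine per-model prediction lists by majority vote for each sample."""
    combined = []
    for sample_preds in zip(*predictions_per_model):
        combined.append(Counter(sample_preds).most_common(1)[0][0])
    return combined

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Made-up predictions from three imperfect binary classifiers.
y_true = [1, 0, 1, 1, 0]
model_a = [1, 0, 1, 0, 0]
model_b = [1, 1, 0, 1, 0]
model_c = [0, 0, 1, 1, 1]

ensemble = hard_vote([model_a, model_b, model_c])
```

In this toy case every individual model makes at least one mistake, yet the majority vote is correct on all samples because the errors fall on different samples.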
Week 6: Dimensionality reduction
- Main Approaches for Dimensionality Reduction
- Projection
- Manifold learning
- PCA
- Preserving variance
- PCA for compression
- Incremental PCA
- Randomized PCA
- Kernel PCA
- Other techniques
Hw: PCA exercise using scikit-learn and final project announcement
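As an illustration of PCA and variance preservation, the following sketch finds the first principal component of a small data set using power iteration on the covariance matrix. This is a simplified stand-in: library implementations typically use a singular value decomposition instead, and the data and function name here are assumptions.

```python
import math

def first_principal_component(rows, n_iterations=200):
    """Return (unit direction, explained variance ratio) of the first PC."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue
    # (assuming the start vector is not orthogonal to it).
    v = [1.0] * d
    for _ in range(n_iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance

# Toy data lying exactly on the line y = 2x.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, ratio = first_principal_component(rows)
```

Because the toy points lie on a single line, one component preserves essentially all of the variance, so projecting onto it loses no information.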
Week 7: Introduction to Artificial Neural Networks
- From biology to Artificial Neurons
- The perceptron
- Multi-layer Perceptron and backpropagation
- Number of hidden layers
- Number of neurons per layer
- Activation function
- Implementation using Python
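The perceptron learning rule mentioned above can be sketched in a few lines of plain Python. The example trains a single perceptron with a step activation on the logical AND function; the learning rate, epoch count, and function names are assumptions made for illustration.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(samples, targets, learning_rate=1.0, n_epochs=20):
    """Train a single perceptron with the classic error-driven update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(n_epochs):
        for x, target in zip(samples, targets):
            error = target - predict(weights, bias, x)
            # Weights change only when the prediction is wrong (error != 0).
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

# Logical AND is linearly separable, so a single perceptron can learn it.
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
targets = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, targets)
predictions = [predict(weights, bias, x) for x in samples]
```

A function such as XOR is not linearly separable, which is one motivation for the multi-layer perceptron and backpropagation topics.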
Week 8: Unsupervised Learning and final project presentations
- Clustering methods
- K-means clustering
- Hierarchical clustering
- Self-organizing map
- Kohonen SOM
- SOM example
Hw: Final project presentation and Final exam
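For the K-means topic in this week, a minimal sketch of the assignment and update steps (Lloyd's algorithm) might look as follows. The points and the fixed initial centroids are made up; library implementations usually choose initial centroids more carefully.

```python
def k_means(points, centroids, n_iterations=10):
    """Lloyd's algorithm for 2-D points, starting from the given centroids."""
    assignments = []
    for _ in range(n_iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                              + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(
                    (sum(m[0] for m in members) / len(members),
                     sum(m[1] for m in members) / len(members)))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[k])
        centroids = new_centroids
    return centroids, assignments

# Two visually obvious groups of points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignments = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
```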
Tip: If you are unemployed, government funding may cover the tuition. Please email info@cstu.org for more information.