HKUST

MATH 5470: Statistical Machine Learning
Spring 2022


Course Information

Synopsis

This course covers several topics in statistical machine learning, including supervised learning (linear regression and classification, model assessment and selection, nonlinear methods, tree-based ensembles, and support vector machines), deep learning (convolutional and recurrent neural networks, and Transformers), and unsupervised, robust, and self-supervised learning (PCA, autoencoders, VAEs, and GANs).


Prerequisite: A preliminary course in (statistical) machine learning, applied statistics, or deep learning will be helpful.

Instructors:

Yuan Yao

Time and Place:

Mon/Wed 10:30-11:50am, Rm 2503, Lift 25-26 (87), HKUST, and via Zoom (link on CANVAS)

Reference Textbooks

An Introduction to Statistical Learning, with Applications in R (ISLR), by James, Witten, Hastie, and Tibshirani

ISLR-python, by Jordi Warmenhoven.

ISLR-Python: Labs and Applied, by Matt Caudill.

Manning: Deep Learning with Python, by Francois Chollet [GitHub source in Python 3.6 and Keras 2.0.8]

MIT: Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

Tutorials: preparation for beginners

Python-Numpy Tutorials by Justin Johnson

scikit-learn Tutorials: An Introduction to Machine Learning in Python

Jupyter Notebook Tutorials

PyTorch Tutorials

Deep Learning: Do-It-Yourself with PyTorch, a course at ENS

Tensorflow Tutorials

MXNet Tutorials

Theano Tutorials

The Elements of Statistical Learning (ESL), 2nd ed., by Hastie, Tibshirani, and Friedman

statlearning-notebooks, by Sujit Pal: Python implementations of the R labs for the StatLearning: Statistical Learning online course from Stanford, taught by Profs. Trevor Hastie and Rob Tibshirani.

Homework and Projects:

TBA (To Be Announced)

Schedule

Date Topic Instructor Scribe
07/02/2022, Mon Lecture 01: A Historical Overview. [ slides (pdf) ] Y.Y.
09/02/2022, Wed Lecture 02: An Introduction to Supervised Learning [ slides ] Y.Y.
14/02/2022, Mon Lecture 03: Supervised Learning: Linear Regression and Classification [ slides ] Y.Y.
16/02/2022, Wed Lecture 04: Linear Regression and Classification [ slides ] Y.Y.
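    [ Example ]:
  • A minimal scikit-learn sketch of the two basic supervised tasks in Lectures 03-04: least-squares regression and logistic-regression classification. The data and coefficients below are synthetic placeholders, not course materials.

    # Least-squares regression and logistic classification (synthetic data).
    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))                  # 200 samples, 3 features
    y_reg = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
    y_cls = (y_reg > 0).astype(int)                # threshold to get class labels

    lin = LinearRegression().fit(X, y_reg)
    print("estimated coefficients:", lin.coef_)    # close to (1.5, -2.0, 0.5)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y_cls, random_state=0)
    clf = LogisticRegression().fit(X_tr, y_tr)
    print("test accuracy:", clf.score(X_te, y_te))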
21/02/2022, Mon Lecture 05: Model Assessment and Selection [ slides ] Y.Y.
23/02/2022, Wed Lecture 06: Model Assessment and Selection: Subset, Ridge, Lasso, and PCR [ slides ] Y.Y.
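    [ Example ]:
  • A minimal sketch of Lecture 06's shrinkage methods: ridge and lasso with cross-validated penalties in scikit-learn. The sparse ground truth and the alpha grid are illustrative choices.

    # Ridge and lasso with cross-validated regularization (synthetic data).
    import numpy as np
    from sklearn.linear_model import RidgeCV, LassoCV

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 20))
    beta = np.zeros(20)
    beta[:3] = [3.0, -2.0, 1.0]                    # only 3 of 20 features matter
    y = X @ beta + rng.normal(scale=0.5, size=100)

    ridge = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X, y)
    lasso = LassoCV(cv=5).fit(X, y)
    print("ridge penalty chosen by CV:", ridge.alpha_)
    print("lasso nonzero coefficients:", np.sum(lasso.coef_ != 0))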
28/02/2022, Mon Lecture 07: Moving beyond Linearity [ slides ] Y.Y.
02/03/2022, Wed Lecture 08: Moving beyond Linearity [ slides ] Y.Y.
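    [ Example ]:
  • A minimal sketch of the basis expansions from Lectures 07-08: polynomial and spline features followed by an ordinary linear fit (SplineTransformer assumes scikit-learn >= 1.0; the data are synthetic).

    # Polynomial vs. spline regression on a nonlinear toy function.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures, SplineTransformer
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(2)
    x = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(x).ravel() + rng.normal(scale=0.2, size=200)

    poly = make_pipeline(PolynomialFeatures(degree=5), LinearRegression()).fit(x, y)
    spline = make_pipeline(SplineTransformer(degree=3, n_knots=8),
                           LinearRegression()).fit(x, y)
    print("R^2, degree-5 polynomial:", poly.score(x, y))
    print("R^2, cubic spline:", spline.score(x, y))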
07/03/2022, Mon Lecture 09: Decision Tree, Bagging, Random Forests and Boosting [ YY's slides ] Y.Y.
09/03/2022, Wed Lecture 10: Decision Tree, Bagging, Random Forests and Boosting [ YY's slides ] Y.Y.
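    [ Example ]:
  • A minimal sketch comparing the two ensemble ideas of Lectures 09-10: averaging de-correlated trees (random forest) versus stagewise additive fitting (boosting). All settings are illustrative.

    # Random forest vs. gradient boosting on a toy classification task.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    gb = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05,
                                    random_state=0)
    print("random forest CV accuracy:", cross_val_score(rf, X, y, cv=5).mean())
    print("boosting CV accuracy:", cross_val_score(gb, X, y, cv=5).mean())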
14/03/2022, Mon Lecture 11: Support Vector Machines [ YY's slides ] Y.Y.
16/03/2022, Wed Lecture 12: Support Vector Machines [ YY's slides ] Y.Y.
    [ Reference ]:
  • To view the .ipynb files below, you may try [ Jupyter NBViewer ]
  • Python Notebook for Support Vector Machines [ svm.ipynb ]
  • Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro. The Implicit Bias of Gradient Descent on Separable Data. [ arXiv:1710.10345 ]. ICLR 2018. Shows that gradient descent on logistic regression converges to the max-margin solution.
  • Matus Telgarsky. Margins, Shrinkage, and Boosting. [ arXiv:1303.4172 ]. ICML 2013. An earlier result that gradient descent on the exponential/logistic loss leads to max margin.
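    [ Example ]:
  • Independent of the linked svm.ipynb, a minimal soft-margin SVM sketch with an RBF kernel; the toy dataset and C are illustrative, and in practice both C and gamma would be tuned by cross-validation.

    # Soft-margin SVM with an RBF kernel on a toy two-moons dataset.
    from sklearn.datasets import make_moons
    from sklearn.svm import SVC

    X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
    svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
    print("support vectors per class:", svm.n_support_)
    print("training accuracy:", svm.score(X, y))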
21/03/2022, Mon Lecture 13: An Introduction to Convolutional Neural Networks [ YY's slides ] Y.Y.
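    [ Example ]:
  • A minimal PyTorch sketch of a convolutional network for 28x28 grayscale inputs (MNIST-sized); the architecture is an illustrative placeholder, not the one in the slides.

    # A small CNN: two conv/pool stages followed by a linear classifier.
    import torch
    import torch.nn as nn

    class SmallCNN(nn.Module):
        def __init__(self, n_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),                    # 28x28 -> 14x14
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),                    # 14x14 -> 7x7
            )
            self.head = nn.Linear(32 * 7 * 7, n_classes)

        def forward(self, x):
            return self.head(self.features(x).flatten(1))

    x = torch.randn(8, 1, 28, 28)                   # a dummy batch of 8 images
    print(SmallCNN()(x).shape)                      # torch.Size([8, 10])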
23/03/2022, Wed Lecture 14: Topics on Convolutional Neural Networks [ YY's slides ] and Final Project Assignment [ project.pdf ] Y.Y.
    [ Reading Material ]:
  • Shihao Gu, Bryan Kelly and Dacheng Xiu
    "Empirical Asset Pricing via Machine Learning", Review of Financial Studies, Vol. 33, Issue 5, (2020), 2223-2273. Winner of the 2018 Swiss Finance Institute Outstanding Paper Award.
    [ link ]

  • Jingwen Jiang, Bryan Kelly and Dacheng Xiu
    "(Re-)Imag(in)ing Price Trends", Chicago Booth Report, Aug 2021
    [ link ]

    [ Reference ]:
  • Kaggle: Home Credit Default Risk [ link ]
  • Kaggle: M5 Forecasting - Accuracy, Estimate the unit sales of Walmart retail goods. [ link ]
  • Kaggle: M5 Forecasting - Uncertainty, Estimate the uncertainty distribution of Walmart unit sales. [ link ]
  • Kaggle: Ubiquant Market Prediction - Make predictions against future market data. [ link ]
  • Kaggle: G-Research Crypto Forecasting. [ link ]
  • Type-II diabetes and Alzheimer’s disease. [ slides (pdf) ] [ slides (pptx) ]
28/03/2022, Mon Lecture 15: An Introduction to Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) [ YY's slides ] Y.Y.
30/03/2022, Wed Lecture 16: An Introduction to Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) [ YY's slides ] Y.Y.
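    [ Example ]:
  • A minimal PyTorch sketch of an LSTM sequence classifier; all dimensions are illustrative placeholders.

    # An LSTM that reads a sequence and classifies it from the last hidden state.
    import torch
    import torch.nn as nn

    class SeqClassifier(nn.Module):
        def __init__(self, n_features=8, hidden=32, n_classes=2):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_classes)

        def forward(self, x):                 # x: (batch, time, features)
            _, (h_n, _) = self.lstm(x)        # h_n: final hidden state
            return self.head(h_n[-1])

    x = torch.randn(4, 50, 8)                 # 4 sequences of length 50
    print(SeqClassifier()(x).shape)           # torch.Size([4, 2])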
04/04/2022, Mon Lecture 17: Attention, Transformer and BERT [ slides ] Y.Y.
    [ Presentation ]
  • Final Project Proposal: Detect the Disrupted Brain Connectivity in Type-II Diabetes Patients [ slides ]
  • Speakers: WEI, Yue and XIE, Weiyan
06/04/2022, Wed Lecture 18: Attention, Transformer and BERT [ slides ] Y.Y.
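    [ Example ]:
  • The core operation behind the Transformer and BERT lectures is scaled dot-product attention, softmax(Q K^T / sqrt(d)) V; a minimal PyTorch sketch with illustrative shapes:

    # Scaled dot-product self-attention on a batch of sequences.
    import math
    import torch

    def attention(Q, K, V):
        d = Q.size(-1)
        scores = Q @ K.transpose(-2, -1) / math.sqrt(d)  # (batch, T, T)
        weights = scores.softmax(dim=-1)                 # each row sums to one
        return weights @ V

    Q = K = V = torch.randn(2, 10, 64)  # self-attention: Q, K, V from one input
    print(attention(Q, K, V).shape)     # torch.Size([2, 10, 64])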
11/04/2022, Mon Lecture 19: An Introduction to Unsupervised Learning: PCA, AutoEncoder, VAE, and GANs [ slides ] Y.Y.
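    [ Example ]:
  • A minimal sketch of PCA viewed as a linear encoder/decoder, the starting point of Lecture 19; the correlated synthetic data are placeholders.

    # PCA: project to 2 components ("encode") and reconstruct ("decode").
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(3)
    X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))  # correlated data

    pca = PCA(n_components=2).fit(X)
    Z = pca.transform(X)                     # low-dimensional codes
    X_hat = pca.inverse_transform(Z)         # linear reconstruction
    print("explained variance:", pca.explained_variance_ratio_.sum())
    print("reconstruction MSE:", np.mean((X - X_hat) ** 2))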
20/04/2022, Wed Lecture 20: Robust Statistics and Generative Adversarial Networks [ slides ] Y.Y.
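    [ Example ]:
  • A minimal PyTorch sketch of one alternating GAN update on 1-D toy data; the networks, target distribution, and learning rates are illustrative placeholders.

    # One discriminator step and one generator step of a vanilla GAN.
    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))  # generator
    D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # discriminator
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()

    real = torch.randn(64, 1) * 0.5 + 2.0       # "real" samples ~ N(2, 0.25)
    z = torch.randn(64, 2)                      # latent noise

    # Discriminator: push real toward label 1, generated samples toward 0.
    d_loss = (bce(D(real), torch.ones(64, 1))
              + bce(D(G(z).detach()), torch.zeros(64, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: fool the discriminator into labeling fakes as 1.
    g_loss = bce(D(G(z)), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    print("d_loss:", float(d_loss), "g_loss:", float(g_loss))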
25/04/2022, Mon Lecture 21: An Introduction to Self-supervised Learning [ slides ] Y.Y.
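    [ Example ]:
  • A minimal sketch of a SimCLR-style contrastive (NT-Xent) loss, one common objective in self-supervised representation learning; the two "views" here are random stand-ins for embeddings of two augmentations of the same batch.

    # NT-Xent: each example's positive is the other view of the same image.
    import torch
    import torch.nn.functional as F

    def nt_xent(z1, z2, tau=0.5):
        z = F.normalize(torch.cat([z1, z2]), dim=1)   # 2N unit-norm embeddings
        sim = z @ z.t() / tau                         # cosine similarity matrix
        sim.fill_diagonal_(float("-inf"))             # exclude self-similarity
        n = z1.size(0)
        targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
        return F.cross_entropy(sim, targets)

    z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
    print(float(nt_xent(z1, z2)))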
27/04/2022, Wed Seminar: Conformal Prediction Y.Y.
  • Speaker: Prof. Emmanuel Candès, Stanford University
  • Abstract: Recent progress in machine learning provides us with many potentially effective tools to learn from datasets of ever-increasing sizes and make useful predictions. How do we know that these tools can be trusted in critical and highly sensitive domains? If a learning algorithm predicts the GPA of a prospective college applicant, what guarantees do we have concerning the accuracy of this prediction? How do we know that it is not biased against certain groups of applicants? To address questions of this kind, this talk reviews a wonderful field of research known under the name of conformal inference/prediction, pioneered by Vladimir Vovk and his colleagues 20 years ago. After reviewing some of the basic ideas underlying distribution-free predictive inference, we shall survey recent progress in the field, touching upon several issues: (1) efficiency: how can we provide tighter predictions?; (2) data reuse: what do we do when data is scarce?; (3) algorithmic fairness: how do we make sure that learned models apply to individuals in an equitable manner?; and (4) causal inference: can we predict the counterfactual response to a treatment given that the patient was not treated?
  • This is the keynote talk at the Bernoulli-IMS One World Symposium on 27 Aug 2020. [ link ]
    [ Reference ]
  • Alex Gammerman, Volodya Vovk, Vladimir Vapnik. Learning by Transduction. Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998). [ arXiv:1301.7375 ]
  • Glenn Shafer, Vladimir Vovk. A Tutorial on Conformal Prediction. Journal of Machine Learning Research 9 (2008) 371-421. [ link ]
  • Jing Lei, Max G’Sell, Alessandro Rinaldo, Ryan J. Tibshirani, Larry Wasserman. Distribution-Free Predictive Inference for Regression. Journal of the American Statistical Association, 2018, 113(523):1094-1111. [ link ] [ arXiv:1604.04173 ]
  • Y. Romano, E. Patterson, E. J. Candès. Conformalized Quantile Regression. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). [ arXiv:1905.03222 ]
  • R. F. Barber, E. J. Candès, A. Ramdas, R. J. Tibshirani. Predictive Inference with the Jackknife+. Annals of Statistics, 2021. [ arXiv:1905.02928 ]
  • Y. Romano, M. Sesia, E. J. Candès. Classification with Valid and Adaptive Coverage. Advances in Neural Information Processing Systems 33 (NeurIPS 2020). [ arXiv:2006.02544 ]
  • I. Gibbs, E. J. Candès. Adaptive Conformal Inference under Distribution Shift. Advances in Neural Information Processing Systems 34 (NeurIPS 2021). [ arXiv:2106.00170 ]
  • L. Lei, E. J. Candès. Conformal Inference of Counterfactuals and Individual Treatment Effects. Journal of the Royal Statistical Society, Series B, 2021. [ arXiv:2006.06138 ]
  • R. F. Barber, E. J. Candès, A. Ramdas, R. J. Tibshirani. Conformal Prediction beyond Exchangeability. 2022. [ arXiv:2202.13415 ]
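    [ Example ]:
  • A minimal sketch of split-conformal prediction intervals in the spirit of the talk: fit on one half of the data, calibrate a residual quantile on the other half, and the resulting intervals enjoy marginal coverage under exchangeability. The model and data below are illustrative placeholders.

    # Split-conformal prediction intervals around a regression model.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(4)
    X = rng.uniform(-2, 2, size=(1000, 3))
    y = X[:, 0] ** 2 + rng.normal(scale=0.3, size=1000)

    X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, test_size=0.5,
                                                  random_state=0)
    model = RandomForestRegressor(random_state=0).fit(X_fit, y_fit)

    alpha = 0.1                                    # target 90% coverage
    scores = np.abs(y_cal - model.predict(X_cal))  # calibration residuals
    n = len(scores)
    qhat = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)

    X_new = rng.uniform(-2, 2, size=(5, 3))
    pred = model.predict(X_new)
    print(np.c_[pred - qhat, pred + qhat])         # 90% prediction intervals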
04/05/2022, Wed Final Project Presentations. Y.Y.
11/05/2022, Wed Final Project Presentations. Y.Y.
    [ Groups ]
  • LI Jiabao, ZHU Zhihan.
    Comparison of models for Ubiquant Market Prediction.
  • WEI Xinyi, KUANG Liangyawei.
    Home Credit Default Risk.
  • WEI Yue, XIE Weiyan.
    Detect the Disrupted Brain Connectivity in Type-II Diabetes Patients.
  • CAO Zhefeng, CHU Mengyuan.
    Empirical Asset Pricing via Machine Learning.
    [ Final Report Collection ]
  • Description of Final Project: [ pdf ]
  • GitHub Repository for reports of Final Project [ GitHub ]


by YAO, Yuan.