HKUST

MATH 5470: Statistical Machine Learning
Spring 2025


Course Information

Synopsis

This course covers several topics in statistical machine learning.


Prerequisite: Some preliminary coursework in (statistical) machine learning, applied statistics, and deep learning will be helpful.

Instructors:

Yuan Yao

Time and Place:

Mon 6:30-9:20pm, G009A, CYT Bldg, HKUST

References (textbooks)

An Introduction to Statistical Learning, with applications in R (ISLR). By James, Witten, Hastie, and Tibshirani

ISLR-python, by Jordi Warmenhoven.

ISLR-Python: Labs and Applied, by Matt Caudill.

Manning: Deep Learning with Python, by Francois Chollet [GitHub source in Python 3.6 and Keras 2.0.8]

MIT: Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

Tutorials: preparation for beginners

Python-Numpy Tutorials by Justin Johnson

scikit-learn Tutorials: An Introduction to Machine Learning in Python

Jupyter Notebook Tutorials

PyTorch Tutorials

Deep Learning: Do-it-yourself with PyTorch, A course at ENS

Tensorflow Tutorials

MXNet Tutorials

Theano Tutorials

The Elements of Statistical Learning (ESL). 2nd Ed. By Hastie, Tibshirani, and Friedman

statlearning-notebooks, by Sujit Pal: Python implementations of the R labs for the StatLearning: Statistical Learning online course from Stanford, taught by Profs. Trevor Hastie and Rob Tibshirani.

Homework and Projects:

TBA (To Be Announced)

Schedule

Date Topic Instructor Scribe
03/02/2025, Mon Lecture 01: A Historic Overview and Introduction to Supervised Learning. [ slides (pdf) ] Y.Y.
07/02/2025, Fri Seminar.
    [ Mathematics Colloquium ]
  • Title: Theoretical Evaluation of Data Reconstruction Error and Induced Optimal Defenses [ announcement ]
  • Speaker: Prof. Qi LEI, New York University
  • Time: Friday Feb 7, 2025, 10:30am-noon
  • Abstract: Data reconstruction attacks and defenses are crucial for understanding data leakage in machine learning and federated learning. However, previous research has largely focused on empirical observations of gradient inversion attacks, lacking a theoretical framework for quantitatively analyzing reconstruction errors based on model architecture and defense methods. In this talk, we propose framing the problem as an inverse problem, enabling a theoretical and systematic evaluation of data reconstruction attacks. For various defense methods, we derive the algorithmic upper bounds and matching information-theoretical lower bounds on reconstruction error for two-layer neural networks, accounting for feature and architecture dimensions as well as defense strength. We further propose two defense strategies — Optimal Gradient Noise and Optimal Gradient Pruning — that maximize reconstruction error while maintaining model performance.
  • Bio: Qi Lei is an assistant professor of Mathematics and Data Science at the Courant Institute of Mathematical Sciences and the Center for Data Science at NYU. Previously, she was an associate research scholar in the ECE department of Princeton University. She received her Ph.D. from the Oden Institute for Computational Engineering & Sciences at UT Austin. She visited the Institute for Advanced Study (IAS)/Princeton for the Theoretical Machine Learning Program. Before that, she was a research fellow at the Simons Institute for the Foundations of Deep Learning Program. Her research aims to develop mathematical groundings for trustworthy and (sample- and computationally) efficient machine learning algorithms. Qi has received several awards/recognitions, including Rising Stars in Machine Learning, in EECS, and in Statistics and Data Science, the Outstanding Dissertation Award, Computing Innovative Fellowship, and Simons-Berkeley Research Fellowship.
    [ Relevant Reference ]:
  • Zihan Wang, Jason D. Lee, Qi Lei. Reconstructing Training Data from Model Gradient, Provably [ link ]
  • Sheng Liu*, Zihan Wang*, Yuxiao Chen, Qi Lei. Data Reconstruction Attacks and Defenses: A Systematic Evaluation. [ link ]
  • Yuxiao Chen, Gamze Gürsoy, Qi Lei. Optimal Defenses Against Gradient Reconstruction Attacks. [ link ]
Y.Y.

by YAO, Yuan.