Date |
Topic |
Instructor |
Scriber |
02/05/2025, Wed |
Lecture 01: Syllabus, Principal Component Analysis, and Multidimensional Scaling
[ Class outline ]
[ PCA-MDS slides ]
[Homework 1]:
- Homework 1 [pdf]. Just for fun, no grading; but I'll read your submissions and give you bonus credits.
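A minimal numpy sketch (an illustration, not part of the course code) of the lecture's central fact: classical MDS applied to Euclidean distances reproduces the PCA scores, up to column signs.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
X = X - X.mean(axis=0)              # center the data

# PCA scores: top-2 left singular vectors scaled by singular values
U, s, Vt = np.linalg.svd(X, full_matrices=False)
pca_scores = U[:, :2] * s[:2]

# Classical MDS: double-center the squared Euclidean distance matrix
n = X.shape[0]
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D2 @ J               # recovers the Gram matrix X @ X.T
w, V = np.linalg.eigh(B)
idx = np.argsort(w)[::-1][:2]
mds_scores = V[:, idx] * np.sqrt(w[idx])

# classical MDS on Euclidean distances equals PCA, up to column signs
for k in range(2):
    c = np.sign(pca_scores[:, k] @ mds_scores[:, k])
    assert np.allclose(pca_scores[:, k], c * mds_scores[:, k], atol=1e-8)
```

The equivalence holds only for Euclidean input distances; with non-Euclidean dissimilarities the two methods part ways, which is the opening for the manifold-learning lectures later in the course.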
|
Y.Y. |
|
02/07/2025, Fri |
Seminar.
[ Mathematics Colloquium ]
- Title: Theoretical Evaluation of Data Reconstruction Error and Induced Optimal Defenses [ announcement ]
- Speaker: Prof. Qi LEI, New York University
- Time: Friday Feb 7, 2025, 10:30am-noon
- Abstract: Data reconstruction attacks and defenses are crucial for understanding data leakage in machine learning and federated learning. However, previous research has largely focused on empirical observations of gradient inversion attacks, lacking a theoretical framework for quantitatively analyzing reconstruction errors based on model architecture and defense methods.
In this talk, we propose framing the problem as an inverse problem, enabling a theoretical and systematic evaluation of data reconstruction attacks. For various defense methods, we derive the algorithmic upper bounds and matching information-theoretical lower bounds on reconstruction error for two-layer neural networks, accounting for feature and architecture dimensions as well as defense strength. We further propose two defense strategies — Optimal Gradient Noise and Optimal Gradient Pruning — that maximize reconstruction error while maintaining model performance.
- Bio:
Qi Lei is an assistant professor of Mathematics and Data Science at the Courant Institute of Mathematical Sciences and the Center for Data Science at NYU. Previously she was an associate research scholar at the ECE department of Princeton University. She received her Ph.D. from the Oden Institute for Computational Engineering & Sciences at UT Austin. She visited the Institute for Advanced Study (IAS)/Princeton for the Theoretical Machine Learning Program. Before that, she was a research fellow at the Simons Institute for the Foundations of Deep Learning Program. Her research aims to develop mathematical groundings for trustworthy and (sample- and computationally) efficient machine learning algorithms. Qi has received several awards/recognitions, including Rising Stars in Machine Learning, in EECS, and in Statistics and Data Science, the Outstanding Dissertation Award, the Computing Innovative Fellowship, and the Simons-Berkeley Research Fellowship.
[ Relevant Reference ]:
- Zihan Wang, Jason D. Lee, Qi Lei. Reconstructing Training Data from Model Gradient, Provably [ link ]
- Sheng Liu*, Zihan Wang*, Yuxiao Chen, Qi Lei. Data Reconstruction Attacks and Defenses: A Systematic Evaluation. [ link ]
- Yuxiao Chen, Gamze Gürsoy, Qi Lei. Optimal Defenses Against Gradient Reconstruction Attacks. [ link ]
|
Y.Y. |
|
02/08/2025, Sat |
Lecture 02: Horn's Parallel Analysis and Random Matrix Theory for PCA, Sufficient Dimensionality Reduction and Supervised PCA (Chap 2: 3-5)
[ slides ] [ Sufficient Dimensionality Reduction and Supervised PCA ]
[Time and Venue]:
- 3:00PM - 5:50PM, G009A, CYT Bldg (80)
[Reference]:
- Horn's Parallel Analysis in R: [ paran.R ]
- Parallel Analysis in Matlab: [ papca.m ]
- Parallel Analysis in Python by LI, Zhen: [ paPCA_curve.py ] [ paPCA_image.py ]
- Marcenko-Pastur Law of Wishart matrices in Matlab: [ mp.m ]
- S&P500 dataset in class: [ snp500.Rda ] [ snp452-data.mat ] [ snp500.txt ]
- [Johnstone06] High dimensional statistical inference and random matrices, ICM 2006.
- [KN08 for multi-rank signal] S. Kritchman and B. Nadler, Determining the number of components in a factor model from limited noisy data, Chemometrics and Intelligent Laboratory Systems 94(1):19-32, 2008.
- [ NB10: multi-rank subspace ] R. R. Nadakuditi and F. Benaych-Georges, The breakdown point of signal subspace estimation, IEEE Sensor Array and Multichannel Signal Processing Workshop (2010), Jerusalem, Israel, 2010, pp. 177-180, doi: 10.1109/SAM.2010.5606726.
- Florent Benaych-Georges and Raj Rao Nadakuditi (2009) The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices.
- [Parallel Analysis: Horn (1965) original paper]
- [Remarks on Parallel Analysis: Buja-Eyuboglu (1992) with random permutation]
- [Raul Rabadan (2018)]: applications of RMT in single cell data analysis
- Dennis Cook, Fisher Lecture: Dimensionality Reduction in Regression. Statistical Science, 22(1):1-26, 2007.
- Ker-Chau Li, Sliced Inverse Regression for Dimension Reduction. Journal of the American Statistical Association, 86(414):316-327, 1991.
- Wu, Liang, and Mukherjee. Localized Sliced Inverse Regression. NIPS 2009. [Matlab codes]
- Jiang B and Liu JS. (2014) Variable selection for general index models via sliced inverse regression. Annals of Statistics, 42:1751-1786. [ R codes ]
- Wolfgang Hardle and Leopold Simar. Applied Multivariate Statistical Analysis. Chapter 18.3: Sliced Inverse Regression.
[Homework 2]:
- Homework 2 [pdf]. Just for fun, no grading; but I'll read your submissions and give you bonus credits.
|
Y.Y. |
|
02/19/2025, Wed |
Lecture 03: High Dimensional Sample Mean: Inadmissibility of MLE and James-Stein Estimators (Chap 2: 1-2) [ slides ]
[Reference]:
- Comparing Maximum Likelihood Estimator and James-Stein Estimator in R: [ JSE.R ]
- Computer Age Statistical Inference, by Efron and Hastie, contains an empirical Bayes derivation of JSE in Section 7.1 and the baseball player example in 7.2: [ link ]
- James-Stein estimator via multi-task ridge regression: Yaqi Duan, Kaizheng Wang (2022) Adaptive and Robust Multi-task Learning, [ arXiv:2202.05250 ]
[Homework 3]:
- Homework 3 [pdf]. Just for fun, no grading; but I'll read your submissions and give you bonus credits.
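A Python analogue of the JSE.R demo linked above (a toy illustration, not the course code): for Z ~ N(theta, I_p) with p >= 3, the James-Stein estimator has strictly smaller total squared-error risk than the MLE Z itself.

```python
import numpy as np

rng = np.random.default_rng(0)
p, trials = 50, 2000
theta = rng.normal(scale=0.5, size=p)        # fixed true mean vector

mle_risk = js_risk = 0.0
for _ in range(trials):
    z = theta + rng.normal(size=p)           # one observation Z ~ N(theta, I_p)
    js = (1 - (p - 2) / (z @ z)) * z         # shrink the MLE toward the origin
    mle_risk += ((z - theta) ** 2).sum()
    js_risk += ((js - theta) ** 2).sum()
mle_risk /= trials                            # concentrates near p = 50
js_risk /= trials                             # strictly smaller for p >= 3
```

Because theta here is close to the shrinkage target (the origin), the improvement is dramatic; the JS risk is still below p even when theta is far away, just by a smaller margin.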
|
Y.Y. |
|
02/26/2025, Wed |
Lecture 04: Random Projections, Johnson-Lindenstrauss Lemma, and Applications in Compressed Sensing etc. (Chap 3) [ slides ]
[Reference]:
- Joseph Salmon's lecture on Johnson-Lindenstrauss Theory [ JLlemma.pdf ]
- Random Projections in Scikit-learn: [ link ]
- Dennis Amelunxen, Martin Lotz, Michael B. McCoy, Joel A. Tropp. Living on the edge: Phase transitions in convex programs with random data. [ arXiv:1303.6672 ]
[Homework 4]:
- Homework 4 [pdf]. Just for fun, no grading; but I'll read your submissions and give you bonus credits.
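A quick numpy sketch of the Johnson-Lindenstrauss phenomenon (an illustration with arbitrary toy dimensions, not the course code): a Gaussian random projection to far fewer dimensions preserves pairwise distances up to small multiplicative distortion.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 5000, 800            # project from 5000 down to 800 dimensions
X = rng.normal(size=(n, d))

# Gaussian random projection; the 1/sqrt(k) scaling preserves
# squared norms in expectation
R = rng.normal(size=(d, k)) / np.sqrt(k)
Y = X @ R

# distortion of pairwise distances on 100 random pairs
ratios = []
for i, j in rng.integers(0, n, size=(100, 2)):
    if i == j:
        continue
    ratios.append(np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j]))
ratios = np.array(ratios)
```

The JL lemma guarantees distortion 1 ± eps for all pairs once k is of order (log n)/eps^2, independent of the ambient dimension d; scikit-learn's random_projection module (linked above) packages the same construction.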
|
Y.Y. |
|
03/05/2025, Wed |
Lecture 05: Robust PCA, Sparse PCA, and Graph Realization (MDS) with Uncertainty -- Semidefinite Programming approach [ slides ]
[Homework 5]:
- Homework 5 [pdf]. Just for fun, no grading; but I'll read your submissions and give you bonus credits.
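A compact sketch of Robust PCA as Principal Component Pursuit, solved here by a basic ADMM with the standard parameter choice lam = 1/sqrt(max(m, n)) (a toy illustration; the SDP formulation in lecture is equivalent but this first-order scheme is what one typically runs).

```python
import numpy as np

def svt(M, tau):
    # singular value thresholding: prox operator of tau * nuclear norm
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0)) @ Vt

def rpca(M, lam=None, mu=None, iters=500):
    # Principal Component Pursuit by a basic ADMM: split M = L + S with
    # L low-rank (nuclear norm) and S sparse (l1 norm)
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = mu or m * n / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(iters):
        L = svt(M - S + Y / mu, 1.0 / mu)
        T = M - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0)  # soft threshold
        Y = Y + mu * (M - L - S)
    return L, S

# rank-2 matrix plus 5% gross sparse corruption
rng = np.random.default_rng(0)
m = n = 60
L0 = rng.normal(size=(m, 2)) @ rng.normal(size=(2, n))
S0 = np.zeros((m, n))
mask = rng.random((m, n)) < 0.05
S0[mask] = rng.normal(scale=10, size=mask.sum())
L, S = rpca(L0 + S0)
rel_err = np.linalg.norm(L - L0) / np.linalg.norm(L0)
```

The two prox steps mirror the two regularizers: SVT shrinks singular values (nuclear norm), soft-thresholding shrinks entries (l1 norm).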
|
Y.Y. |
|
03/12/2025, Wed |
Lecture 06: Introduction to Manifold Learning: ISOMAP and LLE (Modified LLE, Hessian LLE, and LTSA) [ slides ]
[Reference]:
- Tenenbaum, Joshua B., Vin de Silva, John C. Langford. A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science, Vol. 290, No. 5500, pp. 2319-2323, 22 Dec 2000. [ DOI: 10.1126/science.290.5500.2319 ]
- Roweis, Sam T. and Lawrence K. Saul. Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science, Vol. 290, Issue 5500, pp. 2323-2326, 22 Dec 2000. [ DOI: 10.1126/science.290.5500.2323 ]
- Balasubramanian, Mukund and Eric L. Schwartz. The Isomap Algorithm and Topological Stability. Science, Vol. 295, Issue 5552, p. 7, 4 Jan 2002. [ DOI: 10.1126/science.295.5552.7a ]
- V. de Silva and J. B. Tenenbaum. Global versus local methods in nonlinear dimensionality reduction. Neural Information Processing Systems 15 (NIPS 2002). [ NIPS 2002 ]
- Donoho, D. & Grimes, C. Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proc. Natl. Acad. Sci. U.S.A., 100:5591, 2003. [ doi: 10.1073/pnas.1031596100 ]
- Zhang, Z. & Wang, J. MLLE: Modified Locally Linear Embedding Using Multiple Weights. [ NIPS 2006 ]
- Zhang, Z. & Zha, H. (2005) Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM Journal on Scientific Computing, 26(1):313-338. [ doi:10.1137/s1064827502419154 ]
[Matlab]:
- IsomapR1 : isomap codes by Tenenbaum, de Silva (isomapII.m with sparsity, fast mex with dijkstra.cpp and fibheap.h)
- lle.m : lle with k-nearest neighbors
- kcenter.m : k-center algorithm to find 'landmarks' in a metric space
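The ISOMAP pipeline in the Matlab codes above can be sketched in Python in a few lines (an independent toy illustration, not a port of isomapII.m): kNN graph, shortest-path geodesic distances, then classical MDS.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap(X, k=8, dim=2):
    # kNN graph -> geodesic (shortest-path) distances -> classical MDS
    n = X.shape[0]
    D = squareform(pdist(X))
    G = np.full((n, n), np.inf)            # inf marks "no edge" for csgraph
    nn = np.argsort(D, axis=1)[:, 1:k + 1]
    for i in range(n):
        G[i, nn[i]] = D[i, nn[i]]
    G = np.minimum(G, G.T)                 # symmetrize the neighborhood graph
    geo = shortest_path(G, method="D", directed=False)   # Dijkstra
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (geo ** 2) @ J          # double centering, as in MDS
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# a planar spiral: a 1-D manifold that linear PCA/MDS cannot unroll
t = np.linspace(0, 3 * np.pi, 300)
X = np.column_stack([t * np.cos(t), t * np.sin(t)])
Y = isomap(X, k=6, dim=1)
```

The 1-D ISOMAP coordinate tracks arc length along the spiral, which is monotone in the parameter t; the choice of k matters, since too large a k creates "shortcut" edges across turns of the spiral (the topological instability raised by Balasubramanian-Schwartz).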
|
Y.Y. |
|
03/19/2025, Wed |
Lecture 07: Manifold Learning II: Laplacian Eigenmap, Diffusion Map, and Stochastic Neighbor Embedding [ slides ]
[Reference]:
- Mikhail Belkin, Partha Niyogi. Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering, Advances in Neural Information Processing Systems (NIPS) 14, 2001, pp. 585-591, MIT Press. [ nips link ]
- R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. PNAS 102 (21):7426-7431, 2005. [doi: 10.1073/pnas.0500334102]
- Nadler, Boaz; Stephane Lafon; Ronald R. Coifman; Ioannis G. Kevrekidis (2005). Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators. Advances in Neural Information Processing Systems (NIPS) 18, 2005. [ .pdf ]
- Coifman, R.R.; S. Lafon. (2006). Diffusion maps. Applied and Computational Harmonic Analysis. 21: 5-30. [ DOI: 10.1016/j.acha.2006.04.006 ]
- Stochastic Neighbor Embedding [ .pdf ]
- Visualizing Data using t-SNE [ .pdf ]
- A paper that relates SNE to Laplacian Eigenmaps [ .pdf ]
- A helpful website: How to use t-SNE effectively? [ link ]
[Matlab]
- Matlab code to compare manifold learning algorithms [ mani.m ] : PCA, MDS, ISOMAP, LLE, Hessian LLE, LTSA, Laplacian, Diffusion (no SNE!)
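Alongside the Matlab comparison code above, here is a minimal Python sketch (a toy illustration, not course code) of the Laplacian eigenmap: heat-kernel weights, then the bottom nontrivial eigenvectors of the normalized graph Laplacian.

```python
import numpy as np

def laplacian_eigenmap(X, sigma=1.0, dim=2):
    # Gaussian heat-kernel weights, then the bottom nontrivial eigenvectors
    # of the symmetric normalized Laplacian, mapped back to
    # random-walk coordinates
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-D2 / (2 * sigma ** 2))
    d = W.sum(axis=1)
    Dinv = np.diag(1 / np.sqrt(d))
    Lsym = np.eye(len(X)) - Dinv @ W @ Dinv
    w, V = np.linalg.eigh(Lsym)
    return Dinv @ V[:, 1:1 + dim]     # skip the trivial eigenvalue-0 vector

# points on a circle: the embedding recovers the angle via a (cos, sin) pair
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta)])
Y = laplacian_eigenmap(X, sigma=0.3, dim=2)
```

The diffusion map of Coifman-Lafon uses the same eigenvectors but weights coordinate k by the eigenvalue power (1 - w_k)^t, encoding the running time t of the diffusion.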
|
Y.Y. |
|
03/26/2025, Wed |
Lecture 08: Random Walk on Graphs and Spectral Graph Theory: Perron-Frobenius (PageRank), Fiedler (Algebraic Connectivity), Cheeger Inequality (Normalized Cut), Lumpability (Spectral Clustering), and Transition Path Theory (Semi-supervised Learning) [ slides ]
[Reference]:
- Amy N. Langville and Carl D. Meyer's book: Google's PageRank and Beyond
- Jim Demmel's courseweb at UC Berkeley for Fiedler Theory and Graph Bipartition: [ link ]
- T. Buehler, M. Hein. Spectral Clustering based on the graph p-Laplacian. Proceedings of the 26th International Conference on Machine Learning (ICML 2009), 81-88.
- James R. Lee, Shayan Oveis Gharan, Luca Trevisan. Multi-way spectral partitioning and higher-order Cheeger inequalities. Proceedings of the forty-fourth annual ACM symposium on Theory of computing (STOC'12), Pages 1117-1130. arXiv:1111.1055.
- Weinan E, Jianfeng Lu, and Yuan Yao. The Landscape of Complex Networks: Critical Nodes and A Hierarchical Decomposition. Methods and Applications of Analysis, special issue in honor of Professor Stanley Osher on his 70th birthday, 20(4):383-404, 2013.
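The Perron-Frobenius part of the lecture can be sketched as PageRank power iteration (a toy illustration on a hypothetical 3-node graph, not from the references above): damping with teleportation makes the chain irreducible and aperiodic, so the stationary distribution exists, is unique, and is the limit of the iteration.

```python
import numpy as np

def pagerank(A, alpha=0.85, tol=1e-10):
    # power iteration on the damped ("teleporting") random walk; by
    # Perron-Frobenius the stationary distribution exists and is unique
    n = A.shape[0]
    out = A.sum(axis=1, keepdims=True)
    # row-stochastic transition matrix; dangling nodes jump uniformly
    P = np.where(out > 0, A / np.where(out > 0, out, 1.0), 1.0 / n)
    pi = np.full(n, 1.0 / n)
    while True:
        new = alpha * (pi @ P) + (1 - alpha) / n
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new

# tiny directed graph: 0 <-> 1, 1 -> 2, 2 -> 0
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [1, 0, 0]], dtype=float)
pi = pagerank(A)
```

The damping factor alpha also bounds the convergence rate: the error contracts by at least alpha per iteration, regardless of the graph's own spectral gap.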
|
Y.Y. |
|