CSIC 5011: Topological and Geometric Data Reduction and Visualization

HKUST

Spring 2019

Course Information

Synopsis

This course is open to graduate students and senior undergraduates in applied mathematics, statistics, and engineering who are interested in learning from data. Students with other backgrounds, such as the life sciences, are also welcome, provided they have sufficient mathematical maturity. The course covers a wide range of topics in geometric data reduction (principal component analysis, manifold learning, etc.) and topological data reduction (clustering, computational homology groups, etc.).
Prerequisites: linear and abstract algebra, basic probability and multivariate statistics, basic stochastic processes (Markov chains), and convex optimization; familiarity with Matlab, R, and/or Python.

Reference

[pdf download]

Instructors:

Yuan YAO

Time and Place:

Wednesday and Friday 3:00-4:20pm, CYT G009A
This term we will be using Piazza for class discussion. The system is highly catered to getting you help fast and efficiently from classmates and myself. Rather than emailing questions to the teaching staff, I encourage you to post your questions on Piazza. If you have any problems or feedback for the developers, email team@piazza.com.
Find our class page at: https://piazza.com/ust.hk/spring2019/csic5011/home

Homework and Projects:

Weekly homework (no grader, but I'll read your submissions and give bonus credits), monthly mini-projects, and a final major project. No final exam.
Email: datascience.hw (add "AT gmail DOT com" afterwards)

Schedule

Date	Topic	Instructor	Scriber
01/30/2019, Wed	Lecture 01: Introduction to Geometric and Topological Data Reduction [ syllabus ]	Y.Y.
02/01/2019, Fri	Seminar: Safety Enhanced Reinforcement Learning [ Speaker ]: Ruohan ZHAN, Stanford University [ Abstract ]: Model-free reinforcement learning (RL) has shown great potential in finding policies maximizing accumulated return in uncertain environments. However, safety guarantees remain hard to derive, preventing the use of RL in many domains. In this work, we propose a model-free algorithm to learn a safety mask in an unknown environment modeled as a Markov decision process (MDP). The safety mask provides a quantitative measure of safety for each state-action pair of the MDP, and is learned using deep reinforcement learning without requiring external knowledge. The safety measure is then used to prevent unsafe actions during later RL algorithm whose objective is to maximize the return. We demonstrate that our method can scale to large domains and our learned mask can enhance safety in both later RL policy learning and deployment processes. This is joint work with Maxime Bouton and Mykel Kochenderfer.	Y.Y.
02/13/2019, Wed	Lecture 02: Principal Component Analysis [ Lecture02.key ] [Reference]: To view the .ipynb files below, you may try [ Jupyter NBViewer ] PCA in IPython Notebook [ pca.ipynb ] [ pca.py ] PCA with logistic regression for digit classification: [ pca_logistic.ipynb ] [ pca_logistic.py ]	Y.Y.
02/15/2019, Fri	Lecture 03: Multidimensional Scaling [ Lecture02.key ] [Reference]: MDS in Python [ scikit-learn MDS ] PCA with logistic regression for digit classification: [ pca_logistic.ipynb ] [ pca_logistic.py ] [Homework 1]: Homework 1 [pdf]. Just for fun, no grading; but I'll read your submissions and give you bonus credits.	Y.Y.
02/20/2019, Wed	Lecture 04: High Dimensional PCA and Random Projections (Chap 2) [ Lecture04.pdf ] [Reference]: Joseph Salmon's lecture on Johnson-Lindenstrauss Theory [ JLlemma.pdf ]	Y.Y.
02/22/2019, Fri	Lecture 05: Compressed Sensing and Random Projections (Chap 2: 4) [Homework 2]: Homework 2 [pdf]. Just for fun, no grading; but I'll read your submissions and give you bonus credits.	Y.Y.
02/27/2019, Wed	Lecture 06: Sample Mean as MLE? James-Stein Estimator and Shrinkages (Chap 3: 1-2)	Y.Y.
03/01/2019, Fri	Lecture 07: Lasso, Nonconvex shrinkage, differential inclusions (Chap 3: 1-2)[ new lecture notes updated ] [Reference]: Comparing Maximum Likelihood Estimator and James-Stein Estimator in R: [ JSE.R ]	Y.Y.
03/06/2019, Wed	Lecture 07: Random Matrix Theory for PCA and Horn's Parallel Analysis (Chap 2: 3) [Reference]: Marcenko-Pastur Law of Wishart matrices in Matlab: [ mp.m ] Horn's Parallel Analysis in R: [ paran.R ] Parallel Analysis in Matlab: [ papca.m ] Parallel Analysis in Python by LI, Zhen: [ paPCA_curve.py ] [ paPCA_image.py ] S&P500 dataset in class: [ snp500.Rda ] [ snp452-data.mat ] [ snp500.txt ] [Johnstone06] High dimensional statistical inference and random matrices, ICM2006. Florent Benaych-Georges and Raj Rao Nadakuditi (2009) The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. [Parallel Analysis: Horn (1965) original paper] [Parallel Analysis: Buja-Eyuboglu (1992) with random permutation] [Raul Rabadan (2018)]: applications of RMT in single cell data analysis [Homework 3]: Homework 3 [pdf]. Just for fun, no grading; but I'll read your submissions and give you bonus credits.	Y.Y.	LI, Zhen
03/08/2019, Fri	Lecture 08: Supervised PCA -- LDA and SIR [ Chapter 1, Section 7 in new update on Mar 1, 2019 ] [Reference]: Dennis Cook, Fisher Lecture: Dimensionality Reduction in Regression. Statistical Science, 22(1):1-26, 2007. Ker-Chau Li, Sliced Inverse Regression for Dimension Reduction . Journal of the American Statistical Association, 86(414):316-327, 1991 Wu, Liang, and Mukherjee. Localized Sliced Inverse Regression. NIPS 2009. [Matlab codes] Jiang B and Liu JS. (2014) Variable selection for general index models via sliced inverse regression. Annals of Statistics, 42:1751-1786. [ R codes ] Wolfgang Hardle and Leopold Simar. Applied Multivariate Statistical Analysis. Chapter 18.3: Sliced Inverse Regression.	Y.Y.
03/13/2019, Wed	Lecture 09: Robust PCA [ Chapter 4, Section 1-4 in new update on Mar 1, 2019 ] [Reference]: You need Matlab CVX optimization toolbox to run the following demo codes. Robust PCA demo: [ testRPCA.m ] Robust PCA via ADMM in Python [ weblink ] Teng ZHANG's Tyler's M-estimator [ Matlab: tyler_m_estimator.m ] Duembgen, Nordhausen and Schuhmacher (2016): R package for M-scatter estimates [ R package fastM ]	Y.Y.
03/15/2019, Fri	Lecture 10: Sparse PCA and MDS with Uncertainty (Chap 4: 5-7) [ new update on Sep 27, 2017 ] [Reference]: You need the Matlab CVX optimization toolbox to run the following demo codes. Sparse PCA demo: [ testSPCA.m ] Sparse PCA in Python [ sklearn ] Sensor Network Localization in Matlab: [ SNLSDP ] [Homework 4]: Homework 4 [pdf]. Just for fun, no grading; but I'll read your submissions and give bonus credits if you submit.	Y.Y.
03/20/2019, Wed	Lecture 11: Project 1 and An Introduction to Libra: Linearized Bregman Algorithms in High Dimensional Statistics [ Reference ]: Project 1 [pdf] . Page 1-17: Introduction to Libra for high dimensional statistics [slides] .	Y.Y.
03/22/2019, Fri	Lecture 12: Differential Inclusion Methods in High Dimensional Statistics [ Reference ]: Page 18-60: how does it work? [slides] .	Y.Y.
03/27/2019, Wed	Lecture 13: Manifold Learning I: ISOMAP and LLE [ slides ] [Reference]: [ISOMAP]: Tenenbaum's website on Science paper with datasets; [LLE]: Roweis' website on Science paper; Zhang, Z. & Wang, J. MLLE: Modified Locally Linear Embedding Using Multiple Weights. [ NIPS 2006 ] Zhang, Z. & Zha, H. (2005) Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM Journal on Scientific Computing. 26 (1): 313-338. [doi:10.1137/s1064827502419154] [Python]: plot_mani_digits.ipynb : demo of digits in class scikit-learn manifold module [Matlab]: IsomapR1 : isomap codes by Tenenbaum, de Silva (isomapII.m with sparsity, fast mex with dijkstra.cpp and fibheap.h) lle.m : lle with k-nearest neighbors kcenter.m : k-center algorithm to find 'landmarks' in a metric space	Y.Y.
03/29/2019, Fri	Lecture 14: Manifold Learning II: Extended LLEs (Chap 5: 3-6) [ slides ] [Homework 5]: Homework 5 [pdf]. Just for fun, no grading. [Reference]: Mikhail Belkin & Partha Niyogi, Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering, Advances in Neural Information Processing Systems (NIPS) 14, 2001, pp. 585-591, MIT Press [nips link] Donoho, D. & Grimes, C. Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proc Natl Acad Sci U S A. 100:5591 (2003). [doi: 10.1073/pnas.1031596100] R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. PNAS 102 (21):7426-7431, 2005 [doi: 10.1073/pnas.0500334102] Nadler, Boaz; Stéphane Lafon; Ronald R. Coifman; Ioannis G. Kevrekidis (2005). "Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker–Planck Operators" (PDF) in Advances in Neural Information Processing Systems (NIPS) 18, 2005. Coifman, R.R.; S. Lafon. (2006). "Diffusion maps". Applied and Computational Harmonic Analysis. 21: 5–30. doi: 10.1016/j.acha.2006.04.006. Stochastic Neighbor Embedding [ .pdf ] Visualizing Data using t-SNE [ .pdf ] A paper that relates SNE to Laplacian Eigenmaps [ .pdf ] A helpful website: How to use t-SNE effectively? [ link ] [Matlab] Matlab code to compare manifold learning algorithms [ mani.m ] : PCA, MDS, ISOMAP, LLE, Hessian LLE, LTSA, Laplacian, Diffusion (no SNE!) [Python]: plot_compare_methods.ipynb : demo in class plot_mani_digits.ipynb : demo of digits in class scikit-learn manifold LLE : PCA/MDS, ISOMAP, LLE/MLLE, Hessian, LTSA, Laplacian (Spectral), t-SNE (no Diffusion) Laurens van der Maaten's website for t-SNE codes	Y.Y.
04/03/2019, Wed	Lecture 15: A Seminar on Robust Estimate and Generative Adversarial Networks (GANs) [ slides ] [Reference]: Chao Gao, Jiyi Liu, Yuan Yao, & Weizhi Zhu, Robust Estimate and Generative Adversarial Networks, ICLR 2019. [arXiv:1810.02030] Chao Gao, Yuan Yao, & Weizhi Zhu, Generative Adversarial Nets for Robust Scatter Estimation: A Proper Scoring Rule Perspective. [ arXiv:1903.01944 ] [Python]: Robust-GAN-Center : robust center (mean) estimate via GANs Robust-GAN-Scatter : robust scatter (covariance) estimate via GANs	Weizhi ZHU
04/10/2019, Wed	Lecture 16: Random Walk on Graphs and Spectral Graph Theory: Perron-Frobenius, Fiedler, and Cheeger Theories [Reference]: Amy N. Langville and Carl D. Meyer's book: Google's PageRank and Beyond Jim Demmel's course website at UC Berkeley for Fiedler Theory and Graph Bipartition: [ link ] T. Buehler, M. Hein. Spectral Clustering based on the graph p-Laplacian. Proceedings of the 26th International Conference on Machine Learning (ICML 2009), 81-88. James R. Lee, Shayan Oveis Gharan, Luca Trevisan. Multi-way spectral partitioning and higher-order Cheeger inequalities. STOC'12: Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, pages 1117-1130. arXiv:1111.1055. [Project 1 Report Repository] GitHub Repository for reports of Project 1 [ GitHub ] 01. HAN, Xu. PCA analysis and prediction on return of SNP500 dataset. [ poster ] [ source (ipynb) ][ peer review (docx) ] 02. LEI, Kang. Characters and Events Analysis of Journey to the West by PCA and SPCA. [ poster ] [ Python PCA ] [ Python SPCA ] [ peer review (docx) ] 03. LUI, Go Nam. Principle Component Analysis on Finance Data. [ poster (pdf) ] [ poster (pptx) ] [ source (zip) ] [ peer review (docx) ] 04. SHEN, Xinwei and YANG, Yunfei. Dimension reduction methods to improve image classification. [ report ] [ poster revision ] [ source ] [ peer review (docx) ] [ rebuttal (docx) ] 05. SUN, Jing and LUO, Shuang. Principal Component Analysis of Crime Data in USA. [ report ] [ source ] [ peer review (docx) ] 06. WANG, Meilan and LIU, Di. Finance Data PCA, Parallel Analysis. [ poster (pdf) ] [ poster (pptx) ] [ source (matlab) ] [ source (ipynb) ] [ peer review (docx) ] 07. Ziming WU, Feng HAN, and Song LIU. Whether and why are people feeling happy? Mining Affective Events Based on Text-based Information. [ poster ] [ source ] [ peer review (docx) ] 08. YU, Zhijie. Dream of Red Mansion Analysis. [ poster ] [ source (ipynb) ] [ peer review (docx) ] 09. CHAN, Lok Chun. Realization of Recent Trends in Machine Learning Community in Recent Years by Pattern Mining of NIPS Words. [ poster (pdf) ] [ poster (pptx) ] [ source (ipynb) ] [ peer review (docx) ] 10. LIANG, Zhicong. Exploring High Dimension Data with MDS, PCA, AutoEncoder and VAE. [ poster (pdf) ] [ source ] 11. CHENG, Wei. Dimension Reduction Visualization and Classification on Hand Writing Data. [ poster (pdf) ] [ Github link ] [ Project 1 Open Peer Review ] Description Deadline: 11:59pm April 16 2019 [ Project 1 Rebuttal ] Description Deadline: 11:59pm April 21 2019	Y.Y.
04/12/2019, Fri	Lecture 17: Random Walk on Graphs: Lumpability vs. MNcut and Transition Path Theory vs. Semi-Supervised Learning. (Chapter 6.4-6.7 and Chapter 8.4)	Y.Y.
04/17/2019, Wed	Lecture 18: From Graphs to Complexes: Combinatorial Hodge Laplacians (Chapter 9.1)	Y.Y.
04/24/2019, Wed	Lecture 19: Applied Hodge Theory: Social Choice, Crowdsourced Ranking, and Hodge Decomposition [ slides ] [ Reference ]: Statistical Ranking and Combinatorial Hodge Theory. Xiaoye Jiang, Lek-Heng Lim, Yuan Yao and Yinyu Ye. Mathematical Programming, Volume 127, Number 1, Pages 203-244, 2011. [pdf][ arxiv.org/abs/0811.1067][ Matlab Codes] Flows and Decompositions of Games: Harmonic and Potential Games Ozan Candogan, Ishai Menache, Asuman Ozdaglar, and Pablo A. Parrilo Mathematics of Operations Research, 36(3): 474 - 503, 2011 [arXiv.org/abs/1005.2405][ doi:10.1287/moor.1110.0500 ] HodgeRank on Random Graphs for Subjective Video Quality Assessment. Qianqian Xu, Qingming Huang, Tingting Jiang, Bowei Yan, Weisi Lin, and Yuan Yao. IEEE Transactions on Multimedia, 14(3):844-857, 2012 [pdf][ Matlab codes in zip ] Robust Evaluation for Quality of Experience in Crowdsourcing. Qianqian Xu, Jiechao Xiong, Qingming Huang, and Yuan Yao ACM Multimedia 2013. [pdf] Online HodgeRank on Random Graphs for Crowdsourceable QoE Evaluation. Qianqian Xu, Jiechao Xiong, Qingming Huang, and Yuan Yao IEEE Transactions on Multimedia, 16(2):373-386, Feb. 2014. [pdf] Analysis of Crowdsourced Sampling Strategies for HodgeRank with Sparse Random Graphs Braxton Osting, Jiechao Xiong, Qianqian Xu, and Yuan Yao Applied and Computational Harmonic Analysis, 41 (2): 540-560, 2016 [ arXiv:1503.00164 ] [ ACHA online ] [Matlab codes to reproduce our results] False Discovery Rate Control and Statistical Quality Assessment of Annotators in Crowdsourced Ranking Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Yuan Yao Proceedings of The 33rd International Conference on Machine Learning (ICML), New York, June 19-24, 2016. [ arXiv:1605.05860 ] [ pdf ] [ supplementary ] Parsimonious Mixed-Effects HodgeRank for Crowdsourced Preference Aggregation Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Yuan Yao ACM Multimedia Conference (ACMMM), Amsterdam, Netherlands, October 15-19, 2016. [ arXiv:1607.03401 ] [ pdf ] HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation Qianqian Xu, Jiechao Xiong, Xi Chen, Qingming Huang, Yuan Yao The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018. [ arXiv:1711.05957 ] [ Matlab Source Codes ] From Social to Individuals: a Parsimonious Path of Multi-level Models for Crowdsourced Preference Aggregation Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan Yao IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(4):844-856, 2019. Extended from MM'16 in [ arXiv:1607.03401 ]. [ arXiv:1804.11177 ] [ doi: 10.1109/TPAMI.2018.2817205 ][ GitHub source] Professor Don Saari: [ UCI homepage ] [ Book Info: Disposing Dictators, Demystifying Voting Paradoxes ] [ Amazon link ]	Y.Y.
04/26/2019, Fri	Lecture 20: Introduction to Topological Data Analysis: Reeb Graph, Mapper, and Persistent Homology [ slides ] [Reference]: Topological Methods for Exploring Low-density States in Biomolecular Folding Pathways. Yuan Yao, Jian Sun, Xuhui Huang, Gregory Bowman, Gurjeet Singh, Michael Lesnick, Vijay Pande, Leonidas Guibas and Gunnar Carlsson. J. Chem. Phys. 130, 144115 (2009). [pdf][Online Publication][SimTK Link: Data and Mapper Matlab Codes] [Selected by Virtual Journal of Biological Physics Research, 04/15/2009]. Structural insight into RNA hairpin folding intermediates. Bowman, Gregory R., Xuhui Huang, Yuan Yao, Jian Sun, Gunnar Carlsson, Leonidas Guibas and Vijay Pande. Journal of the American Chemical Society, 2008, 130 (30): 9676-9678. [link] Single-cell topological RNA-seq analysis reveals insights into cellular differentiation and development. Abbas H Rizvi, Pablo G Camara, Elena K Kandror, Thomas J Roberts, Ira Schieren, Tom Maniatis & Raul Rabadan. Nature Biotechnology. 2017 May. doi:10.1038/nbt.3854 Spatiotemporal genomic architecture informs precision oncology in glioblastoma. Lee JK, Wang J, Sa JK, Ladewig E, Lee HO, Lee IH, Kang HJ, Rosenbloom DS, Camara PG, Liu Z, van Nieuwenhuizen P, Jung SW, Choi SW, Kim J, Chen A, Kim KT, Shin S, Seo YJ, Oh JM, Shin YJ, Park CK, Kong DS, Seol HJ, Blumberg A, Lee JI, Iavarone A, Park WY, Rabadan R, Nam DH. Nat Genet. 2017 Apr. doi: 10.1038/ng.3806. A Python Implementation of Mapper [ sakmapper ] in single cell data analysis. Single Cell TDA [ scTDA ] with [ tutorial in html ] A Java package for persistent homology and barcodes: Javaplex Tutorial. Guo-Wei Wei, Persistent Homology Analysis of Biomolecular Data, SIAM News 2017 Topological Data Analysis Generates High-Resolution, Genome-wide Maps of Human Recombination. Pablo G. Camara, Daniel I.S. Rosenbloom, Kevin J. Emmett, Arnold J. Levine, Raul Rabadan. Cell Systems. 2016 June. doi: 10.1016/j.cels.2016.05.008. Topology of viral evolution. Chan JM, Carlsson G, Rabadan R. Proc Natl Acad Sci USA 2013 Oct 29. doi: 10.1073/pnas.1313480110. Robert Ghrist's monograph on applied topology: Elementary Applied Topology	Y.Y.
05/03/2019, Fri	Lecture 21: Final Project. [ project2.pdf ] [ Project 2 Report Repository ] GitHub Repository for reports of Project 2 [ GitHub ] 01. KANG, Lei. Order the faces by Diffusion Map, ISOMAP and LLE. [ poster ] [ source (.py) ] [ presentation video ] [ peer review ] 02. LIANG, Zhicong. Finding Trend in Stock Market with RobustPCA. [ poster (pdf) ] [ slides (pptx) ] [ presentation video ] [ peer review ] 03. LIU, Di, Meilan WANG, Xu HAN. [ Best Writing Award! ][ Best Presentation Award! ] Order the Faces via Manifold Learning. [ report (pdf) ] [ slides (pptx) ] [ presentation video ] [ source ] [ peer review ] 04. LUI, Go Nam. Human age ranking from pairwise comparison data via HodgeRank. [ poster (pdf) ] [ slides (pptx) ] [ presentation video ] [ peer review ] 05. SHEN, Xinwei and YANG, Yunfei. [ Best Creativity Award!! ] Representation learning on gene expression data. [ report (pdf) ] [ slides (pdf) ] [ source ] [ presentation video ] [ peer review ] 06. SUN, Jing, Shuang LUO, Zhijie YU. Dimensionality Reduction of Face Order ProblemUsing Nonlinear Embedding Methods. [ poster (pdf) ] [ slides (pptx) ] [ presentation video ] [ source ] [ peer review ] 07. WU, Ziming, Feng HAN, Song LIU. Whether and why are people feeling happy? Multi-Task Mining Based on Text-based Information. [ poster (pdf) ] [ slides (pdf) ] [ presentation video ] [ source ] [ peer review ] 08. CHAN, Lok Chun. Human Age Ranking Using Hodge Rank. [ poster (pdf) ] [ poster (pptx) ] [ presentation video ] [ source ] [ peer review ] 09. CHENG, Wei. Spectral Clustering and Transition Paths Analysis of Karate Club Network. [ poster (pdf) ] [ slides (pdf) ] [ source ] [ presentation video ] [ peer review ]	Y.Y.
05/08/2019, Wed	Seminar: Learning Deep Generative Models via Variational Gradient Flow and Its Applications [ Invited Speaker ]: Prof. Can Yang, Department of Mathematics, The Hong Kong University of Science and Technology [ Abstract ]: Learning the generative model, i.e., the underlying data generating distribution, based on large amounts of data is one of the fundamental tasks in machine learning and statistics. Recent progress in deep generative models has provided novel techniques for unsupervised and semi-supervised learning, with broad applications ranging from image synthesis, semantic image editing, and image-to-image translation to low-level image processing. However, statistical understanding of deep generative models is still lacking, e.g., why the logD trick works well in training generative adversarial networks (GAN). In this talk, we introduce a general framework, variational gradient flow (VGrow), to learn a deep generative model to sample from the target distribution by combining the strengths of variational gradient flow on probability space, particle optimization and deep neural networks. The proposed framework is applied to minimize the f-divergence between the evolving distribution and the target distribution. We prove that the particles driven by VGrow are guaranteed to converge to the target distribution asymptotically. Connections of our proposed VGrow method with other popular methods, such as VAE, GAN and flow-based methods, have been established in this framework, yielding new insights into deep generative learning. We also evaluated several commonly used f-divergences, including the Kullback-Leibler, Jensen-Shannon, and Jeffrey divergences, as well as our newly discovered “logD” divergence, which serves as the objective function of the logD-trick GAN. Besides the above theoretical understanding, we emphasize the practical issues in training GAN. Through a systematic design of the generator and the discriminator, much of the effort spent on parameter tuning can be avoided. Using a pre-defined network structure rather than case-by-case parameter tuning, VGrow can generate high-fidelity images in a stable and efficient manner. Its results on benchmark data sets (e.g., CIFAR10, CelebA) show its competitive performance with state-of-the-art GANs. We have also applied VGrow to the portrait data from The Wikipedia Art Project, generating realistic portraits without extra editing. This is joint work with Yuan Gao, Yuling Jiao, Yao Wang, Gefei Wang, Yang Wang and Shunkang Zhang. [ slides ] [ GitHub source ] [ arXiv:1901.08469 ]	Can Yang

Datasets (to-be-updated)

[Animal Sleep Data] Animal species sleeping hours vs. other features

[Anzhen Heart Data] Heart Operation Effect Prediction, provided by Dr. Jinwen Wang, Anzhen Hospital

[Beer Data] 877 beers dataset , provided by Mr. Richard Sun, Shanghai

[Crime Data] Crime rates in 59 US cities during 1970-1992

[Real-Time-Bidding Algorithm Competition Data] Contest Website

[Dream of Red Mansion Character-Event Matrix] a 376-by-475 matrix (374-by-475 updated by WAN, Mengting) for character-event appearance in A Dream of Red Mansion (Xueqin Cao) [ Dataset in GitHub ] [374 Characters dream.RData (for R load)] [dream.Rd (for R manual)] [HongLouMeng374.txt] [HongLouMeng376.csv] [.mat] [readme.m]

[Journey to the West] character-scene occurrence matrices for 100 chapters [ Dataset in GitHub ] [data in RData] [data in Matlab (302-by-408 matrix)]

chap001-005	chap006-009	chap010-013	chap014-017	chap018-021	chap022-025
chap026-029	chap030-033	chap034-037	chap038-041	chap042-045	chap046-049
chap050-053	chap054-057	chap058-061	chap062-065	chap066-069	chap070-073
chap074-077	chap078-081	chap082-085	chap086-088	chap089-091	chap092-094
chap095-097	chap098-100	All in TXT	readData.m

[Keywords Pricing] Keywords and profit index in paid search advertising, by Hansheng Wang (Guanghua, PKU). [sample file] [readme.txt] [data in csv]

[Radon Data] Radon measurements of 12,687 houses in US

[Wells Data] Switch unsafe wells for arsenic pollution in Bangladesh

to-be-done...

by YAO, Yuan.