MATH - BIOINF - STATS 547: Mathematics of Data
Instructor: Prof. INDIKA Rajapakse (email@example.com)
Teaching Assistant: COOPER Stansbury (firstname.lastname@example.org)
Class Time: Tuesday and Thursday, 2:30 - 4:00PM in EH B844: https://meet.google.com/dnm-zipr-tsb
Recorded Lectures: Available on Canvas only
Wednesday and Friday (Indika), 4:00 - 5:00PM: https://meet.google.com/dnm-zipr-tsb or in person after class
Monday (Cooper) from 2:00 - 3:00PM: meet.google.com/ivh-fege-xby
Timeline: Link to Timeline
Some ideas on Simplicity, Rigor, and Magic!
Digital Library: An unofficial digital library I maintain that contains books and papers on topics related to my research and teaching
Most Common Optimization Strategies
Great book with codes: Cleve Moler . Numerical computing with MATLAB. Society for Industrial and Applied Mathematics, (2004)
Great Blogs! I visit these from time to time, and they have many great articles
Final Class Day Resources:
This section includes assignments, solutions, and helpful resources.
Problem Set 2: Due 02/14/2022 on Canvas (Due date extended from 02/10)
A Tutorial on Spectral Clustering (Excellent Review!)
Paper: Surana A, Chen C, Rajapakse I. "Hypergraph Similarity Measures." arXiv preprint arXiv:2106.08206 (2021)
Problem Set 3: Due 02/25/2022
Problem Set 4: Due 03/24/2022
Final Project: Due 04/28/2022
Final Project Reviews: Due 04/14/2022
Problem of the Day
Some recordings are incomplete, but the class notes contain all content presented, so please use them together.
NOTES BY SUBJECT
This section contains class notes as well as helpful resources, including classic papers, grouped by topic. Please browse this section, and keep it in mind in your future work beyond this course.
Guest Lecture: Dr. Cleve Moler
Moler C. Professor SVD. The MathWorks News & Notes. 2006.
Lieberman-Aiden, Erez, ..., Groudine Mark, ..., Lander Eric. "Comprehensive mapping of long-range interactions reveals folding principles of the human genome." Science 326.5950 (2009): 289-293.
Martin CD, Porter MA. The extraordinary SVD. The American Mathematical Monthly. 2012 Dec 1;119(10):838-51. (Excellent Review!)
Eckhart-Young Theorem and Low Rank Approximations
Proof (Book: Golub and Van Loan, Matrix computations )
Udell, Madeleine, and Alex Townsend. "Why are big data matrices approximately low rank?." SIAM Journal on Mathematics of Data Science 1.1 (2019): 144-160.
Turk, Matthew, and Alex Pentland. "Eigenfaces for recognition." Journal of cognitive neuroscience 3.1 (1991): 71-86. (Classic! just browse)
Kobak, Dmitry, and George C. Linderman. "Initialization is critical for preserving global data structure in both t-SNE and UMAP." Nature Biotechnology (2021): 1-2.
McInnes, Leland, John Healy, and James Melville. "Umap: Uniform manifold approximation and projection for dimension reduction." arXiv preprint arXiv:1802.03426 (2018).
Wang Y, Huang H, Rudin C, Shaposhnik Y. Understanding how dimension reduction tools work: an empirical approach to deciphering t-SNE, UMAP, TriMAP, and PaCMAP for data visualization. arXiv preprint arXiv:2012.04456. 2020 Dec 8.
Remark: This is a really nice review in general, but the most relevant parts are pages 9 and 10
Van der Maaten, Laurens, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of machine learning research 9.11 (2008).
Ng, Andrew Y., Michael I. Jordan, and Yair Weiss. "On spectral clustering: Analysis and an algorithm." Advances in neural information processing systems 2 (2002): 849-856.
Belkin, Mikhail, and Partha Niyogi. "Laplacian eigenmaps and spectral techniques for embedding and clustering." Advances in neural information processing systems. 2002.
Von Luxburg, Ulrike. "A tutorial on spectral clustering." Statistics and computing 17.4 (2007): 395-416. (Excellent Review!)
Cheeger J. A lower bound for the smallest eigenvalue of the Laplacian. In Problems in analysis 2015 Mar 8 (pp. 195-200). Princeton University Press.
The Turing System
Turing, Alan. "The chemical basis of morphogenesis." Philosophical Transactions of the Royal Society of London B. 237 (1952): 641
Rajapakse I, and Smale S. "Emergence of Function from Coordinated Cells in a Tissue." Proceedings of the National Academy of Sciences 114.7 (2017): 1462-1467.
Dynamic Mode Decomposition (DMD)
Kutz, J. Nathan, et al. Dynamic mode decomposition: data-driven modeling of complex systems. Society for Industrial and Applied Mathematics, 2016.
Chapter 1: Dynamic Mode Decomposition: An Introduction
Hirsh, Seth M., et al. "Centering data improves the dynamic mode decomposition." SIAM Journal on Applied Dynamical Systems 19.3 (2020): 1920-1955.
T. Askham, P. Zheng, A. Aravkin, and J. N. Kutz, “Robust and Scalable Methods for the Dynamic Mode Decomposition,” SIAM J. Appl. Dyn. Syst., vol. 21, no. 1, pp. 60–79, Mar. 2022
P. J. Baddoo, B. Herrmann, B. J. McKeon, J. N. Kutz, and S. L. Brunton, “Physics-informed dynamic mode decomposition (piDMD),” Dec. 2021
Z. Bai, E. Kaiser, J. L. Proctor, J. N. Kutz, and S. L. Brunton, “Dynamic Mode Decomposition for Compressive System Identification,” AIAA Journal, vol. 58, no. 2, pp. 561–574, 2020
J. J. Bramburger, D. Dylewsky, and J. N. Kutz, “Sparse identification of slow timescale dynamics,” Phys. Rev. E, vol. 102, no. 2, p. 022204, Aug. 2020
S. L. Brunton and J. N. Kutz, “Methods for data-driven multiscale model discovery for materials,” J. Phys. Mater., vol. 2, no. 4, p. 044002, Jul. 2019
C. W. Curtis and D. J. Alford-Lago, “Dynamic-mode decomposition and optimal prediction,” Phys. Rev. E, vol. 103, no. 1, p. 012201, Jan. 2021
U. Fasel, E. Kaiser, J. N. Kutz, B. W. Brunton, and S. L. Brunton, “SINDy with Control: A Tutorial,” in 2021 60th IEEE Conference on Decision and Control (CDC), Dec. 2021
M. R. Jovanović, P. J. Schmid, and J. W. Nichols, “Sparsity-promoting dynamic mode decomposition,” Physics of Fluids, vol. 26, no. 2, p. 024103, Feb. 2014
N. M. M. Kalimullah, A. Shelke, and A. Habib, “Multiresolution Dynamic Mode Decomposition (mrDMD) of Elastic Waves for Damage Localisation in Piezoelectric Ceramic,” IEEE Access, vol. 9, pp. 120512–120524, 2021
S. Klus, P. Gelß, S. Peitz, and C. Schütte, “Tensor-based dynamic mode decomposition,” Nonlinearity, vol. 31, no. 7, pp. 3359–3380, Jun. 2018
J. N. Kutz, X. Fu, S. L. Brunton, and N. B. Erichson, “Multi-resolution Dynamic Mode Decomposition for Foreground/Background Separation and Object Tracking,” in 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), Dec. 2015
J. N. Kutz, X. Fu, and S. L. Brunton, “Multiresolution Dynamic Mode Decomposition,” SIAM J. Appl. Dyn. Syst., vol. 15, no. 2, pp. 713–735, Jan. 2016
S. Le Clainche and J. M. Vega, “Higher Order Dynamic Mode Decomposition,” SIAM J. Appl. Dyn. Syst., vol. 16, no. 2, pp. 882–925, Jan. 2017
I. Mezić, “Spectral Properties of Dynamical Systems, Model Reduction and Decompositions,” Nonlinear Dyn, vol. 41, no. 1, pp. 309–325, Aug. 2005
P. J. Schmid, “Dynamic mode decomposition of numerical and experimental data,” Journal of Fluid Mechanics, vol. 656, pp. 5–28, Aug. 2010
J. H. Tu, C. W. Rowley, D. M. Luchtenburg, S. L. Brunton, and J. N. Kutz, “On Dynamic Mode Decomposition: Theory and Applications,” Journal of Computational Dynamics, vol. 1, no. 2, pp. 391–421, 2014
Mezic I. Koopman operator, geometry, and learning. arXiv preprint arXiv:2010.05377. 2020 Oct 12.
Champion K, Lusch B, Kutz JN, Brunton SL. "Data-driven discovery of coordinates and governing equations." Proceedings of the National Academy of Sciences. 2019 Nov 5;116(45):22445-51.
Surana A. Koopman operator framework for time series modeling and analysis. Journal of Nonlinear Science. 2020 Oct;30(5):1973-2006.
Video: Koopman Operator Theory for Dynamical Systems, Control and Data Analytics by Prof. Igor Mezic
Data Guided Control (DGC)
Ronquist S, Patterson G, Muir LA, Lindsly S, Chen H, Brown M, Wicha M, Bloch A, Brockett R and Rajapakse I. "Algorithm for Cellular Reprogramming." Proceedings of the National Academy of Sciences 114.45 (2017): 11832-11837.
Moore B. Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE transactions on automatic control. 1981 Feb;26(1):17-32. (Amazing Paper!)
Bryan, Kurt, and Tanya Leise. "The $25,000,000,000 eigenvector: The linear algebra behind Google." SIAM review 48.3 (2006): 569-581.
Causal Discovery from Data: Dr. David Heckerman
Matrix Completion and CVX
Candès, Emmanuel J., et al. "Robust principal component analysis?." Journal of the ACM (JACM) 58.3 (2011): 1-37.
Additional Slides: AFOSR 2021 slides
Hypergraph Visualization: PAOHvis
Eldén, Lars. Matrix methods in data mining and pattern recognition. Society for Industrial and Applied Mathematics, 2007.
Kolda, Tamara G., and Brett W. Bader. "Tensor decompositions and applications." SIAM review 51.3 (2009): 455-500. (Excellent Review!)
Chen C, Rajapakse I. "Tensor Entropy for Uniform Hypergraphs." IEEE Transactions on Network Science and Engineering 7.4 (2020): 2889-2900.
Chen C, Surana A, Bloch A, Rajapakse I. "Controllability of Hypergraphs." IEEE Transactions on Network Science and Engineering, Accepted , 2021 Mar 23(2021).
Benson, Austin R., David F. Gleich, and Desmond J. Higham. "Higher-order Network Analysis Takes Off, Fueled by Classical Ideas and New Data." arXiv preprint arXiv:2103.05031 (2021). (Nice Read!)
Williams, Alex H., et al. "Unsupervised discovery of demixed, low-dimensional neural dynamics across multiple timescales through tensor component analysis." Neuron 98.6 (2018): 1099-1115.
Wolf, Michael M., Alicia M. Klinvex, and Daniel M. Dunlavy. "Advantages to modeling relational data using hypergraphs versus graphs." 2016 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2016.
Zhou Y, Rathore A, Purvine E, Wang B. Topological Simplifications of Hypergraphs. arXiv preprint arXiv:2104.11214. 2021 Apr 22.
Lee D, Seung HS. Algorithms for non-negative matrix factorization. Advances in neural information processing systems. 2000;13.
Lee, Daniel D., and H. Sebastian Seung. "Learning the parts of objects by non-negative matrix factorization." Nature 401.6755 (1999): 788-791.
Cichocki A, Zdunek R, Phan AH, Amari SI. Nonnegative matrix and tensor factorizations: applications to exploratory multi-way data analysis and blind source separation. John Wiley & Sons; 2009 Jul 10.
MATLAB Image Decomposition Examples
Topological Data Analysis
A Taste of Topology: Book Chapter: Charles Pugh. Real mathematical analysis. (Excellent Exposition!)
Introduction: These slides deck includes slides from my good friend Yuan Yao
Austin Benson Datasets (Hypergraph Data)
Metabolite (Hypergraph Data)
COVID data: Slides from Dr. Dennis Chao from Gates Foundation
I will add to this list throughout the semester
Donoho D. "50 years of data science." Journal of Computational and Graphical Statistics. 2017 Oct 2;26(4):745-66.
Rajapakse, Indika. "Conversation with Dr. Steve Smale and Dr. Lee Hartwell." NOTICES OF THE AMERICAN MATHEMATICAL SOCIETY 68, no. 9.
Smale, Steve. "Mathematical problems for the next century." The mathematical intelligencer 20.2 (1998): 7-15.
Kolda T. Mathematics: The Tao of Data Science. (2020).