ORIE 6340 Mathematics of Data Science
Announcements
Instructor Information
instructor: Damek Davis
office hours: M 1:30PM-2:30PM, and by appointment
office: Rhodes Hall 218
email: dsd95 at cornell.edu
teaching assistant: Mateo Diaz
office hours: W 4-5 PM
email: md825 at cornell.edu
Ed Discussions: See canvas
Meeting Times and Location
lecture time: Monday and Wednesday 11:25am - 12:40pm
lecture location: Zoom (see canvas for links)
Course Description
This course is an introduction to an emerging research area broadly described as “Math of Data Science.” This area is highly interdisciplinary, so acquiring the tools necessary to participate is usually an overwhelming and unsystematic process. ORIE 6340 is an attempt to overcome the current state of affairs.
The topics of the course will include:
Concentration of measure phenomena for random vectors and matrices (e.g., sub-Gaussian vectors; McDiarmid's inequality; Lipschitz functions; empirical processes; Rademacher complexity);
Estimation in high dimensions:
Convex Relaxations and Spectral Methods (e.g., SDPs; stochastic block model; max cut; compressive sensing)
Direct nonconvex optimization methods (e.g., first-order methods; low-rank matrix estimation: matrix sensing and completion)
Don't let the outline fool you: this is a lot of material. Much of the course will be based on the excellent lecture notes of Bandeira-Singer-Strohmer. Throughout the semester, I will augment these notes with alternative readings (research papers/textbooks) that I find useful (see Resources below). Depending on how quickly we cover the material, we will transition to current research topics as we progress through the course.
Resources
I will assume working knowledge of linear algebra, probability, optimization, and algorithms. I will review necessary facts from optimization and probability, but the more you know about these topics, the better prepared you will be. To that end, you might make use of the following textbooks.
Requirements and Grading
Grading Component: The grade will be based on two components:
(40%) There will be approximately three homework assignments (to be uploaded).
(60%) There will be a final project (completed individually or in groups of two), which may either be a literature review or a research project based on topics similar to those mentioned in the course. Ideally, projects should be highly correlated with your own research interests.
Initial Project Proposal: Due Wednesday March 13th
Final Report: Due Monday May 6th
Presentation: Last Week of Class
Collaboration
Cornell’s Code of Academic Integrity can be found at cuinfo.cornell.edu/Academic/AIC.html.
You may work together on problem sets, but you must write up your own solutions AND acknowledge those with whom you discussed the problem. You must also cite any resources which helped you obtain your solutions.
Problem Sets
Lectures