Data Science is often viewed as the confluence of (1) Computer and Information Sciences, (2) Statistical Sciences, and (3) Domain Expertise. These three pillars are not symmetric: the first two together represent the core methodologies and techniques used in Data Science, while the third pillar is the application domain to which these methodologies are applied. In this program, core data science training focuses on the first two pillars, along with practice in applying these skills to problems in application domains.
We characterize the required Data Science skills in two categories: statistical skills, such as those taught by the Statistics and Biostatistics departments, and computational skills, such as those taught by the Computer Science and Engineering Division and the School of Information. The program is designed so that every student receives balanced training in both areas. To create an academic plan that achieves this balance, and to foster a greater sense of shared community, we do not intend to offer any sub-plans or tracks within the proposed degree program. Rather, we expect every graduate of this program to understand data representation and analysis at an advanced level.
With the MS in Data Science, all students will be able to:
- identify relevant datasets
- apply the appropriate statistical and computational tools to the dataset to answer questions posed by individuals, organizations or governmental agencies
- design and evaluate analytical procedures appropriate to the data
- implement these procedures efficiently over large heterogeneous datasets in a multi-computer environment
Prerequisites
Our diverse community of graduate students comes from many different countries and many undergraduate majors, including statistics, mathematics, computer science, physics, engineering, information, and data science. While a Data Science undergraduate major is not required, it is expected that applicants will have at least the following background before they join:
- 2 semesters of college calculus
- 1 semester of linear or matrix algebra
- 1 introduction to computing course
MDS Program Curriculum
A downloadable PDF version of the Master’s of Data Science (MDS) program curriculum is available: MDS Program Guide
You can review the University of Michigan’s course offerings, regardless of term, in the LSA Graduate Course Catalog, and you can search for more details about each course on Atlas.
Students must take the following core courses (unless waived by the course review process, to be determined after matriculation):
MATH 403: Introduction to Discrete Mathematics (First Fall Semester)
EECS 402: Programming for Scientists and Engineers (First Fall Semester)
EECS 403: Graduate Foundations of Data Structures and Algorithms (Offered only in the Fall semester; Prerequisites: MATH 403 & EECS 402)
1 of the following:
- BIOSTAT 601: Probability and Distribution Theory
- MATH/STATS 425: Introduction to Probability
- STATS 510: Probability and Distribution Theory
1 of the following:
- BIOSTAT 602: Biostatistical Inference
- STATS 426: Introduction to Theoretical Statistics
- STATS 511: Statistical Inference
Expertise in Data Management and Manipulation
1 of the following:
- EECS 484: Database Management Systems
- CSE 584: Advanced Database Systems
1 of the following:
- EECS 485: Web Systems (available to MDS students in Spring term only)
- EECS 486: Information Retrieval and Web Search
- CSE 549/SI 650: Information Retrieval
- SI 618: Data Manipulation and Analysis
- STATS 507: Data Science Analytics using Python
Expertise in Data Science Techniques
1 of the following:
- BIOSTAT 650: Applied Statistics I: Linear Regression
- STATS 500: Statistical Learning I: Linear Regression
- STATS 513: Regression and Data Analysis
1 of the following:
- DATASCI 415: Data Mining and Statistical Learning
- STATS 503: Statistical Learning II: Multivariate Analysis
- EECS 545: Machine Learning (CSE)
- EECS 553: Machine Learning (ECE)
- EECS 476: Data Mining
- CSE 576: Advanced Data Mining
- SI 670: Applied Machine Learning
- SI 671: Data Mining: Methods and Applications
- BIOSTAT 626: Machine Learning for Health Sciences
Capstone
* Please refer to the MDS Capstone Guidelines for details.
- STATS 504: Practice and Communication in Applied Statistics
- STATS 750: Directed Reading
- CSE 599: Directed Study
- SI 691: Independent Study
- SI 699-xx5: Big Data Analytics
- BIOSTAT 610: Reading in Biostatistics
- BIOSTAT 698: Modern Statistical Methods in Epidemiologic Studies
- BIOSTAT 699: Analysis of Biostatistical Investigations
Electives
Select 1 course of at least 3 credits from each group. Electives must include at least 2 advanced graduate courses (500 level or above in LSA, UMSI, and CoE, or 600 level or above in SPH). Specific sections of CSE 598 Special Topics are approved each semester according to their category.
Principles of Data Science
BIOSTAT 601 (Probability and Distribution Theory) | BIOSTAT 602 (Biostatistical Inference) | BIOSTAT 617 (Sample Design) | BIOSTAT 626 (Machine Learning Methods) | BIOSTAT 680 (Stochastic Processes) | BIOSTAT 682 (Bayesian Analysis) | ECE 501 (Probability and Random Processes) | ECE 502 (Stochastic Processes) | EECS 545 (Machine Learning (CSE)) | ECE 551 (Matrix Methods for Signal Processing, Data Analysis, and Machine Learning) | EECS 553 (Machine Learning (ECE)) | ECE 559 (Optimization Methods for SIPML) | ECE 564 (Estimation, Filtering, and Detection) | SI 670 (Applied Machine Learning) | DATASCI 451 (Introduction to Bayesian Data Analysis) | STATS 470 (Introduction to Design of Experiments) | STATS 510 (Probability and Distribution Theory) | STATS 511 (Statistical Inference) | STATS 551 (Bayesian Modeling and Computation)
Data Analysis
BIOSTAT 651 (Generalized Linear Models) | BIOSTAT 653 (Longitudinal Analysis) | BIOSTAT 666 (Statistical Models and Numerical Methods in Human Genetics) | BIOSTAT 675 (Survival Time Analysis) | BIOSTAT 685/STATS 560 (Non-Parametric Statistics) | BIOSTAT 695 (Categorical Data) | BIOSTAT 696 (Spatial Statistics) | ECE 556 (Image Processing) | STATS 414 (Topics in Applied Data Analysis) | STATS 501 (Applied Statistics II) | STATS 503 (Statistical Learning II: Multivariate Analysis) | STATS 509 (Statistics for Financial Data) | STATS 531 (Analysis of Time Series) | STATS 600 (Linear Models) | STATS 601 (Analysis of Multivariate and Categorical Data) | STATS 605 (Advanced Topics in Modeling and Data Analysis) | STATS 700 (Topics in Applied Statistics)
Computation
BIOSTAT 615 (Statistical Computing) | BIOSTAT 625 (Computing with Big Data) | EECS 481 (Software Engineering) | EECS 485 (Web Systems) | EECS 486 (Information Retrieval and Web Search) | EECS 504 (Computer Vision) | EECS 542 (Advanced Topics in Computer Vision) | CSE 548/SI 649 (Information Visualization) | CSE 549/SI 650 (Information Retrieval) | CSE 572 (Randomness and Computation) | CSE 586 (Design and Analysis of Algorithms) | CSE 587 (Parallel Computing) | CSE 592 (Artificial Intelligence) | CSE 595/SI 561 (Natural Language Processing) | SI 608 (Networks) | SI 618 (Data Manipulation and Analysis) | SI 630 (Natural Language Processing: Algorithms and People) | SI 664 (Database Application Design) | SI 671 (Data Mining: Methods and Applications) | DATASCI 406 (Computational Methods in Statistics and Data Science) | STATS 506 (Computational Methods and Tools in Statistics) | STATS 507 (Data Science Analytics using Python) | STATS 551 (Bayesian Modeling and Computation) | STATS 606 (Computation and Optimization Methods in Statistics)
Program Notes
- The cumulative GPA must be B (3.0) or better, as required by Rackham Graduate School.
- At least 25 units of graduate-level coursework (from above requirements) must be completed during residency in the Data Science program. Of these 25, 18 must be at the advanced graduate level (500 level or above in LSA, UMSI, and CoE, and 600 level or above in SPH).
- No course can satisfy more than 1 requirement.
- Program requirements listed before the Capstone section may be fulfilled by approved equivalent classes taken in prior education with grades of B- or better. Waiver applications are typically considered before the start of the program.
- MATH 403 can be fulfilled by EECS 203 if taken before program start.
- EECS 402 can be fulfilled by EECS 280 if taken before program start.
- EECS 403 can be fulfilled by EECS 281 if taken before program start.
- Expertise in Data Science Techniques part 1 can be fulfilled by STATS 413 if taken before program start.
- Expertise in Data Science Techniques part 2 can be fulfilled by EECS 445 if taken before program start.
Interested in applying? Read through the information about the Master's in Data Science application and our Frequently Asked Questions (FAQs). Please also review Rackham Graduate School's minimum requirements to apply, its guidance on submitting tests, and, if relevant, the Required Academic Credentials from Non-U.S. Institutions.