bmoretz/MS-DataScience

Portfolio of course work for my Master's in Data Science.

Program Overview:

The integration of data science and business strategy has created a demand for professionals who can make data-driven decisions that propel their organizations forward. You can build the essential analysis and leadership skills needed for careers in today's data-driven world in Northwestern’s online Master of Science in Data Science program.

MSDS students gain critical skills for succeeding in today's data-intensive world. They learn how to utilize relational and document database systems and analytics software built upon open-source systems such as R, Python, and TensorFlow. They learn how to make trustworthy predictions using traditional statistics and machine learning methods.

Choose a general data science track or one of four specializations: Analytics and Modeling, Artificial Intelligence, Data Engineering, or Analytics Management. The specializations are designed to foster individual career growth based on your professional goals. You can further customize your studies with a wide range of elective courses, including financial and risk analytics, artificial intelligence and deep learning, analytics systems analysis, and information retrieval and real-time analytics.

For more information, please visit the Northwestern Master's in Data Science page. My specialization is Analytics and Modeling; a brief overview is provided below.

Analytics and Modeling Specialization

In the world of data science, analysts and modelers specialize in testing real-world predictions about data. They conduct research and take complex factors into account to build predictive models and create forecasts upon which data-driven decisions can be made. With a focus on traditional methods of applied statistics, this specialization prepares data scientists to utilize algorithms for predictive modeling and analytics, developing models for marketing, finance, and other business applications.

MSDS 410 - Data Modeling for Supervised Learning

This course introduces traditional statistics and data modeling for supervised learning problems, as employed in observational and experimental research. With supervised learning there is a clear distinction between explanatory and response variables. The objective is to predict responses, whether they be quantitative as with multiple regression or categorical as with logistic regression and multinomial logit models. Students work on research and programming assignments, exploring data, identifying appropriate models, and validating models. They utilize techniques for observational and experimental research design, data visualization, variable transformation, model diagnostics, and model selection.  

MSDS 411 - Data Modeling for Unsupervised Learning

This course introduces data modeling for studies in which there is no clear distinction between explanatory and response variables. The objective may be to explain relationships among many continuous variables in terms of underlying dimensions, latent variables, or factors, as with principal components and factor analysis. The objective may be to find a lower-dimensional representation for multivariate cross-classified data, as with log-linear models. The objective may be to construct a visualization of variables or objects, as with traditional multidimensional scaling and t-distributed stochastic neighbor embedding. Or the objective may be to identify groups of variables and/or objects that are similar to one another, as with cluster analysis and biclustering. Students work on research and programming assignments, exploring multivariate data and methods.

MSDS 432 - Foundations of Data Engineering

This course provides an overview of the discipline of data engineering. It introduces software and systems for data science and software development as required in the design of data-intensive applications. Students learn about algorithms, data structures, and technologies for storing and processing data. Students gain experience with open-source software, text editors, and integrated development environments. Students employ best practices in software development, utilizing tools for syntax checking, testing, debugging, and version control. The course also introduces formal models, simulations, and benchmark experiments for evaluating software, systems, and processes.

MSDS 451 - Financial and Risk Analytics

Building upon probability theory and inferential statistics, this course provides an introduction to risk analytics. Examples from economics and finance show how to incorporate risk within regression and time series models. Monte Carlo simulation is used to demonstrate how variability in data affects uncertainty about model parameters. Additional topics include subjectivity in risk analysis, causal modeling, stochastic optimization, portfolio analysis, and risk model evaluation.
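
As a rough illustration of the Monte Carlo idea mentioned above, the following sketch simulates many datasets from a known linear model and refits the slope each time, so the spread of the fitted slopes shows how noise in the data turns into uncertainty about a parameter. It is not taken from the course materials; the model, the noise levels, and the names `fit_slope` and `simulate_slopes` are assumptions made for this example.

```python
import random
import statistics


def fit_slope(xs, ys):
    """Ordinary least squares slope for a one-predictor regression."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def simulate_slopes(noise_sd, n_obs=50, n_sims=2000, true_slope=2.0, seed=0):
    """Refit the slope on n_sims simulated datasets (hypothetical setup)."""
    rng = random.Random(seed)
    # Fixed design: evenly spaced predictor values on [0, 1].
    xs = [i / (n_obs - 1) for i in range(n_obs)]
    slopes = []
    for _ in range(n_sims):
        # Assumed data-generating model: y = 1 + true_slope * x + Gaussian noise.
        ys = [1.0 + true_slope * x + rng.gauss(0.0, noise_sd) for x in xs]
        slopes.append(fit_slope(xs, ys))
    return slopes


low_noise = simulate_slopes(noise_sd=0.5)
high_noise = simulate_slopes(noise_sd=2.0)

sd_low = statistics.stdev(low_noise)
sd_high = statistics.stdev(high_noise)

print(f"mean slope (low noise):  {statistics.fmean(low_noise):.3f}")
print(f"slope spread, noise sd 0.5: {sd_low:.3f}")
print(f"slope spread, noise sd 2.0: {sd_high:.3f}")
```

The slope estimates center near the true value in both runs, but their spread grows with the noise level, which is the sense in which variability in the data drives uncertainty about the model parameter.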

Capstone

Quantum Capital

The main objective of the project is to model and forecast the price movements of energy commodities, including gasoline, heating oil, crude oil (WTI and Brent), and natural gas. We will develop a proprietary data source derived from publicly available information published to the primary commodity exchanges. Using this bespoke dataset, we will apply our domain knowledge and our analytical and quantitative capabilities to develop predictive models of the financial markets for energy commodities.