WalePhenomenon/MathsForML


Mathematics for Data Science and Machine Learning

This repo was created for a workshop on mathematics for data science and machine learning. As the facilitator of the workshop, I will upload the workshop materials (slide deck) and code notebook here. I will also include prerequisites and materials to download (if any). See below for the abstract and outline of the workshop.

Abstract:

The field of machine learning and data science has seen a resurgence in the last few years. The contributions of machine learning to solving data-driven problems and creating intelligent applications cannot be overemphasized. This field, which intersects statistics and probability, mathematics, computer science and algorithms, can be used to learn iteratively from complex data and uncover hidden insights. Understanding the mathematics behind machine learning allows us to choose the right algorithm for a problem, make good choices about parameter settings and validation strategies, recognize under- and over-fitting, troubleshoot ambiguous results and put appropriate confidence bounds on results.

By completing this workshop, you will develop an understanding of some of the most important mathematical concepts in machine learning and data science, and of how useful they are in practice. You will learn to use multivariate calculus to understand the foundations of feedforward neural networks, the linear algebra concepts behind dimensionality reduction, how maximum likelihood estimation can be used to derive machine learning cost functions, and the building blocks of continuous optimization.

Outline

  • Lesson 1: Multivariate Calculus and Neural Networks. Training a neural network means optimizing its parameters. In this lesson, you will familiarize yourself with using differential calculus to compute gradients of a loss function with respect to the parameters of a neural network. You will understand the building blocks of multivariate calculus: the sum rule, the product rule, the chain rule, the Jacobian and the Hessian.
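As a minimal sketch of the chain-rule computation described above (this toy example is my own, not taken from the workshop materials): a single sigmoid neuron with squared-error loss, differentiated by hand and checked against a finite-difference approximation.

```python
import numpy as np

# Toy model (assumed for illustration): L(w, b) = 0.5 * (sigma(w.x + b) - y)^2
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0])   # input
y = 1.0                    # target
w = np.array([0.5, -0.3])  # weights
b = 0.1                    # bias

# Forward pass
z = w @ x + b
a = sigmoid(z)
loss = 0.5 * (a - y) ** 2

# Backward pass via the chain rule: dL/dw = dL/da * da/dz * dz/dw
dL_da = a - y
da_dz = a * (1.0 - a)        # derivative of the sigmoid
grad_w = dL_da * da_dz * x   # dz/dw = x
grad_b = dL_da * da_dz       # dz/db = 1

# Sanity check: compare the analytic gradient of b with a finite difference
eps = 1e-6
loss_shifted = 0.5 * (sigmoid(w @ x + b + eps) - y) ** 2
num_grad_b = (loss_shifted - loss) / eps
print(np.isclose(grad_b, num_grad_b, atol=1e-4))
```

The same pattern, applied layer by layer, is backpropagation: each layer multiplies the incoming gradient by its local Jacobian.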

  • Lesson 2: From Linear Algebra to Dimensionality Reduction. The goal of dimensionality reduction is to replace a large matrix with two or more smaller matrices from which the original can be approximately reconstructed, usually by taking their product. In this lesson, we will explore the basic concepts of linear algebra: eigenvalues, eigenvectors and matrix multiplication. You will then be able to understand dimensionality reduction techniques such as Principal Component Analysis (PCA) and Singular Value Decomposition (SVD).
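The "smaller matrices whose product approximates the original" idea can be sketched with NumPy's SVD (an illustrative example of mine, not the workshop's code): keep the top-k singular values and check that the reconstruction error equals the discarded singular values, as the Eckart-Young theorem predicts.

```python
import numpy as np

# Build a small matrix of rank at most 4
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4)) @ rng.normal(size=(4, 5))

# Thin SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep the top-k singular values: two smaller factors B (6 x k) and C (k x 5)
k = 2
B = U[:, :k] * s[:k]
C = Vt[:k, :]
A_k = B @ C  # best rank-k approximation of A in the Frobenius norm

# Eckart-Young: the error is exactly the norm of the discarded singular values
err = np.linalg.norm(A - A_k, "fro")
tail = np.sqrt(np.sum(s[k:] ** 2))
print(np.isclose(err, tail))
```

PCA follows the same recipe after centering the data matrix: the right singular vectors are the principal directions, and projecting onto the top k of them gives the reduced representation.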

  • Lesson 3: Maximum Likelihood Estimation and Gradient-Based Optimization. In supervised machine learning, cost functions measure the performance of a trained model. In this lesson, you will learn how to derive the cost function for regression (mean squared error) and for binary classification (cross entropy) using maximum likelihood estimation. You will also build intuition for gradient-based optimization: its difficulties, its variants and tips for optimal performance.
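To connect the two halves of this lesson, here is a small sketch of mine (not the workshop's code): under a Gaussian noise model y = Xw + noise, maximizing the likelihood is equivalent to minimizing mean squared error, and plain gradient descent on that MSE recovers the true weights.

```python
import numpy as np

# Synthetic regression data: y = X @ w_true + small Gaussian noise
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
w_true = np.array([2.0, -1.0])
y = X @ w_true + 0.01 * rng.normal(size=200)

# Gradient descent on MSE(w) = (1/n) * ||X @ w - y||^2
# (the negative log-likelihood under Gaussian noise, up to constants)
w = np.zeros(2)
lr = 0.1
for _ in range(500):
    grad = (2.0 / len(y)) * X.T @ (X @ w - y)  # gradient of the MSE
    w -= lr * grad

print(np.allclose(w, w_true, atol=0.05))
```

Swapping the Gaussian likelihood for a Bernoulli likelihood in the same derivation yields the cross-entropy cost for binary classification.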

Bio

Adewale (Wale) Akinfaderin is a Data Scientist at Amazon Web Services. His expertise is in machine learning, deep learning, statistical experimentation and general information theory. He has broad experience implementing and extending ML techniques to solve practical and business problems. In his spare time, he conducts research on Machine Learning for the Developing World. He is a volunteer researcher on machine translation for low-resourced languages with Masakhane and a Google Developer Expert in Machine Learning.
