Notes on machine learning: Theory and examples.
Notes on machine learning (ML) provides a summary of the theory, with examples in R and Python. There are great books that introduce the theory of ML and statistical learning [1]-[3], and a large number of sites and blogs offer examples. Many blogs are a great source of implementations, but in many cases the mathematical descriptions and the links between theory and examples are missing.
This repository gathers theory and examples from the online Statistical Learning course by T. Hastie and R. Tibshirani (Stanford University), several books on statistical learning and machine learning [1]-[3], and examples drawn from a variety of sources (Scikit-Learn: Machine Learning in Python, the R examples from the Statistical Learning course, among others). Examples are provided using Google Colab cloud services.
The first part of this repository comprises theory and examples on statistical learning and machine learning:
- Theory: Theory and definitions on statistics, information theory and machine learning.
- Descriptive statistics/Exploratory analysis
  - Exploratory image analysis - Density plots: Colab notebook, Medium article.
  - Exploratory image analysis - Projection embeddings on TensorBoard: Colab notebook, Medium article.
- Classification
  - Classification: Introduction to classification methods (logistic regression, LDA, QDA).
  - Classification in R - Example S&P 500: Predict the direction of the percentage return of the S&P 500 using logistic regression, LDA and QDA.
  - Classification in R - Example Caravan insurance: Predict which customers buy insurance using logistic regression and K-NN.
  - Classification in Python - Example iris dataset: Comparison of a wide range of algorithms: logistic regression, K-NN, SVC, decision trees, random forest and ensemble methods (voting classifier and AdaBoost).
- Markov random fields: Introduction to undirected graphical models, or Markov Random Fields (MRF).
- Model selection, assessment and resampling methods: Validation strategies, cross-validation and bootstrap methods.
- Setup for using R in Google Colab: setup_for_using_R
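
As a rough illustration of the resampling ideas listed above (cross-validation and the bootstrap), here is a minimal sketch using only the Python standard library. It is not taken from the notebooks in this repository; the function names `k_fold_indices` and `bootstrap_std_error`, the seeds and the sample data are all assumptions made for this example.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled, disjoint validation folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def bootstrap_std_error(data, statistic, n_resamples=1000, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Each bootstrap resample has the same size as the original data.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    # Hypothetical toy data, used only to show how the helpers are called.
    folds = k_fold_indices(n_samples=10, k=3)
    for i, validation in enumerate(folds):
        training = [j for f in folds if f is not validation for j in f]
        print(f"fold {i}: train={sorted(training)} validation={sorted(validation)}")

    sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]
    print("bootstrap SE of the mean:", bootstrap_std_error(sample, statistics.mean))
```

In practice a library routine (for example the splitters in scikit-learn) would normally replace the hand-written fold logic; the sketch only makes explicit that each observation is held out exactly once in k-fold cross-validation, while the bootstrap draws with replacement.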
[1] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The elements of statistical learning, Springer New York Inc., Springer Series in Statistics, 2001.
[2] Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An introduction to statistical learning with applications in R, Springer New York Heidelberg Dordrecht London, ISBN 978-1-4614-7137-0, 2013.
[3] Christopher M. Bishop. Pattern recognition and machine learning, Springer Science+Business Media, LLC, 2006.