Course: Applied Machine Learning

This course covers the applied/coding side of algorithmics in machine learning and deep learning, with a smidgen of evolutionary algorithms.

Syllabus: What is machine learning (ML), Python, applying ML, supervised learning, dimensionality reduction, unsupervised learning, deep networks, evolutionary algorithms

Expanded Syllabus

5-Min Reads

Topics

Cheat Sheets

Algorithm Pros and Cons

Resources: Evolutionary Algorithms, Machine Learning, Deep Learning


Expanded Syllabus

  • What is machine learning (ML)?
  • Basics of Python programming
  • Applying ML: evaluation, dataset splits, cross-validation, performance measures, bias/variance tradeoff, visualization, confusion matrix, choosing estimators, hyperparameter tuning, statistics (see the first sketch after this list)
  • Supervised learning: models, features, objectives, model training, overfitting, regularization, classification, regression, gradient descent, k nearest neighbors, linear regression, logistic regression, decision tree, random forest, adaptive boosting, gradient boosting, support vector machine, naïve Bayes
  • Dimensionality reduction: principal component analysis
  • Unsupervised learning: hierarchical clustering, k-means, t-SNE (see the second sketch after this list)
  • Deep networks: backpropagation, deep neural network, convolutional neural network
  • Evolutionary algorithms: genetic algorithm, genetic programming
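
As a first taste of the "Applying ML" topics, here is a minimal sketch of a dataset split, cross-validation, and a confusion matrix. It assumes scikit-learn is installed and uses its built-in iris data; the model and its parameters are illustrative defaults, not prescriptive choices.

    # Minimal sketch: dataset split, 5-fold cross-validation, confusion matrix.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import confusion_matrix

    X, y = load_iris(return_X_y=True)

    # Hold out a test set for the final performance estimate.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42)

    model = LogisticRegression(max_iter=1000)

    # Cross-validate on the training set only.
    scores = cross_val_score(model, X_train, y_train, cv=5)
    print("CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))

    # Fit on the full training set, then evaluate once on the held-out test set.
    model.fit(X_train, y_train)
    print(confusion_matrix(y_test, model.predict(X_test)))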
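
The dimensionality-reduction and unsupervised-learning topics can be tried just as briefly. This second minimal sketch projects the same iris data with PCA and then clusters it with k-means; again, scikit-learn is assumed and the settings are illustrative.

    # Minimal sketch: PCA down to 2 dimensions, then k-means clustering.
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    X, _ = load_iris(return_X_y=True)

    # Project the 4-dimensional measurements onto 2 principal components.
    X_2d = PCA(n_components=2).fit_transform(X)

    # Cluster the projected points; iris happens to contain 3 species.
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
    labels = kmeans.fit_predict(X_2d)
    print(labels[:10])
    print(kmeans.cluster_centers_)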

5-Min Reads

Here you can find short reads on various topics related to this course: machine learning, deep learning, evolutionary algorithms, artificial intelligence, and more.


Topics

(Legend: icons mark my Colab notebooks and my Medium articles.)


Cheat Sheets


Algorithm Pros and Cons

  • K-Nearest Neighbors
    ✔ Simple, No training phase, No assumptions about the data, Easy to implement, New data can be added seamlessly, Only one hyperparameter
    ✖ Doesn't work well in high dimensions, Sensitive to noisy data and outliers, Doesn't work well with large datasets (computing distances is costly), Needs feature scaling, Doesn't work well on imbalanced data, Doesn't deal well with missing values

  • Decision Tree
    ✔ Doesn't require standardization or normalization, Easy to implement, Can handle missing values, Automatic feature selection
    ✖ High variance, Higher training time, Can become complex, Can easily overfit

  • Random Forest
    ✔ Left-out data can be used for testing, High accuracy, Provides feature importance estimates, Can handle missing values, Doesn't require feature scaling, Good performance on imbalanced datasets, Can handle large datasets, Outliers have little impact, Less overfitting
    ✖ Less interpretable, Requires more computational resources, High prediction time

  • Linear Regression
    ✔ Simple, Interpretable, Easy to implement
    ✖ Assumes a linear relationship between features and target, Sensitive to outliers

  • Logistic Regression
    ✔ Doesn’t assume linear relationship between independent and dependent variables, Output can be interpreted as probability, Robust to noise
    ✖ Requires more data, Only effective when classes are linearly separable

  • Lasso Regression (L1)
    ✔ Prevents overfitting, Selects features by shrinking coefficients to zero
    ✖ Coefficient estimates of selected features are biased, Prediction can be worse than Ridge

  • Ridge Regression (L2)
    ✔ Prevents overfitting
    ✖ Increases bias, Less interpretable

  • AdaBoost
    ✔ Fast, Reduced bias, Little need to tune
    ✖ Vulnerable to noise, Can overfit

  • Gradient Boosting
    ✔ Good performance
    ✖ Harder to tune hyperparameters

  • XGBoost
    ✔ Less feature engineering required, Outliers have little impact, Can output feature importance, Handles large datasets, Good model performance, Less prone to overfitting
    ✖ Difficult to interpret, Harder to tune as there are numerous hyperparameters

  • SVM
    ✔ Performs well in higher dimensions, Excellent when classes are separable, Outliers have less impact
    ✖ Slow, Poor performance with overlapping classes, Selecting appropriate kernel functions can be tricky

  • Naïve Bayes
    ✔ Fast, Simple, Requires less training data, Scalable, Insensitive to irrelevant features, Good performance with high-dimensional data
    ✖ Assumes independence of features

  • Deep Learning
    ✔ Superb performance with unstructured data (images, video, audio, text)
    ✖ (Very) long training time, Many hyperparameters, Prone to overfitting
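
To make these trade-offs concrete, here is a minimal sketch that runs several of the listed classifiers on a single dataset and reports cross-validated accuracy. It assumes scikit-learn and uses its built-in breast-cancer data; the models are left at illustrative default settings rather than tuned.

    # Minimal sketch: compare several classifiers with 5-fold cross-validation.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                                  GradientBoostingClassifier)
    from sklearn.svm import SVC
    from sklearn.naive_bayes import GaussianNB

    X, y = load_breast_cancer(return_X_y=True)

    models = {
        # KNN and SVM need feature scaling, per the notes above.
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
        "Random Forest": RandomForestClassifier(random_state=0),
        "AdaBoost": AdaBoostClassifier(random_state=0),
        "Gradient Boosting": GradientBoostingClassifier(random_state=0),
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "Naive Bayes": GaussianNB(),
    }

    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name:18s} {scores.mean():.3f} +/- {scores.std():.3f}")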


Resources: Evolutionary Algorithms, Machine Learning, Deep Learning

Vids

Basic Reads

Advanced Reads

Books (🡇 means free to download)

Software

Datasets