UvA-DSC/20220520_conventional-machine-learning

Conventional Machine Learning

Workshop on Machine Learning organised at the Data Science Center on 20 May 2022.

Introduction

In the past decade, Deep Learning, originally a subfield of Machine Learning, has gained considerable influence and momentum. To distinguish "traditional" Machine Learning methods such as Random Forest or Regularised Regression from Deep Learning models, we decided to name this workshop "conventional Machine Learning".

Note: Machine Learning is often abbreviated as ML.

Learning objectives

  1. Become familiar with Machine Learning terminology.
  2. Learn about automated Machine Learning (autoML): choosing the best ML model.
  3. Practice autoML implementations on two simple datasets, one for classification and one for regression.
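To make "choosing the best ML model" concrete, here is a minimal sketch of model selection by k-fold cross-validation, written in plain Python. The two candidate models, the toy data and all function names are illustrative assumptions and are not part of the workshop material. AutoML tools such as TPOT automate a much richer search over whole pipelines; this sketch only shows the underlying idea of scoring candidates on held-out data and keeping the best one.

```python
from statistics import mean


def mean_model(train):
    """Baseline: always predict the mean of the training targets."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def nearest_neighbour_model(train):
    """Predict the target of the closest training point (1-NN)."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


def cross_val_error(make_model, data, k=5):
    """Mean absolute error of a model, averaged over k folds."""
    fold_errors = []
    for i in range(k):
        test = data[i::k]  # every k-th point, starting at index i
        train = [p for j, p in enumerate(data) if j % k != i]
        predict = make_model(train)
        fold_errors.append(mean(abs(predict(x) - y) for x, y in test))
    return mean(fold_errors)


def select_best(candidates, data, k=5):
    """Score every candidate and return the name with the lowest error."""
    scores = {name: cross_val_error(make, data, k)
              for name, make in candidates.items()}
    return min(scores, key=scores.get), scores


# Toy regression data (an assumption for illustration): y = 2x
data = [(x, 2.0 * x) for x in range(20)]
candidates = {"mean": mean_model, "nearest_neighbour": nearest_neighbour_model}
best_name, scores = select_best(candidates, data)
print(best_name, scores)
```

On this toy data the nearest-neighbour model wins because the targets vary strongly with x, which a constant prediction cannot capture.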

Python package

To perform autoML, we will use the TPOT Python package, which is well documented.

Installation

Option 1: To install TPOT using conda/mamba:
conda install -c conda-forge mamba && conda create --name tpot -c conda-forge jupyterlab tpot=0.11.6

This command first installs mamba, a faster alternative to conda, and then creates an environment named tpot that contains JupyterLab and TPOT 0.11.6.

Option 2: To install TPOT using pip:

  1. pip install virtualenv (if not available)
  2. virtualenv tpot_env
  3. source tpot_env/bin/activate
  4. pip install tpot==0.11.6

Link to the Python package repository of TPOT: https://pypi.org/project/TPOT/
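After either installation route, it can be useful to confirm which TPOT version the active environment actually contains. The sketch below uses only the Python standard library (importlib.metadata, available from Python 3.8). The helper name installed_version is an assumption made for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("tpot")
if version is None:
    print("TPOT is not installed in this environment.")
elif version != "0.11.6":
    print(f"TPOT {version} found; the workshop instructions install 0.11.6.")
else:
    print("TPOT 0.11.6 is installed.")
```

Run it from inside the activated environment (tpot for conda, tpot_env for virtualenv) so that it inspects the right interpreter.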

Datasets used

Each dataset is stored under 00_datasets/ as a .csv file, accompanied by a .txt file that describes it.

Breast Cancer Wisconsin Data Set (Classification problem)

Taken from the UCI Machine Learning Repository.
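A classification run with TPOT on a dataset like this one could look roughly as follows. This is a sketch under stated assumptions, not the workshop's notebook: the CSV path and target column are passed in by the caller because the file and column names are not given here, all feature columns are assumed to be numeric, and the small search settings are hypothetical values chosen only to keep the run short.

```python
# Hypothetical search settings, kept small so the search finishes quickly.
TPOT_SETTINGS = {
    "generations": 5,
    "population_size": 20,
    "cv": 5,
    "random_state": 42,
    "verbosity": 2,
}


def fit_tpot_classifier(csv_path, target_column, export_path="best_pipeline.py"):
    """Run a TPOT search on a CSV file and return accuracy on held-out data."""
    # Imported here so that this module still loads where TPOT is missing.
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from tpot import TPOTClassifier

    data = pd.read_csv(csv_path)
    X = data.drop(columns=[target_column]).to_numpy()
    y = data[target_column].to_numpy()

    # Keep 25% of the rows aside; stratify to preserve the class balance.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=42
    )

    tpot = TPOTClassifier(**TPOT_SETTINGS)
    tpot.fit(X_train, y_train)
    tpot.export(export_path)  # writes the best pipeline as a Python script
    return tpot.score(X_test, y_test)
```

Call it with the path of the breast cancer .csv file under 00_datasets/ and the name of its label column. The exported script contains plain scikit-learn code for the best pipeline found, which makes the result easy to inspect.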

Student grades (Regression problem)

Taken from the UCI Machine Learning Repository.
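For the regression problem, a comparable sketch with TPOTRegressor is shown below. It assumes the grades file may contain text columns, which are one-hot encoded because TPOT expects numeric features. The path, target column and separator are parameters because they are not stated here, and the search settings are again hypothetical. With the scoring "neg_mean_squared_error", TPOT reports a negative mean squared error, so a small helper converts it into a root mean squared error in the units of the grade.

```python
import math


def rmse_from_score(neg_mse):
    """Convert a negative mean squared error score into an RMSE."""
    return math.sqrt(-neg_mse)


def fit_tpot_regressor(csv_path, target_column, sep=","):
    """Run a TPOT search on a CSV file and return the RMSE on held-out data."""
    # Imported here so that this module still loads where TPOT is missing.
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from tpot import TPOTRegressor

    data = pd.read_csv(csv_path, sep=sep)
    y = data[target_column].to_numpy()
    # One-hot encode text columns; numeric columns pass through unchanged.
    features = pd.get_dummies(data.drop(columns=[target_column]))
    X = features.to_numpy(dtype=float)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42
    )

    tpot = TPOTRegressor(
        generations=5,           # hypothetical, small for a quick run
        population_size=20,      # hypothetical, small for a quick run
        scoring="neg_mean_squared_error",
        random_state=42,
        verbosity=2,
    )
    tpot.fit(X_train, y_train)
    return rmse_from_score(tpot.score(X_test, y_test))
```

If the file uses a separator other than a comma, pass it through the sep argument.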

References

Credits

Authors:

  • Iris van der Knaap @Library, UvA Data Science Center Digital Skills Coordinator.
  • Casper Thuis, data scientist @IBED, UvA.
  • Marc Galland, support data scientist, @SILS, UvA.

Sources of inspiration

  1. Machine Learning at the Vrije Universiteit Amsterdam: https://mlvu.github.io/
  2. Genetic programming in Python: https://towardsdatascience.com/genetic-algorithm-implementation-in-python-5ab67bb124a6

TPOT Python package

  1. TPOT home page
  2. DataCamp tutorial on TPOT

MLJAR

MLJAR GitHub repository
