Supervised Machine Learning - Classification

Goal

To explore the practical application of ML by trying to predict poisonous mushrooms, paying attention to the trade-off between accuracy and safety.

Overview

We want to keep Catalonian mushroom foragers safe from poisonous mushrooms, so our aim is to completely eliminate Type II errors (false negatives), that is, poisonous mushrooms classified as edible.
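To make the two error types concrete, here is a minimal sketch that counts them by hand. It assumes "poisonous" is treated as the positive class; the labels are made up for illustration and the function name `error_counts` is hypothetical, not taken from the notebook.

```python
def error_counts(y_true, y_pred, positive="poisonous"):
    """Count confusion-matrix cells, treating `positive` as the positive class."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            counts["tp"] += 1
        elif truth != positive and pred == positive:
            counts["fp"] += 1  # Type I error: edible mushroom flagged as poisonous
        elif truth == positive and pred != positive:
            counts["fn"] += 1  # Type II error: poisonous mushroom passed as edible
        else:
            counts["tn"] += 1
    return counts


# Made-up labels, for illustration only
y_true = ["poisonous", "edible", "poisonous", "edible", "edible"]
y_pred = ["poisonous", "poisonous", "edible", "edible", "edible"]

counts = error_counts(y_true, y_pred)
print(counts)
```

A Type I error costs a forager a good meal; a Type II error can cost a hospital visit, which is why the second count is the one to drive to zero.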

Context

In general, the aim of fine-tuning a model is to push its accuracy as close to perfect as possible. This time around, however, the emphasis is on error types and the delicate dance between accuracy and safety.

Are there any ML algorithms that by default err on the side of caution?
Can we achieve 0 hospital cases by adjusting thresholds and exploring ROC curves?
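The threshold question can be illustrated with a small pure-Python sketch. It assumes a model that outputs a probability of "poisonous" for each mushroom (1 = poisonous, 0 = edible); the probabilities and the helper names `evaluate` and `highest_safe_threshold` are invented for illustration and are not the notebook's code.

```python
def evaluate(y_true, p_poisonous, threshold):
    """Flag a sample as poisonous when P(poisonous) >= threshold, then count errors."""
    fn = fp = correct = 0
    for truth, p in zip(y_true, p_poisonous):
        pred = 1 if p >= threshold else 0
        if truth == 1 and pred == 0:
            fn += 1  # poisonous mushroom passed as edible
        elif truth == 0 and pred == 1:
            fp += 1  # edible mushroom flagged as poisonous
        else:
            correct += 1
    return {"fn": fn, "fp": fp, "accuracy": correct / len(y_true)}


def highest_safe_threshold(y_true, p_poisonous):
    """Highest threshold that still flags every poisonous sample in this data."""
    return min(p for truth, p in zip(y_true, p_poisonous) if truth == 1)


# Made-up labels and predicted probabilities, for illustration only
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
p_poisonous = [0.95, 0.80, 0.60, 0.30, 0.40, 0.35, 0.10, 0.05]

default = evaluate(y_true, p_poisonous, 0.5)
safe_t = highest_safe_threshold(y_true, p_poisonous)
safe = evaluate(y_true, p_poisonous, safe_t)

print("threshold 0.5:", default)
print("threshold", safe_t, ":", safe)
```

In this toy data, lowering the threshold removes the single false negative at the price of two false positives and a drop in accuracy. A threshold picked this way only guarantees zero false negatives on the data it was chosen from, so it should be selected on validation data and then checked on data the model has never seen.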

Task:

Import mushroom database
Explore and analyze features
Experiment with several ML models
Experiment with thresholds while keeping an eye on accuracy
Explore ROC curve
Test our algorithm on data it has never seen before
Rinse and repeat
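The steps above can be sketched end to end with scikit-learn. This is a rough outline under stated assumptions, not the notebook's actual code: the data is synthetic (two hypothetical categorical features, `odor` and `cap_color`, with a toy rule plus label noise), and the 0.3 threshold is an arbitrary illustrative value.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the mushroom data (assumption, not the real dataset)
rng = np.random.default_rng(0)
n = 400
odor = rng.choice(["none", "foul", "almond"], size=n)
cap_color = rng.choice(["brown", "red", "white"], size=n)
X = np.column_stack([odor, cap_color])
y = (odor == "foul").astype(int)        # toy rule: 1 = poisonous
noise = rng.random(n) < 0.1             # flip 10% of labels so the task is not trivial
y = np.where(noise, 1 - y, y)

# Hold out data the model never sees during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Preprocessor: one-hot encode both categorical columns (selected by index)
preprocessor = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), [0, 1])]
)
model = Pipeline([
    ("preprocess", preprocessor),
    ("classify", RandomForestClassifier(n_estimators=50, random_state=0)),
])
model.fit(X_train, y_train)

# Probability of the positive class (column 1 corresponds to label 1 = poisonous)
proba = model.predict_proba(X_test)[:, 1]

# A threshold below 0.5 errs on the side of caution; 0.3 is only illustrative
threshold = 0.3
y_pred = (proba >= threshold).astype(int)

tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
auc = roc_auc_score(y_test, proba)
print(f"false negatives={fn}, false positives={fp}, ROC AUC={auc:.3f}")
```

Keeping the encoder inside the pipeline means it is fitted on the training split only, so the held-out rows stay genuinely unseen.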

Deliverables

The Google Colab notebook for trying out different ML algorithms is found here, with a supporting Medium article that outlines my thinking process and practical takeaways in more detail here.

Skills & Tools

Data Reading & Cleaning
Data Splitting
Building a Preprocessor
LazyPredict & Modelling
Error Analysis
Thresholds and ROC Curve analysis

Note to the Reader about my choice of models to try:

My aim after running LazyPredict was to experiment with algorithms built on different mathematical foundations. RandomForest is an ensemble of decision trees, Label Propagation is a semi-supervised model that spreads labels through a similarity graph, LGBM is a gradient boosting method, KNN classifies a sample by the majority vote of its nearest neighbors, while SVC looks for the maximum-margin hyperplane that divides the data into classes. By exploring these different approaches, I was curious whether any one of them would be more or less prone to a certain error type.
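One way to check whether a model family leans towards a particular error type is to count false negatives and false positives per model on the same held-out split. The sketch below does this with scikit-learn's default settings on synthetic data; the feature names and the toy labeling rule are assumptions, LGBM is left out because it needs the separate `lightgbm` package, and the numbers it prints say nothing about the real mushroom dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.semi_supervised import LabelPropagation
from sklearn.svm import SVC

# Synthetic stand-in for the mushroom data (assumption, not the real dataset)
rng = np.random.default_rng(1)
n = 300
odor = rng.choice(["none", "foul", "almond"], size=n)
gill_size = rng.choice(["broad", "narrow"], size=n)
X = np.column_stack([odor, gill_size])
y = ((odor == "foul") | (gill_size == "narrow")).astype(int)  # 1 = poisonous
y = np.where(rng.random(n) < 0.1, 1 - y, y)                    # 10% label noise

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

# Fit the encoder on the training split only, then densify for all models
encoder = OneHotEncoder(handle_unknown="ignore").fit(X_train)
X_train_enc = encoder.transform(X_train).toarray()
X_test_enc = encoder.transform(X_test).toarray()

models = {
    "RandomForest": RandomForestClassifier(n_estimators=50, random_state=1),
    "LabelPropagation": LabelPropagation(),
    "KNN": KNeighborsClassifier(),
    "SVC": SVC(),
}

errors_by_model = {}
for name, clf in models.items():
    clf.fit(X_train_enc, y_train)
    y_pred = clf.predict(X_test_enc)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
    errors_by_model[name] = {"fn": int(fn), "fp": int(fp)}

for name, errs in errors_by_model.items():
    print(f"{name}: false negatives={errs['fn']}, false positives={errs['fp']}")
```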

Cintia0528/Data_Science-Supervised_Machine_Learning_Classification_Mushrooms
