Multi-class Classification

This repo comprises the code developed for multi-class classification. The goal is to use the input variables to correctly classify or predict the target variable, which is a multi-class categorical variable. There are a total of 150 input variables in the data.
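
For a quick first look at data of this shape, a minimal pre-EDA sketch is shown below; the file name data/train.csv and the column name target are hypothetical, since the actual names are not given here.

# Hypothetical file and column names, for illustration only.
import pandas as pd

df = pd.read_csv("data/train.csv")
print(df.shape)                                                # expect 150 input columns plus the target
print(df["target"].value_counts(normalize=True))               # class balance of the multi-class target
print(df.isna().mean().sort_values(ascending=False).head(10))  # columns with the most missing values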

This repo is uploaded to GitHub to demonstrate the use of agile practices and git throughout the exercise.

To see the final report, download 'Final_Report.html'.

Table of Contents

  • 1. About the Project
  • 2. Getting Started
  • 3. Set up your environment
  • 4. Open your Jupyter notebook
  • 5. Final Report

Structuring a repository

An integral part of having reusable code is a sensible repository structure: which files we have and how we organise them.

  • Folder layout:
multiclass_classification
├── docs
├── data
├── models
│   └── models_new
├── results
├── src
│   ├── analysis
│   │   ├── __init__.py
│   │   └── analysis.py
│   ├── train
│   │   ├── __init__.py
│   │   └── train.py
│   └── Config.py
├── .gitignore
├── README.md
└── requirements.txt

1. About the Project

The following is a summary of the data science/ML/deep learning approaches used in this project (a short illustrative sketch follows the list):

  • Data import and Pre-EDA
  • Data cleaning/ Missing data imputation
  • EDA: Outlier analysis
  • EDA: Statistical/Distribution analysis
  • EDA: Feature Engineering and Selection
  • Modelling - Baseline Model
  • Modelling - Hyperparameter Tuning and Regularization
  • Modelling - Regularization
  • Model Evaluation: Selection of Metrics
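
As a quick illustration of how steps like these might fit together, here is a minimal sketch assuming scikit-learn, lightgbm, and scikit-optimize, with a hypothetical data/train.csv file and target column. It is illustrative only and not the repo's actual implementation, which lives in src/analysis and src/train.

# Illustrative sketch only; package choices, file name, and column name are assumptions.
import pandas as pd
from lightgbm import LGBMClassifier
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from skopt import BayesSearchCV
from skopt.space import Real

df = pd.read_csv("data/train.csv")                    # hypothetical file name
X, y = df.drop(columns=["target"]), df["target"]      # hypothetical target column
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=42
)

# Baseline pipeline: impute missing values, scale, reduce with PCA, then fit a baseline model.
baseline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=0.95)),                  # keep components explaining 95% of the variance
    ("model", LGBMClassifier(random_state=42)),
])
print(cross_val_score(baseline, X_train, y_train, cv=5, scoring="f1_macro").mean())

# Bayesian hyperparameter tuning of a regularized (l1/l2 elastic-net) logistic regression.
logreg = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("model", LogisticRegression(penalty="elasticnet", solver="saga", max_iter=5000)),
])
search = BayesSearchCV(
    logreg,
    {"model__C": Real(1e-3, 1e2, prior="log-uniform"),   # inverse regularization strength
     "model__l1_ratio": Real(0.0, 1.0)},                 # mix between l1 and l2 penalties
    n_iter=25, cv=5, scoring="f1_macro", random_state=42,
)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))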

2. Getting Started - Clone the repository locally

Open Git Bash at any preferred folder location and run the following command:

git clone https://github.com/gracengu/multiclass_classification.git

Alternatively, you can download the zip file of the repository from the top of the repository's main page. This is not recommended, but it is a good option if you prefer not to use git or don't have experience with it.

3. Set up your environment

Note: the following instructions are specifically for pip users only.

The best practice is to create an isolated environment to avoid dependency conflicts in Python. If this is the first
time you're setting up your compute environment, first install pip and virtualenv.

python get-pip.py
pip install virtualenv

After installation, set up the virtual environment. If you have multiple Python versions installed on your PC, you may want to specify the Python version to use.

virtualenv --python=<path of python.exe with a specific version> python3.7_multiclass

To activate your environment, run the activate script from the virtualenv you have created:

python3.7_multiclass\Scripts\activate

In requirements.txt, update the local path of the tensorflow wheel so that it points to its location on your machine, for example:

tensorflow-cpu @ file:///C:/Projects/2021/multiclass_classification/support/tensorflow_cpu-2.6.0-cp37-cp37m-win_amd64.whl

Install all of the packages listed in requirements.txt using the following command:

pip install -r requirements.txt

If there is an error installing tensorflow, comment out the tensorflow line in requirements.txt and run the following commands:

pip install .\package\tensorflow_cpu-2.6.0-cp37-cp37m-win_amd64.whl
pip install -r requirements.txt

4. Open your Jupyter notebook

  1. You will have to install a new IPython kernelspec to run the jupyter notebook in an isolated environment.
ipython kernel install --name python3.7_multiclass --user

You can change the --name to anything you want.

  2. In the terminal, execute jupyter notebook.

Navigate to the notebooks directory and open a notebook:

  • Baseline Model: Baseline Model.ipynb
  • Hyperparameter Tuning: Hyperparameter Tuning.ipynb

5. Final Report

The final report is written in the Jupyter notebook 'Final report.ipynb'. Run the notebook from start to end to generate the HTML report.
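
If you prefer to execute the notebook and produce the HTML export from a script rather than the Jupyter UI, here is a minimal sketch using nbconvert's Python API; this is an assumption about how the report can be generated, not necessarily how the repo does it, and it reuses the kernelspec name installed above.

# Minimal sketch, assuming nbformat and nbconvert are installed in the environment.
import nbformat
from nbconvert import HTMLExporter
from nbconvert.preprocessors import ExecutePreprocessor

nb = nbformat.read("Final report.ipynb", as_version=4)
# Execute all cells from start to end using the kernel installed earlier.
ExecutePreprocessor(timeout=1800, kernel_name="python3.7_multiclass").preprocess(
    nb, {"metadata": {"path": "."}}
)
# Export the executed notebook to HTML.
body, _ = HTMLExporter().from_notebook_node(nb)
with open("Final_Report.html", "w", encoding="utf-8") as f:
    f.write(body)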

About

[Completed] Complete framework on multi-class classification covering EDA using x-charts and Principal Component Analysis; machine learning algorithms using LGBM, RF, Logistic Regression and Support Vector Machines; as well as a Bayesian optimizer with l1 and l2 regularization for hyperparameter tuning.
