Handwritten Digit Classification (with Custom Feature Extractor)

Classification of handwritten digits from the MNIST dataset, using only traditional machine learning techniques. This is a course project for Pattern Recognition and Machine Learning at IIT Jodhpur, taught in Semester II of the academic year 2023-24.

Prerequisites

The MNIST dataset must be downloaded before any of the files can be executed.

The dataset is available in MNIST Dataset

Download the .csv files for the training and test sets.

Create a folder named MNIST_CSV in the root of the repository, and store the two datasets inside it as MNIST_train.csv and MNIST_test.csv.
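
Once the files are in place, they can be loaded roughly as follows. This is a minimal sketch, not the notebooks' actual loading code: the function name load_mnist_csv is hypothetical, and it assumes the common MNIST CSV layout in which each row holds the label followed by 784 pixel values, optionally with a header row.

```python
import numpy as np


def load_mnist_csv(path, has_header=True):
    """Load an MNIST-style CSV file into (features, labels).

    Assumption (not confirmed by this README): each row is
    label, pixel_0, ..., pixel_783, with pixel values in [0, 255].
    """
    data = np.loadtxt(path, delimiter=",",
                      skiprows=1 if has_header else 0, ndmin=2)
    labels = data[:, 0].astype(int)
    features = data[:, 1:] / 255.0  # scale pixel values to [0, 1]
    return features, labels
```

For example, `X_train, y_train = load_mnist_csv("MNIST_CSV/MNIST_train.csv")`. Pass `has_header=False` if the downloaded files have no header row.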

Instruction regarding saving files and images

Keep pickle.dump() and plt.savefig() statements uncommented if you wish to save the trained classifiers or images.
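
For reference, saving and reloading a trained classifier with pickle generally looks like the sketch below. The helper names and the file paths in the comments are hypothetical; the notebooks may organise this differently.

```python
import pickle


def save_classifier(clf, path):
    """Serialise a trained classifier (or any picklable object) to disk."""
    with open(path, "wb") as f:
        pickle.dump(clf, f)


def load_classifier(path):
    """Load a classifier previously written by save_classifier."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Figures are saved in a similar one-line fashion with Matplotlib, e.g.
#   plt.savefig("my_plot.png", dpi=150, bbox_inches="tight")
# where "my_plot.png" is a placeholder file name.
```

Only load .pkl files from sources you trust, since unpickling can execute arbitrary code.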

Order of execution of files

1. experiment_with_classifiers - Trains different classifiers on different variations of the MNIST dataset.
2. augmentation_code - Generates variations of the training dataset to provide more training examples for the best models. Creates and saves a .feather file.
3. save_custom_transformed_data - Creates and saves two .feather files, which are used for training and testing the best models.
4. best_models - Trains the best models on the augmented dataset and saves the classifiers as .pkl files for later use.
5. prediction_real_img - Predicts two handwritten digits, 3.jpg and 7.jpg, photographed with a camera. Some preprocessing steps are applied before prediction.

The files above must be executed in the order listed; they will not work properly otherwise.
gen_augmentation_images shows the different variations of a single image after data augmentation. It can be run after file 2 (augmentation_code).
failure_case_best_model performs failure case analysis of the best model obtained after training. It can be run after file 4 (best_models).
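
To illustrate what "variations of the training dataset" can mean, the sketch below augments a set of images with one-pixel translations. This is only an assumed example: this README does not state which transformations augmentation_code applies, and the function names shift_image and augment are hypothetical.

```python
import numpy as np


def shift_image(image, dy, dx):
    """Translate a 2-D image by (dy, dx) pixels, filling vacated pixels with 0."""
    h, w = image.shape
    shifted = np.zeros_like(image)
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted


def augment(images, labels, shifts=((0, 1), (0, -1), (1, 0), (-1, 0))):
    """Return the original images plus one shifted copy per (dy, dx) in shifts.

    images has shape (n, height, width); labels has shape (n,).
    """
    all_images = [images]
    all_labels = [labels]
    for dy, dx in shifts:
        all_images.append(np.stack([shift_image(img, dy, dx) for img in images]))
        all_labels.append(labels)  # a translated digit keeps its label
    return np.concatenate(all_images), np.concatenate(all_labels)
```

With the default four shifts, a training set of n images grows to 5n images.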

Types of Datasets used for training

Original dataset (with normalisation)
Principal Component Analysis (dimensionality reduction)
Linear Discriminant Analysis (dimensionality reduction)
Edge Detector with Prewitt Kernel (feature extractor)
Custom feature extractor
Augmented Dataset for larger training set
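
As an illustration of the Prewitt edge detector in the list above, the sketch below computes a gradient-magnitude image using the standard horizontal and vertical Prewitt kernels. It is a minimal NumPy version written for this README; combining both directions into a magnitude and padding the border by edge replication are assumptions, and the notebooks may do this differently.

```python
import numpy as np

# Standard 3x3 Prewitt kernels for horizontal and vertical intensity changes.
PREWITT_X = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]], dtype=float)
PREWITT_Y = PREWITT_X.T


def filter3x3(image, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel (border replicated)."""
    h, w = image.shape
    padded = np.pad(image.astype(float), 1, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def prewitt_magnitude(image):
    """Return the Prewitt gradient magnitude of a 2-D grayscale image."""
    gx = filter3x3(image, PREWITT_X)
    gy = filter3x3(image, PREWITT_Y)
    return np.hypot(gx, gy)
```

For a 28x28 MNIST digit, `prewitt_magnitude(x.reshape(28, 28)).ravel()` yields a 784-dimensional edge feature vector that can be fed to a classifier in place of the raw pixels.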

Classifiers used

K-Nearest Neighbors
Decision Trees
Linear Regression
Naive Bayes (Gaussian and Multinomial)
Random Forest
AdaBoost
Histogram Gradient Boosting Classifier
Support Vector Machines with Radial Basis Function Kernel
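
The sketch below shows how a few of these classifiers can be trained and compared. It assumes scikit-learn, which this README does not name explicitly, and uses scikit-learn's small built-in 8x8 digits dataset as a stand-in for MNIST so that it runs without any downloads. The hyperparameters are illustrative and are not the ones used in the project.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def evaluate(classifiers, X_train, X_test, y_train, y_test):
    """Fit each classifier and return a dict mapping name to test accuracy."""
    scores = {}
    for name, clf in classifiers.items():
        clf.fit(X_train, y_train)
        scores[name] = clf.score(X_test, y_test)
    return scores


# Stand-in data: 8x8 digit images with pixel values in [0, 16].
X, y = load_digits(return_X_y=True)
X = X / 16.0  # normalise to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

classifiers = {
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=3),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
}

scores = evaluate(classifiers, X_train, X_test, y_train, y_test)
for name, acc in scores.items():
    print(f"{name}: {acc:.4f}")
```

The accuracies printed here apply to the stand-in dataset only and are not comparable to the MNIST results reported below.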

Maximum Accuracy Achieved

98.08%

Report

View the report to get an in-depth understanding of the project.

Slides

A concise presentation about the project.

Interface

The app folder contains the code for the app hosted on Hugging Face.
Link to interface.

Authors

Ankit Kumar (B22CS076)
Rishabh Acharya (B22CS090)
Pujit Jha (B22CS091)
Raj Nandan Singh (B22EE052)
Ayush Pekamwar (B22EE084)

Group No: 32
