Naive Bayes Classifiers Comparison

This repository contains Python code that compares Naïve Bayes classifiers on the Spambase dataset. The analysis evaluates accuracy and confusion matrices across different splits of the dataset, exploring alternative classifiers for improved performance. It was completed as a Machine Learning course project at the University of Ottawa in 2023.

Required libraries: scikit-learn, pandas, matplotlib.
Run the notebook cells in a Jupyter Notebook environment.
The uploaded code was executed and tested successfully in Google Colab.

Binary-class classification problem

The task is to classify each email in the dataset into one of two classes: Spam / Not Spam.

Independent Variables:

57 features describing word frequencies, character frequencies, and capital run lengths.

Target variable:

'Target', indicating which of the two classes (Spam / Not Spam) an email belongs to.
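
As a rough illustration of this layout, the sketch below separates the 57 features from the 'Target' column with pandas. The data is randomly generated, the `feature_i` column names are hypothetical placeholders, and the 1 = Spam / 0 = Not Spam encoding is an assumption; only the 'Target' column name and the feature count come from the description above.

```python
import numpy as np
import pandas as pd

# Random stand-in data, NOT the real Spambase values.
rng = np.random.default_rng(0)

# Hypothetical feature names; the real dataset uses its own column names.
feature_names = [f"feature_{i}" for i in range(57)]
df = pd.DataFrame(rng.random((10, 57)), columns=feature_names)

# Assumed encoding: 1 = Spam, 0 = Not Spam.
df["Target"] = rng.integers(0, 2, size=10)

# Independent variables (57 columns) and target variable (1 column).
X = df.drop(columns="Target")
y = df["Target"]
```

With the real data, `df` would instead come from reading the dataset file, and the rest of the split would stay the same.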

Key Tasks Undertaken

Dataset Splitting:
- Divided the dataset into 80% training and 20% test samples and kept this split for later analysis.
Classifier Evaluation (80/20 Split):
- Computed confusion matrices and accuracy scores for the Gaussian and Multinomial Naïve Bayes classifiers on the test data.
  - Found that neither classifier predicted any spam instances, which was attributed to the unbalanced test data.
Further Evaluation:
- Re-split the data with the train-test split function, which shuffles the dataset, so that the test data no longer contained zero 'spam' instances.
Alternate Classifier Assessment:
- Evaluated the Bernoulli and Complement Naïve Bayes classifiers and compared their performance metrics with those of the Gaussian and Multinomial models.
Subset Evaluation:
- Analyzed accuracies on four subsets; performance varied because training was biased toward specific class labels.
Visualization:
- Plotted the subset accuracies as a bar chart to highlight the variation in classifier performance.
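
The splitting and evaluation steps above can be sketched roughly as follows. This is not the notebook's code: it uses a synthetic, non-negative dataset in place of Spambase, and it assumes the rows are ordered by class (spam first), which is one way an unshuffled 80/20 split can end up with no spam in the test set. The helper `evaluate` and all variable names are hypothetical.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB, ComplementNB, GaussianNB, MultinomialNB

# Synthetic stand-in for Spambase (NOT the real data): 57 non-negative
# features, rows ordered by class with spam (1) first (an assumption).
rng = np.random.default_rng(0)
n_spam, n_ham, n_features = 400, 600, 57
X = np.vstack([
    rng.poisson(3.0, size=(n_spam, n_features)),
    rng.poisson(1.0, size=(n_ham, n_features)),
]).astype(float)
y = np.array([1] * n_spam + [0] * n_ham)

# Ordered 80/20 split: the last 20% of rows are all non-spam here.
cut = int(0.8 * len(y))
X_train_ord, X_test_ord = X[:cut], X[cut:]
y_train_ord, y_test_ord = y[:cut], y[cut:]

# Shuffled 80/20 split via scikit-learn's train_test_split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0
)

classifiers = {
    "Gaussian": GaussianNB,
    "Multinomial": MultinomialNB,
    "Bernoulli": BernoulliNB,
    "Complement": ComplementNB,
}

def evaluate(X_tr, y_tr, X_te, y_te):
    """Fit each classifier and return {name: (accuracy, confusion matrix)}."""
    results = {}
    for name, cls in classifiers.items():
        y_pred = cls().fit(X_tr, y_tr).predict(X_te)
        results[name] = (
            accuracy_score(y_te, y_pred),
            confusion_matrix(y_te, y_pred, labels=[0, 1]),
        )
    return results

ordered_results = evaluate(X_train_ord, y_train_ord, X_test_ord, y_test_ord)
shuffled_results = evaluate(X_train, y_train, X_test, y_test)

for name, (acc, cm) in shuffled_results.items():
    print(f"{name}: accuracy={acc:.3f}")
    print(cm)
```

In the ordered split, the true-spam row of every confusion matrix is all zeros because the test set contains no spam, so accuracy there says nothing about spam detection. The shuffled split gives both classes a presence in the test data, which makes the four classifiers comparable.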

RimTouny/Naive-Bayes-Classifiers-Comparison

Folders and files

Latest commit

History

Repository files navigation

Naive Bayes Classifiers Comparison

Binary-class classification problem

Independent Variables:

Target variable:

Key Tasks Undertaken

About

Topics

Resources

License

Stars

Watchers

Forks

Languages