Naive-Spam-Classifier

This is a simple spam classifier that labels an email as either spam or not spam. It uses the Naive Bayes algorithm for classification, implemented from scratch ;)

Project Structure

Keep your training data in a folder named dataset. The directory structure looks like this:

dataset
|
|-- spam
|-- not_spam

The spam folder contains spam emails and the not_spam folder contains non-spam emails.
Spam email dataset: https://spamassassin.apache.org/publiccorpus/20021010_spam.tar.bz2
Non-spam email dataset: https://spamassassin.apache.org/publiccorpus/20021010_easy_ham.tar.bz2
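
As a rough illustration of how the folder layout above could be turned into training data, the following sketch reads each file in dataset/spam and dataset/not_spam and keeps only the Subject header. The helper names load_subjects and build_train_data are hypothetical and are not taken from this repository; the sketch also assumes each file is a single raw email message, as in the SpamAssassin public corpus.

```python
import email
from pathlib import Path


def load_subjects(folder, is_spam):
    """Return (subject, is_spam) tuples for the files in `folder` (hypothetical helper)."""
    examples = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        # latin-1 maps every byte to a character, so decoding never raises.
        with open(path, encoding="latin-1") as handle:
            message = email.message_from_file(handle)
        subject = message.get("Subject")
        # Skip files that carry no Subject header at all.
        if subject is not None:
            examples.append((str(subject), is_spam))
    return examples


def build_train_data(root="dataset"):
    """Combine both folders into one list of labelled subjects (hypothetical helper)."""
    root = Path(root)
    return load_subjects(root / "spam", True) + load_subjects(root / "not_spam", False)
```

Calling build_train_data() from the project root would give a list in the (subject, is_spam) shape described under Usage.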

Usage

The NaiveBayesClassifier class has a classify method, which takes a message and returns its probability of being spam.

>>> classifier = NaiveBayesClassifier()
>>> classifier.train(train_data)
>>> classifier.classify("Get free laptops now!")
0.99453546456

The training data is a list of (subject, is_spam) tuples, where subject is the subject line of an email and is_spam is a boolean indicating whether that email is spam.
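
To make the idea concrete, here is a minimal sketch of a Naive Bayes classifier that exposes the same train/classify interface. It is not the code in naive_bayes_classifier.py. The class name SketchNaiveBayes, the tokenize helper, the smoothing parameter k, and the choice of a Bernoulli word-presence model with add-k smoothing are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its set of distinct words (hypothetical helper)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


class SketchNaiveBayes:
    """Illustrative Bernoulli Naive Bayes; not the repository's implementation."""

    def __init__(self, k=1.0):
        self.k = k  # add-k smoothing constant (assumed default)
        self.spam_counts = Counter()  # word -> number of spam subjects containing it
        self.ham_counts = Counter()   # word -> number of non-spam subjects containing it
        self.n_spam = 0
        self.n_ham = 0

    def train(self, train_data):
        for subject, is_spam in train_data:
            if is_spam:
                self.n_spam += 1
                self.spam_counts.update(tokenize(subject))
            else:
                self.n_ham += 1
                self.ham_counts.update(tokenize(subject))

    def _word_prob(self, count, total):
        # Smoothed estimate of P(word present | class).
        return (count + self.k) / (total + 2 * self.k)

    def classify(self, message):
        words = tokenize(message)
        total = self.n_spam + self.n_ham
        # Start from smoothed log priors, then add one log-likelihood term per known word.
        log_spam = math.log((self.n_spam + self.k) / (total + 2 * self.k))
        log_ham = math.log((self.n_ham + self.k) / (total + 2 * self.k))
        for word in set(self.spam_counts) | set(self.ham_counts):
            p_spam = self._word_prob(self.spam_counts[word], self.n_spam)
            p_ham = self._word_prob(self.ham_counts[word], self.n_ham)
            if word in words:
                log_spam += math.log(p_spam)
                log_ham += math.log(p_ham)
            else:
                log_spam += math.log(1.0 - p_spam)
                log_ham += math.log(1.0 - p_ham)
        # Convert the log-odds to P(spam | message) without overflowing exp().
        diff = log_spam - log_ham
        if diff >= 0:
            return 1.0 / (1.0 + math.exp(-diff))
        return math.exp(diff) / (1.0 + math.exp(diff))
```

Working in log space avoids underflow when many small probabilities are multiplied together, and the smoothing keeps a word seen in only one class from forcing the result to exactly 0 or 1.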
