overide/Naive-Spam-Classifier

Naive-Spam-Classifier

This is a simple spam classifier that labels an email as spam or not spam. It uses the Naive Bayes algorithm for classification, implemented from scratch ;)

Project Structure

Keep your training data in a folder named dataset. The directory structure looks like this:

dataset
|
|-- spam
|-- not_spam

The spam folder contains spam emails and the not_spam folder contains non-spam emails.
Spam email dataset - https://spamassassin.apache.org/publiccorpus/20021010_spam.tar.bz2
Non-spam email dataset - https://spamassassin.apache.org/publiccorpus/20021010_easy_ham.tar.bz2
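
As a rough illustration of how the folder layout above could be turned into training data, here is a minimal sketch. The helpers `read_subject` and `load_subjects` are hypothetical names and not part of this repository. The sketch assumes each file in spam and not_spam is one raw email, and it reads only the first line of the Subject header (folded or encoded subjects are not decoded).

```python
from pathlib import Path


def read_subject(path):
    """Return the Subject header of a raw email file, or None if absent."""
    # latin-1 maps every byte to a character, so decoding never fails.
    with open(path, encoding="latin-1") as fh:
        for line in fh:
            if not line.strip():
                break  # a blank line marks the end of the headers
            if line.lower().startswith("subject:"):
                return line[len("subject:"):].strip()
    return None


def load_subjects(dataset_dir="dataset"):
    """Build a list of (subject, is_spam) tuples from dataset_dir."""
    data = []
    for folder, is_spam in (("spam", True), ("not_spam", False)):
        for path in sorted(Path(dataset_dir, folder).iterdir()):
            if not path.is_file():
                continue
            subject = read_subject(path)
            if subject:  # skip files without a usable subject line
                data.append((subject, is_spam))
    return data
```

Calling `load_subjects("dataset")` would then give a list in the (subject, is_spam) shape described under Usage.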

Usage

The NaiveBayesClassifier class has a classify method, which takes any message and returns its probability of being spam.

>>> classifier = NaiveBayesClassifier()
>>> classifier.train(train_data)
>>> classifier.classify("Get free laptops now!")
0.99453546456

The training data is a list of tuples (subject, is_spam), where subject is the subject line of an email and is_spam is a boolean indicating whether the email is spam.
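
To make the train/classify interface more concrete, here is a minimal sketch of one way a Naive Bayes spam classifier can be written from scratch. It is not the code in this repository: the class name `NaiveBayesSketch`, the `tokenize` helper, the smoothing constant `k`, and the equal-prior assumption are all choices made for illustration. It models each known word as present or absent in a subject and combines the per-word probabilities in log space.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


class NaiveBayesSketch:
    """Illustrative word-presence Naive Bayes; not the repository's class."""

    def __init__(self, k=0.5):
        self.k = k  # additive smoothing so no probability is exactly 0 or 1
        self.spam_counts = defaultdict(int)
        self.ham_counts = defaultdict(int)
        self.spam_total = 0
        self.ham_total = 0

    def train(self, train_data):
        """train_data is a list of (subject, is_spam) tuples."""
        for subject, is_spam in train_data:
            if is_spam:
                self.spam_total += 1
            else:
                self.ham_total += 1
            counts = self.spam_counts if is_spam else self.ham_counts
            for word in tokenize(subject):
                counts[word] += 1

    def _word_prob(self, count, total):
        # Smoothed estimate of P(word appears | class).
        return (count + self.k) / (total + 2 * self.k)

    def classify(self, message):
        """Return P(spam | message), assuming equal priors for both classes."""
        words = tokenize(message)
        log_spam = log_ham = 0.0
        # Words never seen in training are ignored.
        for word in set(self.spam_counts) | set(self.ham_counts):
            p_spam = self._word_prob(self.spam_counts.get(word, 0), self.spam_total)
            p_ham = self._word_prob(self.ham_counts.get(word, 0), self.ham_total)
            if word in words:
                log_spam += math.log(p_spam)
                log_ham += math.log(p_ham)
            else:
                log_spam += math.log(1.0 - p_spam)
                log_ham += math.log(1.0 - p_ham)
        # Numerically stable logistic of the log-odds.
        diff = log_spam - log_ham
        if diff >= 0:
            return 1.0 / (1.0 + math.exp(-diff))
        e = math.exp(diff)
        return e / (1.0 + e)


train_data = [
    ("Get free laptops now", True),
    ("Free money offer", True),
    ("Meeting notes attached", False),
    ("Lunch on Friday", False),
]
sketch = NaiveBayesSketch()
sketch.train(train_data)
spam_score = sketch.classify("free laptops")
ham_score = sketch.classify("meeting notes")
```

With this toy training list, `spam_score` comes out above 0.5 and `ham_score` below it. Working in log space avoids underflow when many small probabilities are multiplied together, which matters once the vocabulary grows to thousands of words.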
