
End to end implementation and deployment of Machine Learning Spam Detection using python, flask, gunicorn, scikit-Learn, nltk, etc. on Heroku web application platform.


divyansh1195/ML-Spam-Detection


ML-Spam-Detection-Deployment

Kaggle | Python 3.6 | Scikit-Learn | NLTK

This repository contains the files required for the end-to-end implementation and deployment of a Machine Learning Spam Detection web application, created with Flask and deployed on the Heroku platform.


App Link

If you want to view the deployed model, click on the following link:
https://allysonspamdetector.herokuapp.com/

A glimpse of the web app:

[GIF: web app demo]

• If the web app shows the error page pictured below, the free dyno hours that Heroku provides for the current month have been used up. You can access the webpage again on the 1st of the next month.

• Sorry for the inconvenience.

[Image: Heroku-Error]

About the App

ML Spam Detection is a Flask web application that predicts whether a message is spam or not. The SMS Spam Collection dataset from Kaggle was used to classify messages into two classes, Ham (1) and Spam (0), using stemming, a Bag of Words model, and Naive Bayes classifiers.
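The stemming, Bag of Words, and Naive Bayes steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the toy messages, the string labels, and the helper names `stem_message` and `predict` are assumptions made for the example, and the real project trains on the Kaggle dataset instead.

```python
import re

from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

stemmer = PorterStemmer()


def stem_message(text):
    """Keep letters only, lowercase, and stem every remaining word."""
    words = re.sub(r"[^a-zA-Z]", " ", text).lower().split()
    return " ".join(stemmer.stem(word) for word in words)


# Toy messages invented for illustration (not taken from the Kaggle dataset).
messages = [
    "Congratulations! You have won a free prize, claim now",
    "Free entry to win cash, call now",
    "Are we still meeting for lunch today?",
    "I will call you when I get home",
]
labels = ["spam", "spam", "ham", "ham"]

# Stem each message, then turn the corpus into word-count vectors (Bag of Words).
corpus = [stem_message(message) for message in messages]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)

# Fit a Naive Bayes classifier on the word counts.
classifier = MultinomialNB()
classifier.fit(X, labels)


def predict(text):
    """Apply the same stemming and vectorization, then classify one message."""
    features = vectorizer.transform([stem_message(text)])
    return classifier.predict(features)[0]


print(predict("Claim your free cash prize now"))
```

New messages must go through the same `stem_message` and `vectorizer.transform` steps as the training data, which is why the fitted vectorizer is reused rather than refitted.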

Note: The dataset is unbalanced, so precision matters alongside accuracy. Precision focuses on the positive class rather than the negative class: of all the messages the model predicts as positive, it measures the fraction that really are positive.

Consider the following scenario: if a message is not spam but the model predicts it as spam, the consumer is going to miss that message. So, for this type of unbalanced dataset, precision, defined as TP/(TP+FP), plays an important role along with accuracy_score. My objective was to reduce the number of false positives (FP) as much as possible. To that end, two Naive Bayes classifiers, MultinomialNB and BernoulliNB, were implemented to get the best accuracy_score and precision_score on the dataset.
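A comparison of the two classifiers on accuracy and precision might look like the sketch below. The toy data is invented, and the example assumes spam is treated as the positive class, so that a false positive is a legitimate message flagged as spam, as in the scenario above; the repository's own label encoding may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Toy data invented for illustration (not taken from the Kaggle dataset).
train_texts = [
    "win a free prize now",
    "free cash claim your reward",
    "urgent call now to win",
    "see you at lunch",
    "can you call me later",
    "meeting moved to monday",
    "thanks for the help today",
    "are you home yet",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
test_texts = ["claim your free prize", "call me when you are home"]
test_labels = ["spam", "ham"]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

results = {}
for model in (MultinomialNB(), BernoulliNB()):
    model.fit(X_train, train_labels)
    predicted = model.predict(X_test)
    # With labels=["ham", "spam"], ravel() yields TN, FP, FN, TP for spam as positive.
    tn, fp, fn, tp = confusion_matrix(
        test_labels, predicted, labels=["ham", "spam"]
    ).ravel()
    results[type(model).__name__] = {
        "accuracy": accuracy_score(test_labels, predicted),
        # precision = TP / (TP + FP); zero_division=0 avoids a warning if nothing is flagged.
        "precision": precision_score(
            test_labels, predicted, pos_label="spam", zero_division=0
        ),
        "true_positives": int(tp),
        "false_positives": int(fp),
    }

for name, scores in results.items():
    print(name, scores)
```

Passing `pos_label` explicitly keeps the meaning of "false positive" unambiguous regardless of how the classes are encoded.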

The code is written in Python 3.6.10. If you don't have Python installed, you can find it here. If you are using an older version of Python, upgrade it first, and make sure you have the latest version of pip. To install the required packages and libraries, run this command in the project directory after cloning the repository:

pip install -r requirements.txt

Deployment on Heroku

Log in or sign up on Heroku in order to create a virtual app. You can either connect your GitHub profile or download the Heroku CLI to deploy this project manually.

The next step is to follow the instructions given in the Heroku documentation to deploy a web app.
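When a Flask app is served by gunicorn, gunicorn imports a WSGI application object from a Python module. A minimal sketch of such a module is shown below; the `/predict` route, the `message` form field, and the toy training data are hypothetical names chosen for illustration and are not taken from the repository.

```python
from flask import Flask, jsonify, request
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data invented for illustration; a real app would load a
# model trained on the full dataset instead of fitting at import time.
texts = [
    "win a free prize now",
    "free cash claim your reward",
    "see you at lunch",
    "can you call me later",
]
labels = ["spam", "spam", "ham", "ham"]

# Bag of Words followed by Naive Bayes, wrapped in a single pipeline.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

# The WSGI application object that a server such as gunicorn imports.
app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    """Classify the submitted message (hypothetical 'message' form field)."""
    message = request.form.get("message", "")
    label = model.predict([message])[0]
    return jsonify({"message": message, "prediction": str(label)})
```

Assuming the module is saved as `app.py`, gunicorn could serve it with `gunicorn app:app`, where the first `app` is the module name and the second is the Flask object.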

Technologies Used

Bug / Feature Request

If you find a bug (the website couldn't handle the query and/or gave undesired results), kindly open an issue here, including your search query and the expected result.

Please do ⭐ the repository if it helped you in any way.