Toxic-Comment-App

Note: This Repository is required for deployment of this project on Streamlit Cloud.

Web App Link :- https://gaurav-van-toxic-comment-web-app-app-24y37c.streamlitapp.com/

Project Repo: https://github.com/Gaurav-Van/Data_Science__Machine_Learning-Projects

Classifies comments into six categories, each with its neutral (negative) case, using NLP and ML techniques:

Toxic
Severe Toxic
Threat
Obscene
Insult
Identity Hate

Concept Used

Instead of multiclass classification, a separate binary classification is performed for each category.

1. Data Collection - From Kaggle: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge

2. Data Pre-Processing - Text Pre-Processing Using Regular Expressions

Removing \n characters
Removing Alpha-Numeric Characters
Removing Punctuation
Removing Non-ASCII Characters
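
The four cleaning steps above can be sketched with the standard `re` module. This is a minimal illustration, not the project's actual code: the function name `clean_comment` is hypothetical, "alpha-numeric characters" is assumed to mean tokens that contain digits, and the final whitespace collapse is an extra step added here for readability.

```python
import re
import string


def clean_comment(text):
    """Apply the four cleaning steps in the order listed above."""
    # 1. Replace newline characters with spaces.
    text = text.replace("\n", " ")
    # 2. Drop tokens that contain digits (assumed meaning of "alpha-numeric").
    text = re.sub(r"\w*\d\w*", " ", text)
    # 3. Replace punctuation with spaces.
    text = re.sub(f"[{re.escape(string.punctuation)}]", " ", text)
    # 4. Replace non-ASCII characters with spaces.
    text = re.sub(r"[^\x00-\x7F]+", " ", text)
    # Extra step (not in the list above): collapse repeated whitespace.
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_comment("Hello,\nworld! abc123 café")
print(cleaned)
```

Replacing with a space rather than an empty string keeps neighbouring words from being glued together when a character between them is removed.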

3. EDA - Performing data analysis to discover issues and trends in the data

Through bar charts of each category :- Problem = class imbalance -> Solution = making the frequency of 0s equal to the frequency of 1s by building a separate dataset for each category [ id, comment_text, category ].
This resolves the class imbalance and sets up the binary classification of each category.
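
One way to build such a balanced per-category dataset is to undersample the larger class. The sketch below is an assumption about how the balancing could be done, not the project's code: `balance_category` is a hypothetical helper, the rows are made-up examples, and plain lists of dictionaries are used to keep it dependency-free (the same idea applies to a DataFrame).

```python
import random


def balance_category(rows, category, seed=0):
    """Return [id, comment_text, category] rows with equal counts of 0s and 1s."""
    positives = [r for r in rows if r[category] == 1]
    negatives = [r for r in rows if r[category] == 0]
    n = min(len(positives), len(negatives))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    # Undersample whichever class is larger down to the size of the smaller one.
    kept = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(kept)
    # Keep only the three columns used for this category's binary classifier.
    return [
        {"id": r["id"], "comment_text": r["comment_text"], category: r[category]}
        for r in kept
    ]


# Made-up rows for illustration only.
rows = [
    {"id": 1, "comment_text": "first comment", "toxic": 1, "threat": 0},
    {"id": 2, "comment_text": "second comment", "toxic": 0, "threat": 0},
    {"id": 3, "comment_text": "third comment", "toxic": 0, "threat": 1},
    {"id": 4, "comment_text": "fourth comment", "toxic": 0, "threat": 0},
    {"id": 5, "comment_text": "fifth comment", "toxic": 1, "threat": 0},
    {"id": 6, "comment_text": "sixth comment", "toxic": 0, "threat": 0},
]

toxic_rows = balance_category(rows, "toxic")
threat_rows = balance_category(rows, "threat")
```

Each category gets its own dataset because the minority class size differs per category, so a single balanced dataset for all six labels is not possible.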

4. Model Building

VECTORIZATION :- Using TF-IDF and Unigram Approach
Model Used For Each Category :- KNN, Logistic Regression, SVM, CNB, BNB, DT and RF
Model Selected :- Logistic Regression

Exporting Trained ML Models as 6 pickle files [ one for each category ]

Exporting Fitted Vectorizers as 6 pickle files [ one for each category ]
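
The vectorize, train, and export steps for a single category can be sketched with scikit-learn and `pickle`. This is a minimal illustration under stated assumptions: `train_category` and `save_artifacts` are hypothetical helpers, `max_iter=1000` is an assumed setting, and the toy texts are made up. The file naming follows the `<category>_vect.pkl` / `<category>_model.pkl` pattern of the files in this repository.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def train_category(texts, labels):
    """Fit a unigram TF-IDF vectorizer and a logistic regression model."""
    vect = TfidfVectorizer(ngram_range=(1, 1))  # unigrams only
    features = vect.fit_transform(texts)
    model = LogisticRegression(max_iter=1000)  # max_iter is an assumed setting
    model.fit(features, labels)
    return vect, model


def save_artifacts(category, vect, model, out_dir):
    """Write <category>_vect.pkl and <category>_model.pkl into out_dir."""
    out_dir = Path(out_dir)
    vect_path = out_dir / f"{category}_vect.pkl"
    model_path = out_dir / f"{category}_model.pkl"
    with open(vect_path, "wb") as f:
        pickle.dump(vect, f)
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    return vect_path, model_path


# Made-up toy data standing in for one balanced per-category dataset.
texts = ["you are an idiot", "what a stupid idea",
         "have a nice day", "thanks for the help"]
labels = [1, 1, 0, 0]

vect, model = train_category(texts, labels)

# Round-trip through pickle files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    vect_path, model_path = save_artifacts("insult", vect, model, tmp)
    with open(vect_path, "rb") as f:
        loaded_vect = pickle.load(f)
    with open(model_path, "rb") as f:
        loaded_model = pickle.load(f)

original = model.predict(vect.transform(texts)).tolist()
reloaded = loaded_model.predict(loaded_vect.transform(texts)).tolist()
```

The vectorizer has to be saved alongside the model because the model only understands the vocabulary and IDF weights learned at training time. Repeating this for all six categories yields the twelve pickle files.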

5. Deployment - Building the web app with Streamlit and deploying it on Streamlit Cloud
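
A Streamlit front end for the exported pickle files might look roughly like the sketch below. It is not the contents of `App.py`: the helper names (`artifact_paths`, `load_artifacts`, `score_comment`, `main`), the widget labels, and the choice to display the positive-class probability are all assumptions made for illustration. Only the file naming pattern is taken from this repository.

```python
import pickle
from pathlib import Path

CATEGORIES = ["toxic", "severe_toxic", "threat",
              "obscene", "insult", "identity_hate"]


def artifact_paths(category, base_dir="."):
    """Return the (vectorizer, model) pickle paths for one category."""
    base = Path(base_dir)
    return base / f"{category}_vect.pkl", base / f"{category}_model.pkl"


def load_artifacts(base_dir="."):
    """Load the vectorizer and model for every category."""
    artifacts = {}
    for category in CATEGORIES:
        vect_path, model_path = artifact_paths(category, base_dir)
        with open(vect_path, "rb") as f:
            vect = pickle.load(f)
        with open(model_path, "rb") as f:
            model = pickle.load(f)
        artifacts[category] = (vect, model)
    return artifacts


def score_comment(comment, artifacts):
    """Run each category's binary classifier on a single comment."""
    scores = {}
    for category, (vect, model) in artifacts.items():
        features = vect.transform([comment])
        # Probability of the positive class (label 1) for the only row.
        scores[category] = float(model.predict_proba(features)[0][1])
    return scores


def main():
    # Imported here so the helpers above work without Streamlit installed.
    import streamlit as st

    st.title("Toxic Comment Classifier")
    comment = st.text_area("Enter a comment")
    if st.button("Classify") and comment.strip():
        artifacts = load_artifacts()
        for category, prob in score_comment(comment, artifacts).items():
            st.write(f"{category}: {prob:.2f}")


# Streamlit executes the script top to bottom, so a script like this
# would end with a bare call to main() when launched via `streamlit run`.
```

Any text cleaning applied before training would also need to be applied to the comment before scoring, so that the input matches what the vectorizers were fitted on.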
