A machine learning pipeline that classifies text messages sent directly or through social media following a disaster.
- Installation
- Project Motivation
- File Descriptions
- Deployment
- Results
- Licensing, Authors, and Acknowledgements
The code has the following dependencies:
- Python3
- Flask
- numpy
- pandas
- sklearn
- catboost
- matplotlib
This project was part of the Udacity Nanodegree program. Its goal is to build a model for an API that classifies disaster messages.
For this purpose, we use a dataset from Figure Eight containing text messages labeled with different categories.
There are two CSV datasets from Figure Eight located in the data directory. disaster_categories.csv contains the categories of the messages located in disaster_messages.csv. The data directory also contains the preprocessing script for the datasets. The app folder contains the Flask web app script. Finally, the model directory contains the machine learning pipeline and the saved model classifier.pkl.
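To illustrate how the two CSV files relate, here is a minimal sketch that attaches each message's categories to the message itself. It assumes both files share an `id` column and that the other columns are named `message` and `categories`. Those column names and the sample rows are assumptions for illustration, not taken from the dataset, and the sketch uses only the standard library so it runs without the project's dependencies.

```python
import csv
import io

# Hypothetical sample rows; the real files live in the data directory.
MESSAGES_CSV = "id,message\n1,We need water\n2,Road is blocked\n"
CATEGORIES_CSV = "id,categories\n1,water\n2,infrastructure\n"


def join_on_id(messages_file, categories_file):
    """Attach each message's categories by matching on the assumed `id` column."""
    categories_by_id = {
        row["id"]: row["categories"] for row in csv.DictReader(categories_file)
    }
    # Messages without a matching id get None as their categories value.
    return [
        {**row, "categories": categories_by_id.get(row["id"])}
        for row in csv.DictReader(messages_file)
    ]


merged = join_on_id(io.StringIO(MESSAGES_CSV), io.StringIO(CATEGORIES_CSV))
for row in merged:
    print(row["id"], row["message"], "->", row["categories"])
```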
- To preprocess the data, run the following command in your data directory:
python process_data.py disaster_messages.csv disaster_categories.csv DisasterResponse
- To train and save your model, run the following command in your model directory:
python train_classifier.py ../data/DisasterResponse classifier.pkl <grid_search>
The grid_search argument is a boolean (True or False) that determines whether to run a grid search over the hyperparameters.
- Run the following command in the app directory to start the web app:
python run.py
- Go to http://0.0.0.0:3001/
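The training command above takes three positional arguments, the last of which is a True/False string. The following is a minimal sketch of how such arguments could be parsed. The helper names `parse_bool` and `parse_train_args` are hypothetical and this is not necessarily how train_classifier.py handles its arguments.

```python
def parse_bool(text):
    """Convert 'True'/'False' (case-insensitive) to a bool; reject anything else."""
    value = text.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"expected True or False, got {text!r}")


def parse_train_args(argv):
    """Split the three arguments shown in the training command (hypothetical helper)."""
    if len(argv) != 3:
        raise SystemExit(
            "usage: python train_classifier.py <database> <model_file> <grid_search>"
        )
    database_path, model_path, grid_search = argv
    return database_path, model_path, parse_bool(grid_search)


# Example with the values from the command above and grid search disabled.
args = parse_train_args(["../data/DisasterResponse", "classifier.pkl", "False"])
print(args)
```

Explicit parsing matters here because `bool("False")` evaluates to `True` in Python: any non-empty string is truthy, so a plain `bool()` conversion would always enable the grid search.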
Screenshots: front page of the web app; message genre distribution; message categories distribution.
Credit is due to the Udacity Data Scientist Nanodegree program.