Disaster Response Pipeline Project

Deployed at Heroku

https://warm-falls-46451.herokuapp.com/

Description

Responding to disasters is a task that must be quick and efficient. During a disaster, thousands of messages flood social media and other news channels, and each one needs attention. Depending on the need expressed in a message, it is forwarded to the relevant department so that aid operations can be carried out. Response teams are usually vulnerable during a disaster, and simple keyword mapping can miss hidden nuances in a message; a classifier has to catch those nuances while remaining robust enough to make sure a message is actually aid related. This project is a deployed machine learning model that automatically classifies incoming messages.

There are 3 major components in this project:

An ETL pipeline that extracts the data, cleans it, and loads it into a Postgres database.
An ML/NLP pipeline that loads the data from the database, then trains and optimizes a model.
A web app that takes new incoming messages, feeds them to the trained model, predicts the category of each message, and displays it in the UI.
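The ETL step can be sketched roughly as follows. This is a minimal illustration, not the project's actual process_data.py: the column names (id, message, categories), the "label-0/1" category encoding, the table name "messages", and the in-memory SQLite database opened with the standard-library sqlite3 module are all assumptions made for the example.

```python
import sqlite3

import pandas as pd


def clean_data(messages: pd.DataFrame, categories: pd.DataFrame) -> pd.DataFrame:
    """Merge messages with their categories and expand labels into 0/1 columns."""
    df = messages.merge(categories, on="id")

    # Assumed encoding: "related-1;water-0" becomes one integer column per label.
    expanded = df["categories"].str.split(";", expand=True)
    expanded.columns = [value.split("-")[0] for value in expanded.iloc[0]]
    for column in expanded.columns:
        expanded[column] = expanded[column].str.split("-").str[-1].astype(int)

    df = pd.concat([df.drop(columns="categories"), expanded], axis=1)
    return df.drop_duplicates().reset_index(drop=True)


def save_data(df: pd.DataFrame, connection: sqlite3.Connection, table: str = "messages") -> None:
    """Write the cleaned frame to the database, replacing any previous table."""
    df.to_sql(table, connection, index=False, if_exists="replace")


# Toy rows standing in for the two CSV files (hypothetical data).
messages = pd.DataFrame(
    {"id": [1, 2, 2], "message": ["We need water", "Roads are blocked", "Roads are blocked"]}
)
categories = pd.DataFrame(
    {"id": [1, 2], "categories": ["related-1;water-1", "related-1;water-0"]}
)

connection = sqlite3.connect(":memory:")
cleaned = clean_data(messages, categories)
save_data(cleaned, connection)
stored = pd.read_sql("SELECT * FROM messages", connection)
```

The duplicated message with id 2 is dropped during cleaning, so both cleaned and stored hold two rows.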

Dependencies

ORM - SQLAlchemy
Language - Python 3.7.9
ML - scikit-learn, NumPy, pandas
NLP - NLTK (wordnet, punkt, stopwords, averaged_perceptron_tagger)
Web app - Flask
Visualizations - Plotly

Contents

data/process_data.py: Script that takes in the data, sends it through the ETL pipeline, and stores it in the database.
models/train_classifier.py: Script that loads the data from the database, trains the ML model, and stores it in a pickle file.
data/disaster_messages.csv and data/disaster_categories.csv: Data used to train the model, provided by FigureEight.
run.py: Flask web app.
templates/master.html: Main HTML file.
templates/go.html: HTML file that displays the fetched results.
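As a rough picture of what a training script like models/train_classifier.py does, the sketch below fits a multi-label text classifier and round-trips it through pickle. The choice of TfidfVectorizer with a MultiOutputClassifier wrapping a random forest is an assumption for illustration, not necessarily the model this project uses, and the texts and the two label columns (water, shelter) are made-up toy data.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline


def build_model() -> Pipeline:
    """Text features followed by one classifier per output category."""
    forest = RandomForestClassifier(n_estimators=10, random_state=0)
    return Pipeline(
        [
            ("tfidf", TfidfVectorizer()),
            ("clf", MultiOutputClassifier(forest)),
        ]
    )


# Hypothetical training data; label columns are (water, shelter).
texts = [
    "we need drinking water",
    "no clean water in the village",
    "our house collapsed, we need shelter",
    "families are sleeping outside without tents",
]
labels = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

model = build_model()
model.fit(texts, labels)

# The real script writes to a file such as models/classifier.pkl;
# an in-memory round trip keeps this example free of side effects.
restored = pickle.loads(pickle.dumps(model))
prediction = restored.predict(["please send water"])
```

The prediction has one row per input message and one 0/1 column per category. A hyperparameter search could be wrapped around build_model() to cover the optimizing step mentioned in the description.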

Instructions:

Clone the repository by executing:
    git clone https://github.com/siddarthaThentu/Disaster-Response-Pipeline.git
Run the following commands in the project's root directory to set up your database and model.
- To run the ETL pipeline that cleans the data and stores it in the database:
    python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
- To run the ML pipeline that trains the classifier and saves it:
    python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
Run the following command in the app's directory to start the web app:
    python run.py
Go to http://0.0.0.0:3001/
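For orientation, a web app of this kind can be reduced to a single Flask route that reads a message from the query string and returns its predicted categories. The sketch below is not the project's run.py: the "/go" route name, the JSON response (the real app renders templates/go.html), the category names, and the keyword-based classify() placeholder are all assumptions. A real app would load the pickled model and call its predict method instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical category names; the real ones come from the training data.
CATEGORY_NAMES = ["water", "shelter"]


def classify(message: str) -> dict:
    """Placeholder for model.predict: flags a category if its name appears in the text."""
    lowered = message.lower()
    return {name: int(name in lowered) for name in CATEGORY_NAMES}


@app.route("/go")
def go():
    """Classify the message passed as ?query=... and return the result as JSON."""
    query = request.args.get("query", "")
    return jsonify(query=query, classification=classify(query))


# Exercise the route without starting a server.
client = app.test_client()
response = client.get("/go", query_string={"query": "We need water"})
result = response.get_json()
```

Calling app.run(host="0.0.0.0", port=3001) would serve this route on the port given in the instructions; the call is left out here so the example does not block.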

License

Author

Siddartha Thentu

Acknowledgements

Udacity: The project was developed as part of Udacity's Data Science Nanodegree Program.
FigureEight: For providing the datasets used to train the model.

Screenshots

Sample Message

Results

Future Improvements

The training data, as seen on the homepage, is skewed across categories. Weighted training or collecting more data should reduce this bias.
The model may be affected by data or concept drift in the future as the language used in text messages changes.
Code performance can be improved by identifying bottlenecks.
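One possible form of weighted training is to weight each class by the inverse of its frequency, which is the heuristic scikit-learn applies when an estimator is given class_weight="balanced". The sketch below computes those weights for a single skewed label column; the function name and the toy labels are hypothetical, and this is not something the project currently implements.

```python
import numpy as np


def balanced_class_weights(labels: np.ndarray) -> dict:
    """Return n_samples / (n_classes * class_count) for every class in labels."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return {int(cls): float(weight) for cls, weight in zip(classes, weights)}


# Toy label column: nine negative messages and one positive message.
labels = np.array([0] * 9 + [1])
weights = balanced_class_weights(labels)
```

With this split the rare positive class gets a weight of 5.0 and the common negative class about 0.56, so errors on the rare class count roughly nine times as much during training.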
