Real or Not? NLP with Disaster Tweets.

Classifying whether a disaster tweet is real or not using RNN with LSTM and GloVe word embeddings. The model gave an accuracy of 80% on both train and validation data set with learning rate 5e-5, predicting whether a given tweet is about a real disaster or not. If so, predicted as 1. If not, predicted as 0. The datasets have been taken from Kaggle Data sets

The kaggle notebook for running file can be viewed here

Each sample in the train and test set has the information about the text of a tweet, A keyword from that tweet (although this may be blank!) and The location the tweet was sent from (may also be blank)

CSV data set has Columns as:

id - a unique identifier for each tweet text - the text of the tweet location - the location the tweet was sent from (may be blank) keyword - a particular keyword from the tweet (may be blank) target - in train.csv only, this denotes whether a tweet is about a real disaster (1) or not (0)

EDA performed on data sets are

1. Data processing

1.1 Handling Misspelled data

1.2 Handling Contractions

1.3 Replacing Abbreviations

1.4 Visualizing length of tweets

1.5 Visualizing word count in each tweet

1.6 Collecting all words

2. Visualizing and data attributes

2.1 Viewing most common stop words used in tweets

2.2 Viewing Punctuations in tweets

2.3 Viewing Common words in tweets

2.4 N-gram analysis

3. Data cleaning

3.1 Cleaning URLs and HTML tags

3.2 Cleaning Punctuations and emojis

3.3 Cleaning stop words

3.4 Using Glo-Ve for word embeddings

3.5 Train-Test split

4. Creating Model

4.1 LSTM Model with Glove Embeddings

4.2 Plotting accuracy and loss curves

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
Abbreviations and Slang.csv		Abbreviations and Slang.csv
LICENSE		LICENSE
README.md		README.md
contractions.csv		contractions.csv
loss_acc_plot.png		loss_acc_plot.png
nlp-on-disaster-tweet.ipynb		nlp-on-disaster-tweet.ipynb
test.csv		test.csv
train.csv		train.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Abbreviations and Slang.csv

Abbreviations and Slang.csv

LICENSE

LICENSE

README.md

README.md

contractions.csv

contractions.csv

loss_acc_plot.png

loss_acc_plot.png

nlp-on-disaster-tweet.ipynb

nlp-on-disaster-tweet.ipynb

test.csv

test.csv

train.csv

train.csv

Repository files navigation

Real or Not? NLP with Disaster Tweets.

EDA performed on data sets are

About

Releases

Packages

Languages

License

naureen20/Real-or-Not-NLP-with-Disaster-Tweets

Folders and files

Latest commit

History

Repository files navigation

Real or Not? NLP with Disaster Tweets.

EDA performed on data sets are

About

Topics

Resources

License

Stars

Watchers

Forks

Languages