IMDB Sentiment Analysis Model

This is a Sentiment Analysis Model built using Machine Learning and Deep Learning to classify movie reviews from the IMDB dataset into "positive" and "negative" classes.

Introduction

Sentiment Analysis is a classic field of research in Natural Language Processing, Text Analysis and Linguistics. It attempts to identify, categorize and possibly quantify the opinions expressed in a piece of text, and to determine the author's attitude toward a topic, product or situation. It is widely applied in recommender systems to predict user preferences, and on e-commerce websites to analyse customer feedback and reviews. Based on the sentiments extracted from the data, companies can better understand their customers and align their businesses accordingly.
Before the Deep Learning era, statistical methods and Machine Learning techniques were widely used for Sentiment Analysis tasks. With the growth of the datasets and text corpora available on the internet, coupled with advances in GPUs and the computational power available for these tasks, Neural Networks have vastly improved the state-of-the-art performance in various NLP tasks, and Sentiment Analysis is no exception. Recurrent Neural Networks (RNN), Gated RNNs, Long Short-Term Memory networks (LSTM) and 1D ConvNets are some classic examples of neural architectures which have been successful in NLP tasks.

Dataset

This project uses the Large Movie Review Dataset, which ships with Keras. The dataset contains 25,000 highly polar movie reviews for training and another 25,000 for testing. It contains no more than 30 reviews for any single movie, and has an equal number of positive and negative reviews in both the training and test sets. Additionally, neutral reviews (those rated 5/10 or 6/10) have been excluded. The dataset has been a benchmark for many Sentiment Analysis tasks since it was first released in 2011.
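
As a small illustration of the labelling rule described above, the sketch below maps a 1-10 star rating to a class. The cut-offs for the two polar classes (4 or lower is negative, 7 or higher is positive) are the ones documented for the original dataset. The function name `label_from_rating` is made up for this example and is not part of Keras or of this repository.

```python
from typing import Optional


def label_from_rating(stars: int) -> Optional[str]:
    """Map a 1-10 star rating to a sentiment class, or None if excluded."""
    if not 1 <= stars <= 10:
        raise ValueError("rating must be between 1 and 10")
    if stars <= 4:
        return "negative"
    if stars >= 7:
        return "positive"
    return None  # 5/10 and 6/10 are treated as neutral and left out


ratings = [1, 4, 5, 6, 7, 10]
labels = [label_from_rating(r) for r in ratings]
print(labels)
```

The copy bundled with Keras already has this filtering applied, so no such step is needed when loading it.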

Models

I built and experimented with different models to compare their performance on the dataset:

Long Short-Term Memory Network:

Recurrent Neural Networks are especially suited to sequential data (a sequence of words in this case). Unlike the more common feed-forward neural networks, an RNN does not consume an entire example in one go. Instead, it processes a sequence element by element, at each step combining the new element with the information processed so far. This is quite similar to the way humans process sentences: we read a sentence word by word in order, at each step taking in a new word and combining it with the meaning of the words read so far.
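
The element-by-element processing described above can be sketched in a few lines of plain Python. This is a toy recurrent layer with a single unit, not the Keras layer used in this project. The function name `rnn_forward` and the weights `w_x`, `w_h` and `b` are arbitrary illustrative choices.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit recurrent layer over a sequence of numbers.

    The weights are arbitrary illustrative values, not trained ones.
    """
    h = 0.0  # hidden state: a summary of everything read so far
    states = []
    for x in inputs:
        # Combine the new element with the state carried over from earlier steps.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


forward = rnn_forward([1.0, 0.0, 0.0])
backward = rnn_forward([0.0, 0.0, 1.0])
# Same elements in a different order lead to a different final state.
print(forward[-1], backward[-1])
```

Because the hidden state is fed back at every step, the final state depends on the order of the inputs and not just on which values appear.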

A diagram of a recurrent neural network

A diagram of an LSTM network.

LSTMs improve upon these vanilla RNNs. Although RNNs can in theory retain information from many time-steps earlier, in practice simple RNNs find it extremely difficult to learn long-term dependencies, especially in very long sentences and paragraphs. LSTMs have special mechanisms that allow past information to be reused at a later time. As a result, LSTMs are almost always preferable to vanilla RNNs in practice.
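
To make these "special mechanisms" a little more concrete, here is a minimal sketch of one step of a single-unit LSTM cell in its standard formulation (forget, input and output gates plus a cell state). It uses scalar weights in plain Python and is not the Keras implementation. The names `lstm_step` and `params` and the hand-picked parameter values are assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a single-unit LSTM cell; p holds (w_x, w_h, b) per gate."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))  # how much of the old cell state to keep
    i = sigmoid(pre("input"))  # how much new information to write
    g = math.tanh(pre("candidate"))  # the new information itself
    o = sigmoid(pre("output"))  # how much of the cell state to expose
    c = f * c_prev + i * g  # additive update lets old content persist
    h = o * math.tanh(c)
    return h, c


# Hand-picked (untrained) parameters: forget gate almost always open,
# input gate almost always closed.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 10.0),
}

h, c = 0.0, 1.0  # pretend an earlier step stored the value 1.0 in the cell
for x in [0.3, -0.7, 0.9] * 10:  # 30 further time-steps
    h, c = lstm_step(x, h, c, params)
print(round(c, 3))
```

With the forget gate held open and the input gate held closed, the stored value survives the 30 steps almost unchanged. A trained LSTM learns when to open and close these gates instead of having them fixed by hand.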
Here, I built an LSTM model using the Keras Sequential API. A summary of the model and its layers is given below. The model was trained with a batch size of 64, using the Adam optimizer.

A plot of the model and its layers

While tuning the hyper-parameters, a Dropout layer was introduced as a regularization measure to reduce overfitting on the training dataset. A separate validation set (taken from the training data) was used to check the performance of the model during this phase.
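
For readers unfamiliar with Dropout, the sketch below shows the idea in plain Python using the "inverted dropout" scheme: during training each activation is zeroed with probability `rate`, and the survivors are scaled up so that the expected value stays the same. The function `dropout` and the rate of 0.5 are illustrative assumptions. The rate used in this project is not stated here.

```python
import random


def dropout(values, rate, training=True, rng=random):
    """Inverted dropout: zero each value with probability `rate` while training."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(values)  # at inference time the layer is a no-op
    keep = 1.0 - rate
    # Survivors are scaled by 1/keep so the expected value is unchanged.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)
print(dropout(activations, rate=0.5, training=False))
```

Randomly silencing units in this way stops the network from relying too heavily on any single activation, which is why it helps against overfitting.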
This model achieved an accuracy of 85.91% when evaluated on the hidden test dataset (and 99.96% on the training dataset).

Convolutional Network:

Convolutional Networks are very common in Computer Vision. Using convolutional filters to extract features from the pixels of an image allows the model to identify edges, colour gradients, and even specific features such as the positions of the eyes and nose (for face images). 1D Convolutional Neural Networks have also proven quite competitive with RNNs for NLP tasks. Given a sequential input, 1D CNNs can recognize and extract local patterns in the sequence. Since the same input transformation is performed on every patch, a pattern learned at one position in the sequence can later be recognized at a different position. Further, in comparison to RNNs, ConvNets are in general extremely cheap to train computationally. In this project (built using Google Colaboratory with a GPU kernel), the LSTM model took more than 30 minutes to complete an epoch during training, while the CNN model took barely 9 seconds on average!
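
The position-independence described above can be checked with a tiny hand-written example. The sketch below applies one fixed filter to two sequences that contain the same local pattern at different positions. The helper names `conv1d` and `global_max_pool`, the filter values and the toy sequences are all illustrative assumptions and are not taken from this project's model.

```python
def conv1d(sequence, kernel):
    """Valid 1D convolution (cross-correlation, as used in neural networks)."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    ]


def global_max_pool(features):
    return max(features)


kernel = [1.0, -1.0, 1.0]  # one hand-written filter of width 3
pattern = [1.0, 0.0, 1.0]  # the local pattern it responds to

early = pattern + [0.0] * 5  # pattern at the start of the sequence
late = [0.0] * 5 + pattern  # same pattern at the end

early_map = conv1d(early, kernel)
late_map = conv1d(late, kernel)
print(early_map.index(max(early_map)), late_map.index(max(late_map)))
print(global_max_pool(early_map), global_max_pool(late_map))
```

The strongest response moves with the pattern, while the pooled value is the same in both cases. The filter weights are shared across positions, so nothing has to be relearned for each location.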
I built the model using the Keras Sequential API. A summary of the model and its layers is below.

A plot of the model and its layers

This model was trained with a batch size of 64 using the Adam optimizer. The best model (weights and architecture) was saved during this phase. This model achieved an accuracy of 89.7% on the test dataset, an improvement of nearly four percentage points over the LSTM model (85.91%).

Frameworks, Libraries & Languages

Keras
Tensorflow
Python3
Matplotlib

Usage

Run the following commands in a terminal.

Install all dependencies (Python 3 itself must already be installed; it cannot be installed through pip)
    pip install matplotlib
    pip install tensorflow
    pip install keras

Clone this repository on your system and head over to it
    git clone https://github.com/matakshay/IMDB_Sentiment_Analysis
    cd IMDB_Sentiment_Analysis

Either the CNN or the LSTM model can be used to predict the sentiment of a custom movie review.

To run the LSTM model -
    python3 LSTM_predict.py
This loads the LSTM model with its weights and prompts for an input.

To run the CNN model -
    python3 CNN_predict.py
This loads the CNN model with its weights and prompts for an input.

Type a movie review (in English) in the terminal to get its sentiment class predicted by the model.
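
Before a typed review can be fed to either model, it has to be turned into the same integer encoding as the training data. The sketch below shows roughly how that could look in plain Python, following the default conventions of the Keras IMDB dataset: index 0 for padding, 1 for the start of a review, 2 for unknown words, real words offset by 3, and padding and truncation at the front. Whether `LSTM_predict.py` and `CNN_predict.py` follow exactly these defaults is an assumption. The function `encode_review`, the toy vocabulary and the values of `num_words` and `maxlen` are hypothetical.

```python
import re

PAD, START, OOV = 0, 1, 2  # reserved indices in the Keras IMDB encoding
INDEX_FROM = 3  # real words start after the reserved indices


def encode_review(text, word_rank, num_words=10000, maxlen=8):
    """Turn raw text into a fixed-length list of word indices."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    ids = [START]
    for tok in tokens:
        rank = word_rank.get(tok)
        idx = rank + INDEX_FROM if rank is not None else OOV
        ids.append(idx if idx < num_words else OOV)
    ids = ids[-maxlen:]  # truncate from the front
    return [PAD] * (maxlen - len(ids)) + ids  # pad at the front


# Hypothetical toy vocabulary (word -> frequency rank); the real one would
# come from keras.datasets.imdb.get_word_index().
toy_rank = {"the": 1, "movie": 2, "was": 3, "great": 4}
encoded = encode_review("The movie was GREAT!", toy_rank)
print(encoded)
```

A list like `encoded` would then be wrapped in a batch of size one and passed to the loaded model, whose output is interpreted as the "positive" or "negative" class.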

Acknowledgement

I studied and referred to many articles, books and research papers while working on this project. I am especially grateful to the authors of the following for their work -

https://colah.github.io/posts/2015-08-Understanding-LSTMs/
https://medium.com/@romannempyre/sentiment-analysis-using-1d-convolutional-neural-networks-part-1-f8b6316489a2
Deep Learning with Python by François Chollet

Some other websites I referred -
