Natural-Language-Inference

This repository contains a PyTorch implementation of LSTM and self-attention models for the SNLI dataset, developed as part of the final project for the LUS course. The self-attention model follows the paper A Decomposable Attention Model for Natural Language Inference: https://arxiv.org/abs/1606.01933
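
For orientation, the attend, compare, and aggregate steps of the cited paper can be summarized as below. This is a summary of the paper's formulation, not a description of the code in model.py, which may differ in details. Here \bar{a}_i and \bar{b}_j are the embedded premise and hypothesis tokens, and F, G, H are feed-forward networks.

```latex
\begin{align*}
e_{ij} &= F(\bar{a}_i)^{\top} F(\bar{b}_j) && \text{(attend)}\\
\beta_i &= \sum_{j=1}^{\ell_b} \frac{\exp(e_{ij})}{\sum_{k=1}^{\ell_b} \exp(e_{ik})}\, \bar{b}_j, \qquad
\alpha_j = \sum_{i=1}^{\ell_a} \frac{\exp(e_{ij})}{\sum_{k=1}^{\ell_a} \exp(e_{kj})}\, \bar{a}_i \\
v_{1,i} &= G([\bar{a}_i, \beta_i]), \qquad v_{2,j} = G([\bar{b}_j, \alpha_j]) && \text{(compare)}\\
v_1 &= \sum_{i=1}^{\ell_a} v_{1,i}, \qquad v_2 = \sum_{j=1}^{\ell_b} v_{2,j}, \qquad \hat{y} = H([v_1, v_2]) && \text{(aggregate)}
\end{align*}
```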

Project structure:

folder data: contains SNLI dataset and embedding matrices
folder preprocessed: contains data after preprocessing
folder save_model: contains models after training
file preprocess.py: code to preprocess the data, including extracting the data, loading the word embedding matrix, etc.
file dataset.py: a class to handle dataset for training
file model.py: a class describing network architectures
file train.py: contains functions for the training and testing procedures
file main.py: main file to run program

How to run

Download data

First, download the SNLI data and the embedding matrices:

Download the SNLI dataset from https://nlp.stanford.edu/projects/snli/snli_1.0.zip and the GloVe matrices from http://nlp.stanford.edu/data/glove.6B.zip and http://nlp.stanford.edu/data/glove.42B.300d.zip, then extract all of them into folder data.

If you don't want to do this manually, I also provide a script to download and extract the data; just run the following command:

python main.py --download_data
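
For readers who want to see what the manual steps amount to, here is a minimal sketch using only the Python standard library. It is not the repository's own download code; the function names `extract_zip` and `download_and_extract` are hypothetical, and the only details taken from this README are the three URLs and the target folder data.

```python
import urllib.request
import zipfile
from pathlib import Path

# URLs copied from the instructions above.
DATA_URLS = [
    "https://nlp.stanford.edu/projects/snli/snli_1.0.zip",
    "http://nlp.stanford.edu/data/glove.6B.zip",
    "http://nlp.stanford.edu/data/glove.42B.300d.zip",
]


def extract_zip(zip_path, dest_dir="data"):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


def download_and_extract(url, dest_dir="data"):
    """Download one archive into dest_dir (skipped if already there), then extract it."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    zip_path = dest / url.rsplit("/", 1)[-1]  # e.g. data/snli_1.0.zip
    if not zip_path.exists():
        urllib.request.urlretrieve(url, str(zip_path))
    return extract_zip(zip_path, dest)


# Usage (downloads several gigabytes, so it is left commented out):
# for url in DATA_URLS:
#     download_and_extract(url)
```

The archives are large, so the sketch skips a download when the zip file is already present in the target folder.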

Preprocessing

Before preprocessing, decide which word embedding matrix you want to use. By default, with the data downloaded above, you can choose one of {6B.50d, 6B.100d, 6B.200d, 6B.300d, 42B.300d} when running:

python main.py --preprocess_data --embedding=6B.300d

This step takes several minutes. If you want to use other embedding matrices, download them from https://nlp.stanford.edu/projects/glove/ and repeat the steps above.
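
As an illustration of what loading an embedding matrix involves, the sketch below parses the GloVe text format, where each line holds a word followed by its space-separated vector values. It is not the code in preprocess.py; `load_glove` and `glove_path` are hypothetical helpers, and the assumption that the `--embedding` value maps to a file named `glove.<embedding>.txt` inside folder data is based on the file names in the GloVe archives.

```python
def glove_path(embedding, data_dir="data"):
    """Map an --embedding value such as '6B.300d' to a GloVe file path (assumed layout)."""
    return f"{data_dir}/glove.{embedding}.txt"


def load_glove(lines, vocab=None):
    """Parse GloVe text lines ('word v1 v2 ...') into a {word: [float, ...]} dict.

    If vocab is given, only words contained in it are kept, which keeps
    memory use down when the training vocabulary is small.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        word, values = parts[0], parts[1:]
        if vocab is not None and word not in vocab:
            continue
        vectors[word] = [float(v) for v in values]
    return vectors


# Usage, assuming the archives were extracted into folder data:
# with open(glove_path("6B.300d"), encoding="utf-8") as fh:
#     vectors = load_glove(fh, vocab={"a", "man", "is", "sleeping"})
```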

After preprocessing, the output files are in folder preprocessed, which contains:

files premise_.txt, hypothesis_.txt, label_.txt: data in text format, each line corresponds to 1 sample.
files word.dict, label.dict and POS.dict: dictionaries for the tokens in the training set, the class labels, and the POS tags, respectively.
files train.hdf5, test.hdf5, test.hdf5: 3 sets in the hdf5 format.

Train models

The main arguments of the program are:

use_POS: use POS tags as an additional feature
model_type: which kind of model to use: ['attention', 'lstm', 'combine']
hidden_dim: size of hidden dimension, default 200
learning_rate: default 2e-4
dropout_rate: default 0.2
max_epochs: maximum number of epochs to train, default 100
gpu: use GPU to train

An example of training a self-attention network with POS tags on GPU:

python main.py --gpu --model_type=attention --use_POS --hidden_dim=200 --max_epochs=100
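
The arguments listed above could be declared with `argparse` roughly as follows. This is a sketch, not the actual parser in main.py: the flag names and the stated defaults come from this README, while `build_parser`, the help strings, and the default for `--model_type` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the training arguments listed in this README (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Train an NLI model on SNLI")
    parser.add_argument("--use_POS", action="store_true",
                        help="use POS tags as an additional feature")
    # The README does not state a default model type; 'attention' is an assumption.
    parser.add_argument("--model_type", choices=["attention", "lstm", "combine"],
                        default="attention", help="which kind of model to use")
    parser.add_argument("--hidden_dim", type=int, default=200,
                        help="size of the hidden dimension")
    parser.add_argument("--learning_rate", type=float, default=2e-4)
    parser.add_argument("--dropout_rate", type=float, default=0.2)
    parser.add_argument("--max_epochs", type=int, default=100,
                        help="maximum number of epochs to train")
    parser.add_argument("--gpu", action="store_true", help="use GPU to train")
    return parser


# Parsing the example command shown above:
args = build_parser().parse_args(
    "--gpu --model_type=attention --use_POS --hidden_dim=200 --max_epochs=100".split()
)
print(args.model_type, args.hidden_dim, args.learning_rate)
```

Options not given on the command line, such as `learning_rate` and `dropout_rate` here, fall back to the defaults listed above.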

License

MIT
