
harvardnlp/decomp-attn


Decomposable Attention Model for Sentence Pair Classification

Implementation of A Decomposable Attention Model for Natural Language Inference (Parikh et al., EMNLP 2016; https://arxiv.org/abs/1606.01933).

The same model can be used for generic sentence pair classification tasks (e.g. paraphrase detection), in addition to natural language inference.

Data

The Stanford Natural Language Inference (SNLI) dataset can be downloaded from http://nlp.stanford.edu/projects/snli/

Pre-trained GloVe embeddings can be downloaded from http://nlp.stanford.edu/projects/glove/
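The GloVe files are plain text: one token per line, followed by its vector components separated by spaces. As a sanity check before preprocessing, you can parse a file with a small loader like the sketch below (`load_glove` is an illustrative helper, not part of this repo):

```python
import numpy as np

def load_glove(path, dim=300):
    """Parse GloVe's plain-text format: one token followed by `dim` floats per line."""
    vecs = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if len(values) != dim:  # skip malformed lines
                continue
            vecs[word] = np.array(values, dtype=np.float32)
    return vecs
```

For glove.840B.300d.txt this needs a few GB of RAM, which is why the repo converts the vectors to hdf5 once and reads them from there.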

Preprocessing

First we need to process the SNLI data:

python process-snli.py --data_folder path-to-snli-folder --out_folder path-to-output-folder

Then run:

python preprocess-entail.py --srcfile path-to-sent1-train --targetfile path-to-sent2-train \
--labelfile path-to-label-train --srcvalfile path-to-sent1-val --targetvalfile path-to-sent2-val \
--labelvalfile path-to-label-val --srctestfile path-to-sent1-test --targettestfile path-to-sent2-test \
--labeltestfile path-to-label-test --outputfile data/entail --glove path-to-glove

Here path-to-sent1-train is the path to the src-train.txt file created by process-snli.py (likewise path-to-sent2-train = targ-train.txt, path-to-label-train = label-train.txt, etc.)

preprocess-entail.py will create the data hdf5 files. The vocabulary is based on the pre-trained GloVe embeddings, with path-to-glove being the path to the pre-trained GloVe word vectors (i.e. the glove.840B.300d.txt file).
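To sanity-check the generated hdf5 files, you can list the datasets they contain with h5py. The helper below is illustrative (not part of this repo) and makes no assumption about the dataset names preprocess-entail.py writes; it simply walks the file:

```python
import h5py

def inspect_hdf5(path):
    """Return {dataset_name: (shape, dtype)} for every dataset in an hdf5 file."""
    info = {}
    with h5py.File(path, "r") as f:
        def visit(name, obj):
            if isinstance(obj, h5py.Dataset):
                info[name] = (obj.shape, str(obj.dtype))
        f.visititems(visit)
    return info
```

Printing the result is a quick way to confirm the train/val/test splits have the expected number of sentence pairs.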

For SNLI, sent1 is the premise and sent2 is the hypothesis.

Now run:

python get_pretrain_vecs.py --glove path-to-glove --outputfile data/glove.hdf5 \
--dictionary path-to-dict

path-to-dict is the *.word.dict file created from running preprocess-entail.py.

Training

To train the model, run

th train.lua -data_file path-to-train -val_data_file path-to-val -test_data_file path-to-test \
-pre_word_vecs path-to-word-vecs

Here path-to-word-vecs is the hdf5 file created from running get_pretrain_vecs.py.

You can add -gpuid 1 to use the (first) GPU.

The model essentially replicates the results of Parikh et al. (2016). The main difference is that they use asynchronous updates, while this code uses synchronous updates.
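For reference, the core of the model is the paper's attend step: soft-align each premise token against the hypothesis and vice versa. The NumPy sketch below shows just that step, omitting the feed-forward networks F, G, and H of the paper for brevity (so it is a simplification, not the trained model):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def decomposable_attend(a, b):
    """Attend step of Parikh et al. (2016), with F omitted.
    a: (la, d) premise embeddings; b: (lb, d) hypothesis embeddings."""
    e = a @ b.T                       # (la, lb) unnormalized alignment scores
    beta = softmax(e, axis=1) @ b     # (la, d) hypothesis summary per premise token
    alpha = softmax(e, axis=0).T @ a  # (lb, d) premise summary per hypothesis token
    return beta, alpha
```

Each premise token is then compared against its aligned summary (and likewise for the hypothesis), and the comparison vectors are aggregated by summation before classification.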

Predicting

To predict on new data, run

th predict.lua -sent1_file path-to-sent1 -sent2_file path-to-sent2 -model path-to-model \
-word_dict path-to-word-dict -label_dict path-to-label-dict -output_file pred.txt

This will output the predictions to pred.txt. path-to-word-dict and path-to-label-dict are the *.dict files created from running preprocess-entail.py.
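If you have gold labels for the new data, a small script can score pred.txt against them. The helper below is illustrative (not part of the repo) and assumes both files contain one label string per line in the same order:

```python
def accuracy(pred_file, gold_file):
    """Fraction of lines in pred_file that match the corresponding line in gold_file."""
    with open(pred_file) as f:
        preds = [line.strip() for line in f if line.strip()]
    with open(gold_file) as f:
        golds = [line.strip() for line in f if line.strip()]
    assert len(preds) == len(golds), "prediction and gold files differ in length"
    correct = sum(p == g for p, g in zip(preds, golds))
    return correct / len(golds)
```

For SNLI the label strings would be entailment, neutral, and contradiction.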

Contact

Written and maintained by Yoon Kim.

License

MIT

