Decomposable Neural Network Models for Natural Language Inference

This code is a TensorFlow implementation of the models described in "A Decomposable Attention Model for Natural Language Inference" and "Enhancing and Combining Sequential and Tree LSTM for Natural Language Inference". For the latter, only the sequential model is implemented.

This architecture is composed of three main steps:

Align. This step finds word-level alignments between the two sentences. The words in one sentence are compared to the words in the other one, possibly considering their contexts.
Compare. Each word is paired with a representation of the words it is aligned to. This representation is obtained by combining word embeddings, weighted by the strength of the alignment. Neural networks process these combinations.
Aggregate. All word-alignment pairs are combined for a final decision with respect to the relation between the two sentences.
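
The three steps above can be sketched in a few lines of plain Python. This is only an illustration of the data flow, not the code in this repository: the function names are hypothetical, the learned feed-forward networks are replaced by identity or concatenation stand-ins, and the alignment scores are assumed to be plain dot products between word vectors.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors))
            for k in range(dim)]


def align(a, b):
    """Soft-align every word of a with b, and every word of b with a."""
    # Assumption: raw dot products stand in for the learned scoring network.
    scores = [[dot(ai, bj) for bj in b] for ai in a]
    beta = [weighted_sum(softmax(row), b) for row in scores]
    columns = [list(col) for col in zip(*scores)]
    alpha = [weighted_sum(softmax(col), a) for col in columns]
    return beta, alpha


def compare(words, aligned):
    """Pair each word with the summary of the words it is aligned to."""
    # Assumption: plain concatenation stands in for the comparison network.
    return [w + s for w, s in zip(words, aligned)]


def aggregate(v1, v2):
    """Sum the comparison vectors of each sentence and concatenate them."""
    sum1 = [sum(col) for col in zip(*v1)]
    sum2 = [sum(col) for col in zip(*v2)]
    return sum1 + sum2  # a classifier would take this vector as input


# Toy embeddings: two words in sentence 1, one word in sentence 2.
sentence1 = [[1.0, 0.0], [0.0, 1.0]]
sentence2 = [[1.0, 0.0]]
beta, alpha = align(sentence1, sentence2)
features = aggregate(compare(sentence1, beta), compare(sentence2, alpha))
```

In a trained model, each stand-in would be a neural network, and the final feature vector would feed a classifier over the three labels.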

Requirements

The code is written in Python 2.7; the main incompatibility with Python 3 is currently the module structure. It runs on (at least) TensorFlow versions 1.2 to 1.5.

Usage

Training

Run train.py -h to see an explanation of its usage. Many hyperparameters can be customized. As a reference point, the MLP model gives good results on SNLI with 200 units, a dropout keep probability of 0.8 (i.e., 0.2 dropout), no L2 loss, a batch size of 32, an initial learning rate of 0.05, and the Adagrad optimizer.
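
To make the optimizer setting concrete, here is a minimal sketch of a single Adagrad update with the learning rate of 0.05 mentioned above. It is written in plain Python for illustration only; the repository relies on TensorFlow's optimizer, and the function name and the epsilon term are assumptions of this sketch.

```python
import math


def adagrad_step(weights, grads, accum, learning_rate=0.05, epsilon=1e-8):
    """Apply one Adagrad update in place and return the weights.

    accum holds the running sum of squared gradients per parameter, so
    parameters with large past gradients receive smaller steps.
    """
    for i, g in enumerate(grads):
        accum[i] += g * g
        weights[i] -= learning_rate * g / (math.sqrt(accum[i]) + epsilon)
    return weights


weights = [1.0, -0.5]
accum = [0.0, 0.0]
adagrad_step(weights, [2.0, 0.0], accum)
```

Because the accumulator only grows, the effective step size shrinks over training, which is why 0.05 is an initial learning rate.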

The train and validation data should be in the JSONL format used in the SNLI corpus. The embeddings can be given in two different ways:

A text file where each line has a word followed by its vector, with values separated by spaces or tabs.

(faster!) A NumPy file with the saved embedding matrix and an extra text file with the vocabulary, such that its i-th line corresponds to the i-th row in the matrix.
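
As an illustration of the first format, the sketch below parses such lines into a vocabulary and a matrix whose i-th row belongs to the i-th word, which is the same correspondence the second format stores in two separate files. The function names are hypothetical and this is not the loader used by the scripts.

```python
def read_text_embeddings(lines):
    """Parse lines of the form 'word v1 v2 ... vn'.

    Returns the vocabulary (list of words) and the matrix (list of
    vectors), where matrix[i] is the vector of vocabulary[i].
    """
    vocabulary, matrix = [], []
    for line in lines:
        fields = line.split()  # splits on spaces and tabs alike
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        vocabulary.append(fields[0])
        matrix.append([float(x) for x in fields[1:]])
    return vocabulary, matrix


def word_index(vocabulary):
    """Map each word to its row number in the embedding matrix."""
    return {word: i for i, word in enumerate(vocabulary)}


sample = ["the 0.1 0.2", "man\t0.3\t0.4", ""]
vocabulary, matrix = read_text_embeddings(sample)
index = word_index(vocabulary)
```

Parsing every value as text is what makes this format slower to load than a saved matrix for large embedding sets.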

The code runs transparently on either GPU or CPU; this depends only on the TensorFlow installation.
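
For reference, the SNLI JSONL format mentioned above stores one JSON object per line, with the fields sentence1, sentence2 and gold_label among others. The sketch below reads such lines into (premise, hypothesis, label) tuples. The function name is hypothetical, and skipping pairs labeled "-" (no annotator consensus) is an assumption about sensible preprocessing, not a description of what train.py does.

```python
import json


def read_snli_pairs(lines):
    """Return (sentence1, sentence2, label) tuples from SNLI JSONL lines."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        item = json.loads(line)
        label = item["gold_label"]
        if label == "-":
            continue  # assumption: drop pairs without a consensus label
        pairs.append((item["sentence1"], item["sentence2"], label))
    return pairs


sample_lines = [
    '{"sentence1": "The man is eating spaghetti with sauce.", '
    '"sentence2": "The man is having a meal.", "gold_label": "entailment"}',
    '{"sentence1": "A dog runs.", "sentence2": "A cat sleeps.", '
    '"gold_label": "-"}',
]
pairs = read_snli_pairs(sample_lines)
```

With a real corpus, the lines argument can simply be an open file object.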

Running a trained model

To run a trained model interactively from the command line, use `interactive-eval.py`:

$ python src/interactive-eval.py saved-model/ glove-42B.npy --vocab glove-42B-vocabulary.txt
Reading model
Type sentence 1: The man is eating spaghetti with sauce.
Type sentence 2: The man is having a meal.
Model answer: entailment

Type sentence 1: The man is eating spaghetti with sauce.
Type sentence 2: The man is running in the park.
Model answer: contradiction

Type sentence 1: The man is eating spaghetti with sauce.
Type sentence 2: The man is eating in a restaurant.
Model answer: neutral

The script can also show a heatmap of the alignments.

Evaluation

Use the script evaluate.py to obtain a model's loss and accuracy and, optionally, to list the misclassified pairs.
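
The accuracy and the list of misclassified pairs can be thought of as in the sketch below. It is a simplified illustration with a hypothetical function name, not the code of evaluate.py, and it assumes the predictions are already available as label strings.

```python
def accuracy_and_errors(pairs, gold_labels, predicted_labels):
    """Return the accuracy and the misclassified (pair, gold, predicted)."""
    errors = []
    correct = 0
    for pair, gold, predicted in zip(pairs, gold_labels, predicted_labels):
        if gold == predicted:
            correct += 1
        else:
            errors.append((pair, gold, predicted))
    # float() keeps the division exact under Python 2 as well.
    accuracy = correct / float(len(gold_labels)) if gold_labels else 0.0
    return accuracy, errors


eval_pairs = [
    ("The man is eating spaghetti with sauce.", "The man is having a meal."),
    ("The man is eating spaghetti with sauce.", "The man is running in the park."),
    ("The man is eating spaghetti with sauce.", "The man is eating in a restaurant."),
]
gold = ["entailment", "contradiction", "neutral"]
predicted = ["entailment", "contradiction", "entailment"]
accuracy, errors = accuracy_and_errors(eval_pairs, gold, predicted)
```

Each entry in errors keeps the sentence pair together with the gold and predicted labels, which is the information needed to inspect a mistake.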
