# A Structured Self-Attentive Sentence Embedding

Re-implementation of [*A Structured Self-Attentive Sentence Embedding*](https://arxiv.org/abs/1703.03130) by Lin et al., 2017.
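
At the heart of the model is an attention matrix *A* = softmax(*W*<sub>s2</sub> tanh(*W*<sub>s1</sub> *H*<sup>T</sup>)) computed over the BiLSTM hidden states *H*, together with a Frobenius-norm penalty that pushes the attention hops apart. Below is a minimal PyTorch sketch of that mechanism; the default dimensions follow the hyperparameters reported in the paper, while the class and variable names are illustrative and not taken from this repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuredSelfAttention(nn.Module):
    """A = softmax(W_s2 * tanh(W_s1 * H^T)), computed over the token axis."""

    def __init__(self, hidden_dim=600, att_dim=350, att_hops=30):
        super().__init__()
        self.ws1 = nn.Linear(hidden_dim, att_dim, bias=False)  # W_s1
        self.ws2 = nn.Linear(att_dim, att_hops, bias=False)    # W_s2

    def forward(self, H):
        # H: (batch, seq_len, hidden_dim) BiLSTM hidden states.
        A = F.softmax(self.ws2(torch.tanh(self.ws1(H))), dim=1)  # softmax over seq_len
        A = A.transpose(1, 2)  # (batch, att_hops, seq_len)
        M = A @ H              # (batch, att_hops, hidden_dim) sentence embedding
        return M, A

def penalization(A):
    """P = ||A A^T - I||_F^2, encourages the attention hops to differ."""
    identity = torch.eye(A.size(1), device=A.device)
    return ((A @ A.transpose(1, 2)) - identity).norm(dim=(1, 2)).pow(2).mean()
```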

## Results

| Set        | Loss  | Accuracy |
|------------|-------|----------|
| Training   | 1.136 | 77.26%   |
| Validation | 1.587 | 60.91%   |

The results above were obtained after training for 5 epochs. The training set contained 20,000 examples and the validation set 1,000 examples. The model with the best validation loss was chosen. Note that the training set used in the paper is much larger.
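
Selecting the model with the best validation loss amounts to tracking a running minimum across epochs. A generic sketch of the idea (the helper functions and file name are hypothetical, not the actual `train.py` logic):

```python
import torch

best_val_loss = float("inf")
for epoch in range(5):
    train_one_epoch(model, train_loader)    # hypothetical training helper
    val_loss = evaluate(model, val_loader)  # hypothetical validation helper
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), "best_model.pt")  # keep the best checkpoint
```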

## Data

The Yelp dataset can be downloaded here. After downloading, the file only needs to be unzipped.
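
For reference, the Yelp reviews come as one JSON object per line, each carrying the review text and a 1–5 star rating. A minimal loader might look like the following; the field names match the public Yelp dataset dump, while the function name and `limit` parameter are illustrative, not code from this repository.

```python
import json

def load_reviews(path, limit=None):
    """Yield (text, stars) pairs from a JSON-lines Yelp review file."""
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break
            review = json.loads(line)
            yield review["text"], int(review["stars"])
```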

## Training

You can run the training procedure with the default settings using the following command:

```sh
python3 train.py --data-dir <dir of unzipped yelp data>
```

For more information about the training settings, run:

```sh
python3 train.py --help
```

## Analysis & Visualization

Once the model is trained, the attention patterns can be visualized as done in the paper. The following Python script creates an HTML file with the reviews and their respective attention patterns, as well as the confusion matrix for the classification:

```sh
python3 viz.py --html --cm --data-dir <dir of unzipped yelp data> --validation-set <path to saved validation split>
```
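
The paper renders attention by shading each token in proportion to its weight, which is straightforward to reproduce with inline CSS. A sketch of the idea, assuming per-token weights already summed over the attention hops and normalized to [0, 1] (the function name is illustrative):

```python
import html

def attention_to_html(tokens, weights):
    """Shade each token with an opacity proportional to its attention weight."""
    spans = (
        '<span style="background-color: rgba(255, 0, 0, {:.2f})">{}</span>'.format(
            w, html.escape(token)
        )
        for token, w in zip(tokens, weights)
    )
    return " ".join(spans)

# Example: attention_to_html(["great", "food", "!"], [0.7, 0.25, 0.05])
```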

### Attention Pattern

### Confusion Matrix

## Differences from the paper

- Adam instead of SGD
- No gradient clipping
- No dropout
- No GloVe word embedding initialization
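
For comparison, the paper's training setup corresponds roughly to the snippet below. All values and names here are illustrative stand-ins, not verified hyperparameters; check the paper for the exact settings.

```python
import torch
import torch.nn as nn

# Stand-in model so the snippet is self-contained; the real one is the BiLSTM + attention.
model = nn.LSTM(input_size=300, hidden_size=300, bidirectional=True, batch_first=True)

optimizer = torch.optim.SGD(model.parameters(), lr=0.06)  # SGD rather than Adam
dropout = nn.Dropout(p=0.5)                               # dropout, omitted here

# GloVe initialization: copy pretrained vectors into the embedding layer.
vocab_size, emb_dim = 10_000, 300
embedding = nn.Embedding(vocab_size, emb_dim)
pretrained = torch.randn(vocab_size, emb_dim)  # placeholder for loaded GloVe vectors
embedding.weight.data.copy_(pretrained)

# Gradient clipping, applied between loss.backward() and optimizer.step().
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.5)
```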

## Requirements

- Implemented and tested with Python 3.6.5
- Python library versions can be found in `requirements.txt`