Structured Self-Attentive Sentence Embedding

This is an implementation of the paper Structured Self-Attentive Sentence Embedding (https://arxiv.org/pdf/1703.03130.pdf), published in ICLR 2017. The paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of a single vector, the embedding is a 2-D matrix, with each row of the matrix attending to a different part of the sentence. The paper also proposes a self-attention mechanism and a penalization term that encourages the rows to focus on different parts of the sentence.
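For reference, the core computation the paper describes is A = softmax(W_s2 tanh(W_s1 Hᵀ)) followed by M = A H, where H holds the bidirectional LSTM hidden states. Below is a minimal NumPy sketch of that computation (illustrative only, not the repository's code; all variable names and shapes follow the paper's notation):

```python
# Minimal NumPy sketch of the paper's self-attention (not the repository's actual code).
# n tokens, 2u hidden size, da attention size, r attention hops.
import numpy as np

n, u, da, r = 200, 64, 32, 16            # max_seq_len, LSTM units, da, r from this README

H = np.random.randn(n, 2 * u)            # hidden states of the bidirectional LSTM
W_s1 = np.random.randn(da, 2 * u)        # first attention projection
W_s2 = np.random.randn(r, da)            # second projection, one row per attention hop

scores = W_s2 @ np.tanh(W_s1 @ H.T)      # (r, n) unnormalized attention scores
A = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row-wise softmax
M = A @ H                                # (r, 2u) sentence embedding matrix
```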


The implementation is done on the IMDB dataset with the following parameters (a sketch of the corresponding data setup follows the parameter descriptions):

top_words = 10000
learning_rate = 0.001
max_seq_len = 200
emb_dim = 300
batch_size = 500
u = 64
da = 32
r = 16

top_words: only consider the top 10,000 most common words in the dataset
u: number of hidden units in each unidirectional LSTM (the bidirectional LSTM output size is 2u)
da: hidden dimension of the attention MLP; a hyperparameter that can be set arbitrarily
r: number of different parts of the sentence to attend to (rows of the embedding matrix)
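Loading the IMDB data with these settings might look like the sketch below. This is a guess at the setup, assuming a Keras-style pipeline; the actual self-attention.py script may load and pad the data differently.

```python
# Hypothetical data preparation for the IMDB experiment (assumption, not the repository's code).
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

top_words = 10000     # keep only the 10,000 most frequent words
max_seq_len = 200     # truncate / pad every review to 200 tokens

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=top_words)
x_train = pad_sequences(x_train, maxlen=max_seq_len)
x_test = pad_sequences(x_test, maxlen=max_seq_len)
```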

To Run:

python self-attention.py

Running this for 4 epochs gives a training accuracy of 94% and test accuracy of 87%.

To Do:

Penalization term (a sketch of the paper's formulation is given below)
Results on other datasets
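For reference, the penalization term in the paper is P = ||A Aᵀ − I||²_F, which pushes the r attention hops to attend to different parts of the sentence; it is added to the classification loss with a small coefficient. A minimal NumPy sketch (illustrative only; not yet part of this repository):

```python
# Frobenius-norm penalization from the paper: P = ||A A^T - I||_F^2 (not yet implemented here).
import numpy as np

def attention_penalty(A):
    # A is the (r, n) attention matrix produced for one sentence.
    r = A.shape[0]
    AAT = A @ A.T                            # (r, r) overlap between attention hops
    return np.sum((AAT - np.eye(r)) ** 2)    # squared Frobenius norm
```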
