Structured Self-Attentive Sentence Embedding

This is an implementation of the paper Structured Self-Attentive Sentence Embedding (https://arxiv.org/pdf/1703.03130.pdf), published in ICLR 2017. The paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of a single vector, the embedding is a 2-D matrix, with each row of the matrix attending to a different part of the sentence. The paper also proposes a self-attention mechanism and a penalization term that encourages the rows to focus on different parts of the sentence.
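For reference, the core computation the paper describes is A = softmax(W_s2 tanh(W_s1 Hᵀ)) followed by M = A H, where H holds the bidirectional LSTM hidden states. Below is a minimal NumPy sketch of that computation (illustrative only, not the repository's code; all variable names and shapes follow the paper's notation):

```python
# Minimal NumPy sketch of the paper's self-attention (not the repository's actual code).
# n tokens, 2u hidden size, da attention size, r attention hops.
import numpy as np

n, u, da, r = 200, 64, 32, 16            # max_seq_len, LSTM units, da, r from this README

H = np.random.randn(n, 2 * u)            # hidden states of the bidirectional LSTM
W_s1 = np.random.randn(da, 2 * u)        # first attention projection
W_s2 = np.random.randn(r, da)            # second projection, one row per attention hop

scores = W_s2 @ np.tanh(W_s1 @ H.T)      # (r, n) unnormalized attention scores
A = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row-wise softmax
M = A @ H                                # (r, 2u) sentence embedding matrix
```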


The implementation is done on the IMDB dataset with the following parameters (a sketch of the corresponding data setup follows the parameter descriptions):

top_words = 10000
learning_rate = 0.001
max_seq_len = 200
emb_dim = 300
batch_size = 500
u = 64
da = 32
r = 16

top_words: only consider the top 10,000 most common words in the dataset
u: number of hidden units in each unidirectional LSTM (the bidirectional LSTM output size is 2u)
da: hidden dimension of the attention MLP; a hyperparameter that can be set arbitrarily
r: number of different parts of the sentence to attend to (rows of the embedding matrix)
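Loading the IMDB data with these settings might look like the sketch below. This is a guess at the setup, assuming a Keras-style pipeline; the actual self-attention.py script may load and pad the data differently.

```python
# Hypothetical data preparation for the IMDB experiment (assumption, not the repository's code).
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

top_words = 10000     # keep only the 10,000 most frequent words
max_seq_len = 200     # truncate / pad every review to 200 tokens

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=top_words)
x_train = pad_sequences(x_train, maxlen=max_seq_len)
x_test = pad_sequences(x_test, maxlen=max_seq_len)
```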

To Run:

python self-attention.py

Running this for 4 epochs gives a training accuracy of 94% and test accuracy of 87%.

To Do:

Penalization term (a sketch of the paper's formulation is given below)
Results on other datasets
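For reference, the penalization term in the paper is P = ||A Aᵀ − I||²_F, which pushes the r attention hops to attend to different parts of the sentence; it is added to the classification loss with a small coefficient. A minimal NumPy sketch (illustrative only; not yet part of this repository):

```python
# Frobenius-norm penalization from the paper: P = ||A A^T - I||_F^2 (not yet implemented here).
import numpy as np

def attention_penalty(A):
    # A is the (r, n) attention matrix produced for one sentence.
    r = A.shape[0]
    AAT = A @ A.T                            # (r, r) overlap between attention hops
    return np.sum((AAT - np.eye(r)) ** 2)    # squared Frobenius norm
```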
