This is a PyTorch implementation of "A Structured Self-Attentive Sentence Embedding" (Lin et al., 2017), applied to the PAN 2015 and 2016 author profiling tasks. The implementation handles gender and age group classification; the data can be obtained from the links above.
The word embedding layer is initialized with 100-dimensional GloVe word embeddings.
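A minimal sketch of such an initialization in PyTorch, assuming a hypothetical GloVe file path and a toy vocabulary (neither taken from this repository):

```python
# Sketch: build an embedding matrix from a GloVe text file.
# The file name, vocabulary, and helper are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

def load_glove(path, vocab, dim=100):
    """Fill an embedding matrix for `vocab` from a GloVe text file."""
    # Words missing from GloVe keep a small random initialization.
    weights = np.random.uniform(-0.25, 0.25, (len(vocab), dim)).astype(np.float32)
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if parts[0] in vocab:
                weights[vocab[parts[0]]] = np.asarray(parts[1:], dtype=np.float32)
    return torch.from_numpy(weights)

vocab = {"<pad>": 0, "the": 1, "she": 2}           # toy vocabulary
weights = load_glove("glove.6B.100d.txt", vocab)   # hypothetical path
embedding = nn.Embedding.from_pretrained(weights, freeze=False, padding_idx=0)
```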
The program can be executed with:
python main.py --input ./data --expt self-attn-gender --attr gender
Parameters:
--input - Path to the input data directory
--results - Directory to store models and results
--expt - Experiment name
--wordemb - Word embeddings (100-dim GloVe embeddings)
--batchsz - Batch size
--nepoch - Number of epochs
--embedsz - Word embedding size
--hiddensz - Hidden layer size
--nlayers - Number of hidden layers
--attnsz - Number of attention units (d_a)
--attnhops - Number of attention hops (r); see the attention-layer sketch after this list
--fcsize - Fully connected layer size
--attr - Attribute to profile (gender or age group)
--lr - Learning rate
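The following argparse sketch mirrors the flags listed above; defaults and types beyond the obvious ones are omitted, since the repository's actual defaults are not shown here:

```python
# Sketch of an argument parser matching the flags above (assumed, not the repo's code).
import argparse

parser = argparse.ArgumentParser(description="Structured self-attentive author profiling")
parser.add_argument("--input", help="path to the input data directory")
parser.add_argument("--results", help="directory to store models and results")
parser.add_argument("--expt", help="experiment name")
parser.add_argument("--wordemb", help="100-dim GloVe word embeddings")
parser.add_argument("--batchsz", type=int, help="batch size")
parser.add_argument("--nepoch", type=int, help="number of epochs")
parser.add_argument("--embedsz", type=int, help="word embedding size")
parser.add_argument("--hiddensz", type=int, help="hidden layer size")
parser.add_argument("--nlayers", type=int, help="number of hidden layers")
parser.add_argument("--attnsz", type=int, help="number of attention units (d_a)")
parser.add_argument("--attnhops", type=int, help="number of attention hops (r)")
parser.add_argument("--fcsize", type=int, help="fully connected layer size")
parser.add_argument("--attr", help="attribute to profile (gender or age group)")
parser.add_argument("--lr", type=float, help="learning rate")
args = parser.parse_args()
```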
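Concretely, `--attnsz` corresponds to d_a and `--attnhops` to r in Lin et al. (2017), where the annotation matrix is A = softmax(W_s2 tanh(W_s1 H^T)) and the sentence embedding is M = AH. A minimal PyTorch sketch of that layer, with assumed module and variable names (not necessarily this repository's):

```python
# Sketch of the structured self-attention layer from Lin et al. (2017).
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuredSelfAttention(nn.Module):
    def __init__(self, hidden_size, d_a, r):
        super().__init__()
        self.ws1 = nn.Linear(hidden_size, d_a, bias=False)  # W_s1: d_a x 2u
        self.ws2 = nn.Linear(d_a, r, bias=False)            # W_s2: r x d_a

    def forward(self, H):
        # H: (batch, seq_len, hidden_size) BiLSTM outputs.
        # Softmax over seq_len so each hop is a distribution over positions.
        A = F.softmax(self.ws2(torch.tanh(self.ws1(H))), dim=1)  # (batch, seq_len, r)
        A = A.transpose(1, 2)             # (batch, r, seq_len)
        M = torch.bmm(A, H)               # (batch, r, hidden_size) sentence embedding
        return M, A
```

Each of the r hops produces one weighted sum of the hidden states, so M has shape (batch, r, hidden_size); it is typically flattened before the fully connected layer sized by --fcsize.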
The attention layer identifies features (tokens) that are salient for different social groups.
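One way such salient tokens could be read off the annotation matrix A is to rank positions by their attention weight within a hop; the helper and dummy data below are illustrative assumptions, not this repository's analysis code:

```python
# Sketch: rank tokens by attention weight in one hop of A (batch, r, seq_len).
import torch

def salient_tokens(tokens, A, hop=0, k=3):
    """Return the k tokens with the highest weight in the given attention hop."""
    weights = A[0, hop, :len(tokens)]               # one sentence, one hop
    top = weights.topk(min(k, len(tokens))).indices.tolist()
    return [tokens[i] for i in top]

tokens = ["i", "love", "shopping", "with", "friends"]
A = torch.rand(1, 4, len(tokens)).softmax(dim=-1)   # dummy attention for the demo
print(salient_tokens(tokens, A))
```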
Reference:
- Lin, Z., Feng, M., dos Santos, C. N., Yu, M., Xiang, B., Zhou, B., & Bengio, Y. (2017). A structured self-attentive sentence embedding. arXiv preprint arXiv:1703.03130.