Hierarchical Attention Model for Question Answering

An implementation of the Hierarchical Attention Model and the baselines described in the paper Hierarchical Attention Model for Improved Comprehension of Spoken Content by Wei Fang, Juei-Yang Hsu, Hung-Yi Lee, and Lin-Shan Lee.

Requirements

The Torch/Lua dependencies can be installed using luarocks. For example:

luarocks install nngraph
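
Beyond nngraph, Torch projects of this kind usually need a few standard packages as well. The exact dependency list for this repository is an assumption here, but a typical install might look like:

luarocks install nn
luarocks install optim
luarocks install sys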

Usage

First run the following script:

sh preprocess.sh

This downloads the data and libraries needed for preprocessing and training.

The preprocessing script generates dependency parses of the TOEFL dataset using the Stanford Neural Network Dependency Parser.

Alternatively, the download and preprocessing scripts can be called individually.
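
As a rough sketch of that alternative, the individual steps might look like the following; the script names here are hypothetical placeholders, and the actual file names in the repository may differ:

sh download.sh            # hypothetical name: fetch the data and libraries
sh preprocess_toefl.sh    # hypothetical name: run the Stanford parser over the transcriptions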

TOEFL Listening Comprehension Test

TOEFL is an English examination that tests the academic English knowledge and skills of non-native English learners. Each example consists of an audio story, a question, and four answer choices, of which one or two are correct. Given the manual or ASR transcription of an audio story and a question, the machine has to select the correct answer(s) from the four choices.

To train models for the TOEFL Listening Comprehension Test, run:

th toefl/main.lua --model <ham|lstm|bilstm|treelstm|memn2n> --task <manual|ASR> --level <phrase|sentence> --dim <sentence_representation_dim> --internal <memn2n_dim> --hops <memn2n_hops> --layers <num_layers> --epochs <num_epochs> --prune <pruning_rate>

where:

  • model: the model to train (default: ham, i.e. the Hierarchical Attention Model)
  • task: the transcription type to train on (default: manual)
  • level: the attention level of the HAM (default: phrase; ignored for other models)
  • dim: the dimension of the sentence/phrase representations (default: 75)
  • internal: the dimension of the memory module in HAM or MemN2N (default: 75; ignored for other models)
  • hops: the number of hops for HAM or MemN2N (default: 1; ignored for other models)
  • layers: the number of layers for LSTM or BiLSTM (default: 1; ignored for other models)
  • epochs: the number of training epochs (default: 10)
  • prune: the preprocessing pruning rate (default: 1, i.e. no pruning)
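
For example, to train a sentence-level HAM on the ASR transcriptions (the flag values below are illustrative; any flag that is omitted falls back to the defaults listed above):

th toefl/main.lua --model ham --task ASR --level sentence --dim 75 --internal 75 --hops 2 --epochs 10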

Trained model parameters are saved to the trained_models directory.
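
The saved parameters can later be reloaded in Torch for evaluation or inspection. The file name below is a hypothetical example, since the naming scheme used by main.lua is not documented here:

-- minimal sketch: reload saved parameters with Torch's serialization
require 'torch'
local model = torch.load('trained_models/ham.t7')  -- hypothetical file name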