Word-level-language-modeling

Benchmarking tricks to improve a base RNN word-level language model.

We implement the following tricks:

early stopping
learning rate decay: annealing the learning rate when validation perplexity starts increasing
dropout: at the input layer, plus variational dropout (https://papers.nips.cc/paper/6241-a-theoretically-grounded-application-of-dropout-in-recurrent-neural-networks.pdf)
weight tying: sharing the encoder weights (= the word embedding matrix) with the decoder weights (= the softmax output matrix), as in https://arxiv.org/pdf/1611.01462.pdf
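
As a rough illustration of how early stopping and learning rate annealing interact, the sketch below replays a list of per-epoch validation losses. It is not the code in main.py: the function name `anneal_and_early_stop`, the starting learning rate, the decay factor, and the patience value are all hypothetical choices made for the example.

```python
import math


def anneal_and_early_stop(val_losses, lr=20.0, decay=4.0, patience=3):
    """Replay per-epoch validation losses (mean cross-entropy, in nats).

    Hypothetical helper: lr, decay and patience are illustrative defaults,
    not values taken from this repository.
    Returns (best_epoch, best_perplexity, final_lr, stopped_early).
    """
    best_ppl = math.inf
    best_epoch = -1
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        ppl = math.exp(loss)  # perplexity = exp(mean cross-entropy)
        if ppl < best_ppl:
            # Improvement: remember it and reset the patience counter.
            # A real training loop would also save a checkpoint here.
            best_ppl, best_epoch, bad_epochs = ppl, epoch, 0
        else:
            # Validation perplexity did not improve: anneal the learning rate.
            bad_epochs += 1
            lr /= decay
            if bad_epochs >= patience:
                # Early stopping: too many epochs in a row without improvement.
                return best_epoch, best_ppl, lr, True
    return best_epoch, best_ppl, lr, False


losses = [5.0, 4.6, 4.5, 4.55, 4.52, 4.51]
best_epoch, best_ppl, final_lr, stopped = anneal_and_early_stop(losses)
print(best_epoch, round(best_ppl, 2), final_lr, stopped)
```

In this run the best epoch is the third one (index 2); each of the three following epochs divides the learning rate by the decay factor, and the third failure triggers the stop.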

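Variational dropout and weight tying can be sketched in a few lines. The example below uses NumPy and a plain tanh RNN so that it does not depend on any particular deep learning framework; it is an assumption-laden illustration, not the implementation in models.py or nets.py. The names `variational_dropout_mask` and `tied_rnn_logits` are hypothetical. Note that tying the embedding and softmax matrices requires the embedding size to equal the hidden size.

```python
import numpy as np


def variational_dropout_mask(batch_size, hidden_size, rate, rng):
    """Sample ONE inverted-dropout mask per sequence.

    Variational dropout reuses this mask at every time step instead of
    resampling it, which is the key difference from standard dropout.
    """
    keep = 1.0 - rate
    return rng.binomial(1, keep, size=(batch_size, hidden_size)) / keep


def tied_rnn_logits(tokens, embedding, w_hh, rate, rng):
    """Forward pass of a toy tanh RNN with tied input/output embeddings.

    tokens:    (seq_len, batch) integer word ids
    embedding: (vocab, hidden) matrix, used both as encoder and decoder
    w_hh:      (hidden, hidden) recurrent weights
    Returns logits of shape (seq_len, batch, vocab).
    """
    seq_len, batch_size = tokens.shape
    hidden_size = embedding.shape[1]
    # Masks are drawn once, before the time loop.
    in_mask = variational_dropout_mask(batch_size, hidden_size, rate, rng)
    rec_mask = variational_dropout_mask(batch_size, hidden_size, rate, rng)
    h = np.zeros((batch_size, hidden_size))
    logits = []
    for t in range(seq_len):
        x = embedding[tokens[t]] * in_mask      # encoder lookup, same mask each step
        h = np.tanh(x + (h * rec_mask) @ w_hh)  # recurrent update, same mask each step
        logits.append(h @ embedding.T)          # decoder reuses the embedding matrix
    return np.stack(logits)


rng = np.random.default_rng(0)
vocab, hidden = 10, 8
embedding = rng.normal(size=(vocab, hidden))
w_hh = 0.1 * rng.normal(size=(hidden, hidden))
tokens = rng.integers(0, vocab, size=(5, 4))
print(tied_rnn_logits(tokens, embedding, w_hh, rate=0.5, rng=rng).shape)
```

Because the decoder is `embedding.T`, the model has a single vocabulary-sized parameter matrix instead of two.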
