attentionMNISTseq2seq

Use encoder-decoder with Bahdanau attention to predict MNIST digits sequence

Software environment:

  • Ubuntu 16.04 x64
  • Python 3.5
  • TensorFlow 1.3

Structure:

  • main.py The main file of the encoder-decoder model with Bahdanau attention (Figure 1)
  • mnist.py The MNIST digits sequence generator, which produces the digit-sequence datasets (Figure 2)
  • tasas_cer.sh A shell tool that calculates the character error rate (CER), based on https://github.com/mauvilsa/htrsh
  • pytasas.py A Python wrapper that reads the CER computed by the shell script above (a minimal CER sketch follows this list)
  • drawCER.py Plots the CER for both the training and test data
  • drawLoss.py Plots the loss for both the training and test data
  • clear.sh Cleans the pred_logs directory
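
The CER reported by these tools is simply the Levenshtein (edit) distance between the predicted and reference label sequences, normalized by the reference length. Below is a minimal, self-contained sketch of the metric in Python; it only illustrates the definition and is not the actual logic of tasas_cer.sh.

```python
def levenshtein(ref, hyp):
    # Classic one-row dynamic-programming edit distance
    # (substitutions, insertions, deletions).
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (r != h))  # substitution
    return d[len(hyp)]

def cer(ref, hyp):
    # Character error rate: edit distance over reference length.
    return levenshtein(ref, hyp) / max(len(ref), 1)

print(cer("70429", "70129"))  # one substitution in five characters -> 0.2
```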

Figure 1. Encoder-decoder model with Bahdanau attention
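
As a rough illustration of the additive scoring step the figure depicts: at each decoder timestep, every encoder state is scored against the previous decoder state, the scores are softmax-normalized, and the weighted sum of encoder states becomes the context vector. The sketch below uses NumPy with hypothetical shapes and weight names (W_h, W_s, v); main.py holds the actual implementation.

```python
import numpy as np

# Bahdanau (additive) attention for one decoder step.
T, enc_dim, dec_dim, att_dim = 10, 64, 64, 32
rng = np.random.default_rng(0)

H = rng.standard_normal((T, enc_dim))      # encoder states h_1..h_T
s = rng.standard_normal(dec_dim)           # previous decoder state s_{t-1}
W_h = rng.standard_normal((enc_dim, att_dim))
W_s = rng.standard_normal((dec_dim, att_dim))
v = rng.standard_normal(att_dim)

scores = np.tanh(H @ W_h + s @ W_s) @ v    # e_i = v^T tanh(W_h h_i + W_s s)
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                       # attention weights, sum to 1
context = alpha @ H                        # context vector fed to the decoder
```

In TensorFlow 1.3 the same mechanism is available as tf.contrib.seq2seq.BahdanauAttention, combined with a decoder cell via tf.contrib.seq2seq.AttentionWrapper.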

Figure 2. Generated digits sequence based on MNIST
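
One plausible way to build such sequence data, sketched here with hypothetical names (mnist.py is the generator actually used), is to draw random MNIST digits and concatenate them horizontally, keeping their labels as the target sequence:

```python
import numpy as np

def make_sequence(images, labels, seq_len, rng=np.random.default_rng()):
    # Draw seq_len random MNIST digits and concatenate them horizontally
    # into one (28, 28*seq_len) image; the target is the label sequence.
    # Illustrative only -- see mnist.py for the real generator.
    idx = rng.integers(0, len(images), size=seq_len)
    seq_image = np.concatenate([images[i].reshape(28, 28) for i in idx], axis=1)
    seq_label = [int(labels[i]) for i in idx]
    return seq_image, seq_label
```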

Results:

I have tested the model with and without Bahdanau attention; the results are shown below. The attention mechanism clearly improves the prediction, but there is one more point worth noting. The difference between Figure 4 and Figure 5 is whether the decoder input at each timestep is the true label or the label the model itself predicted at the previous timestep. The figures show that the latter performs better. This trick is in fact a simplified form of Scheduled Sampling [1] (sketched below), and I expect the full technique would improve the results further.
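
For reference, the core of Scheduled Sampling is a per-timestep coin flip between the two kinds of decoder input. A minimal sketch, with hypothetical names:

```python
import random

def choose_decoder_input(true_label, predicted_label, sampling_prob):
    # Scheduled Sampling [1]: with probability sampling_prob feed the model's
    # own previous prediction, otherwise feed the ground-truth label.
    # sampling_prob is typically annealed from 0 toward 1 during training,
    # so the decoder gradually learns to consume its own outputs.
    if random.random() < sampling_prob:
        return predicted_label
    return true_label
```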

Figure 3. CER and LOSS of basic encoder-decoder model

Figure 4. CER and LOSS of encoder-decoder model with Bahdanau attention

Figure 5. CER and LOSS of encoder-decoder model with Bahdanau attention using self-predicted value as decoder input

References:

[1] Bengio, Samy, et al. "Scheduled sampling for sequence prediction with recurrent neural networks." Advances in Neural Information Processing Systems. 2015.
