
mshadloo/Neural-Machine-Translation-with-Attention


I implement encoder-decoder based seq2seq models with attention using Keras. The encoder can be a Bidirectional LSTM, a simple LSTM, or a GRU, and the decoder can be an LSTM or a GRU. The RNN used in the encoder is controlled by an encoder-type argument, which can be 'bidirectional', 'lstm', or 'gru'. When it is set to 'bidirectional', the model uses a Bidirectional LSTM as the encoder and a simple LSTM as the decoder; when it is set to 'lstm', the encoder and decoder are both simple LSTMs; and when it is set to 'gru', they are both GRUs. This gives three different models.
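
For illustration, the encoder-type selection might be wired up roughly as in the sketch below; the function name build_rnns, the argument name encoder_type, and the unit sizes are my own assumptions, not necessarily the repository's actual interface.

from tensorflow.keras.layers import LSTM, GRU, Bidirectional

def build_rnns(encoder_type, units=64):
    """Return the (encoder, decoder) RNN layers for the chosen encoder type."""
    if encoder_type == 'bidirectional':
        # Bidirectional LSTM encoder with a simple LSTM decoder.
        return Bidirectional(LSTM(units, return_sequences=True)), LSTM(units, return_state=True)
    if encoder_type == 'lstm':
        # Simple LSTM encoder and decoder.
        return LSTM(units, return_sequences=True), LSTM(units, return_state=True)
    if encoder_type == 'gru':
        # GRU encoder and decoder.
        return GRU(units, return_sequences=True), GRU(units, return_state=True)
    raise ValueError(f"Unknown encoder type: {encoder_type}")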

To translate a sentence from one language to another, a human translator reads the sentence part by part and generates the translation part by part. A neural machine translation model with attention behaves like a human translator: to generate each part of the translation, the attention mechanism tells the model which parts of the input sentence to focus on. A simple encoder-decoder model without attention tends to forget the earlier parts of a sequence as it processes later parts; with attention, the model can handle long sequences.
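
As a concrete illustration, one step of additive (Bahdanau-style) attention can be written in Keras as below; the layer sizes, the name one_step_attention, and the value of Tx are illustrative choices of mine, not necessarily what this repository uses.

from tensorflow.keras.layers import Dense, RepeatVector, Concatenate, Softmax, Dot

Tx = 20  # illustrative maximum length of the input sequence

# These layers are created once and shared across all output time steps.
repeator = RepeatVector(Tx)              # copy the previous decoder state to every input position
concatenator = Concatenate(axis=-1)      # pair it with each encoder hidden state
densor1 = Dense(10, activation='tanh')   # small scoring network
densor2 = Dense(1, activation='relu')
normalizer = Softmax(axis=1)             # turn scores into weights over the Tx positions
dotor = Dot(axes=1)                      # weighted sum of the encoder states

def one_step_attention(a, s_prev):
    """a: encoder hidden states, shape (m, Tx, 2*n_a); s_prev: previous decoder state, shape (m, n_s).
    Returns the context vector the decoder attends to at this output step."""
    s_prev = repeator(s_prev)
    energies = densor2(densor1(concatenator([a, s_prev])))
    alphas = normalizer(energies)        # attention weights: where to "look" in the source sentence
    return dotor([alphas, a])            # context vector, shape (m, 1, 2*n_a)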

Dataset

To evaluate the models, I use the English-French sentence-pair dataset provided at http://www.manythings.org/anki/
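
The Anki download is a tab-separated file of sentence pairs. A minimal loading sketch is shown below; the filename fra.txt and the extra attribution column are assumptions about the current release, and data.sh may lay the data out differently.

def load_pairs(path='fra.txt'):
    """Load (english, french) sentence pairs from the tab-separated Anki file."""
    pairs = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.strip().split('\t')
            if len(parts) >= 2:          # newer releases append an attribution column
                pairs.append((parts[0], parts[1]))
    return pairs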

Experiment

I computed accuracy and loss on both the training and validation sets for all three models and compared the results. The experiments show that the model with a Bidirectional LSTM encoder outperforms the other two.

[Figures] NMT with a Bidirectional LSTM encoder has the lowest loss and the highest accuracy over 20 training epochs.

How to run:

git clone https://github.com/mshadloo/Neural-Machine-Translation-with-Attention.git
cd Neural-Machine-Translation-with-Attention
chmod +x data.sh && ./data.sh
chmod +x run.sh && ./run.sh

Steps

Data Preprocessing

First, as in any other NLP task, we load the text data, preprocess it, and split it into training and test sets.

The data needs some cleaning before it can be used to train the translation model (a code sketch of these steps follows the list):

  1. Normalizing case to lowercase.
  2. Removing punctuation from each word.
  3. Removing non-printable characters.
  4. Converting accented French characters to their Latin equivalents.
  5. Removing words that contain non-alphabetic characters.
  6. Adding a special token <eos> at the end of each target sentence.
  7. Creating two dictionaries, one mapping each word in the vocabulary to an id and one mapping each id back to its word.
  8. Marking all out-of-vocabulary (OOV) words with a special token <unk>.
  9. Padding each sentence to a maximum length by appending the special token <pad>.
  10. Converting each sentence to its feature vector of word ids.
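
A minimal sketch of these cleaning and encoding steps is shown below; the helper names and the exact special-token handling are illustrative, not taken from this repository's code.

import string
import unicodedata

def clean_sentence(sentence):
    """Steps 1-5: lowercase, strip punctuation and non-printable characters,
    fold accented French characters to Latin, keep only alphabetic words."""
    # Fold accented characters (e.g. 'é' -> 'e') into plain Latin characters.
    sentence = unicodedata.normalize('NFD', sentence).encode('ascii', 'ignore').decode('ascii')
    sentence = sentence.lower()
    sentence = sentence.translate(str.maketrans('', '', string.punctuation))
    sentence = ''.join(ch for ch in sentence if ch in string.printable)
    return ' '.join(w for w in sentence.split() if w.isalpha())

def encode_sentence(sentence, word_to_id, max_len, is_target=False):
    """Steps 6-10: append <eos> to target sentences, map words to ids (OOV words
    become <unk>), and pad with <pad> up to max_len."""
    tokens = sentence.split() + (['<eos>'] if is_target else [])
    ids = [word_to_id.get(w, word_to_id['<unk>']) for w in tokens]
    ids = ids[:max_len]
    return ids + [word_to_id['<pad>']] * (max_len - len(ids))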

Define The Model

The model is an encoder-decoder seq2seq architecture with attention: the encoder and decoder are the pre-attention and post-attention RNNs on either side of the attention mechanism. A sketch of how the pieces fit together follows the list below.

  • Encoder: an RNN (Bidirectional LSTM, LSTM, or GRU).
    • The encoder runs for Tx time steps, where Tx is the maximum length of the input sequence.
  • Decoder: an RNN (LSTM or GRU).
    • The decoder runs for Ty time steps, where Ty is the maximum length of the output sequence.
  • The attention mechanism computes a context vector context⟨t⟩ for each output time step t = 1, …, Ty.
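
Below is a minimal sketch of how these pieces could fit together for the bidirectional-encoder variant, reusing the one_step_attention sketch from the attention section above; the hyperparameters and the one-hot input encoding are illustrative assumptions, and the 'lstm' and 'gru' variants would simply swap the encoder and decoder RNNs.

from tensorflow.keras.layers import Input, Bidirectional, LSTM, Dense
from tensorflow.keras.models import Model

# Illustrative sizes; Tx and one_step_attention come from the attention sketch above.
Ty = 12                           # maximum length of the output sequence
src_vocab, tgt_vocab = 5000, 5000
n_a, n_s = 64, 128                # encoder and decoder hidden sizes

# Post-attention decoder layers, shared across the Ty output steps.
post_attention_lstm = LSTM(n_s, return_state=True)
output_layer = Dense(tgt_vocab, activation='softmax')

def build_model():
    X = Input(shape=(Tx, src_vocab))   # one-hot encoded source sentence
    s0 = Input(shape=(n_s,))           # initial decoder hidden state
    c0 = Input(shape=(n_s,))           # initial decoder cell state
    s, c = s0, c0

    # Pre-attention encoder: a Bidirectional LSTM over all Tx input steps.
    a = Bidirectional(LSTM(n_a, return_sequences=True))(X)

    outputs = []
    for t in range(Ty):
        # The attention mechanism computes context<t> from the encoder states
        # and the previous post-attention decoder state.
        context = one_step_attention(a, s)
        # The post-attention decoder takes one step per output word.
        s, _, c = post_attention_lstm(context, initial_state=[s, c])
        outputs.append(output_layer(s))

    return Model(inputs=[X, s0, c0], outputs=outputs)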
