vermasrijan/Neural_Machine_Translator_seq2seq

Neural_Machine_Translator_seq2seq

Neural Machine Translator (NMT) for translating English text to Hindi, built with the PyTorch framework using a seq2seq architecture with attention.
The Jupyter notebook in this repository is documented step by step and is meant to be read on its own.
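
For readers new to attention, the sketch below shows the core idea in plain Python: the decoder scores every encoder state, turns the scores into weights with a softmax, and uses the weighted sum as a context vector. This is only an illustration under assumptions. The function names are hypothetical, the dot-product score is one common choice, and the notebook may well use a different scoring function and PyTorch tensors instead of lists.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(decoder_state, encoder_states):
    """Return (weights, context) using dot-product scores.

    Assumes every encoder state has the same length as decoder_state.
    """
    # One score per source position: how well it matches the decoder state.
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(decoder_state)
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three source positions, hidden size 2.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = dot_attention(decoder_state, encoder_states)
```

Here the first and third source positions receive equal weight, and both outweigh the second, because they align better with the decoder state.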

Dependencies

PyTorch == 0.3.0
NumPy == 1.14.2

This blog explains NMT really well!

Dataset

The eng-hind.txt parallel corpus can be downloaded from several sources:

  1. IIT-Bombay Dataset
  2. HindEnCorp 0.5
  3. Indian parallel Corpora

The dataset file should be tab-separated, with each line holding an English sentence followed by its Hindi translation:
I am cold.                मुझे ठंड लग रही है।
My name is yash           मेरा नाम यश है
.                         .
.                         .
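
The format above can be read into sentence pairs with a few lines of Python. This is a minimal sketch, not the notebook's own loader: the function name `read_parallel_corpus` is hypothetical, and skipping blank or malformed lines is an assumption about how you would want to treat them.

```python
import io


def read_parallel_corpus(lines):
    """Split tab-separated lines into (english, hindi) pairs."""
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # assumption: silently skip malformed lines
        english, hindi = (p.strip() for p in parts)
        if english and hindi:
            pairs.append((english, hindi))
    return pairs


# In-memory sample in the same format as the dataset file.
sample = io.StringIO(
    "I am cold.\tमुझे ठंड लग रही है।\n"
    "My name is yash\tमेरा नाम यश है\n"
)
pairs = read_parallel_corpus(sample)
```

For a real corpus, pass an open file instead of the sample, for example `open("eng-hind.txt", encoding="utf-8")`. Setting the encoding explicitly matters because the Hindi side is Devanagari text.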

The Jupyter notebook given here is for educational purposes. If you want to see good translation results, I highly recommend cloning one of the following repositories:

1. Stanford NMT [Matlab]
2. tf-seq2seq [TensorFlow]
3. Nematus [Theano]
4. OpenNMT [Torch, written in Lua] (highly recommended, incorporates all the functionalities)
5. OpenNMT-py [PyTorch]

Papers

A Statistical Approach to Machine Translation, 1990.
Review Article: Example-based Machine Translation, 1999.
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, 2014.
Neural Machine Translation by Jointly Learning to Align and Translate, 2014.
Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016.
Sequence to sequence learning with neural networks, 2014.
Recurrent Continuous Translation Models, 2013.
Continuous space translation models for phrase-based statistical machine translation, 2013.

Acknowledgements

A big thank you to the whole team at Messy Fractals, especially Dhanya P and Arvind Sivdas, for letting me work under them on this project.

References

Credit for this code goes to the user spro. I have only made some changes to it to handle Hindi text.
