Neural-Machine-Translation

NLP Application Project

2.2.3 Build an NMT (Neural MT) system when training data (parallel sentences in the concerned source and target languages) is available in a domain, but only in small quantities. Machine learning is to be used in such a way that the small-sized domain data can be combined with the large amount of general data.
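
To make the combination concrete, below is a minimal sketch (not the project's actual code) of one common way to do this in PyTorch, the framework used in the main tutorial listed under References: pretrain a seq2seq model on the large general-domain corpus, then fine-tune the same weights on the small in-domain corpus with a lower learning rate. The Seq2Seq model and the toy batch generator are hypothetical placeholders standing in for the real model and data pipeline.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Tiny GRU encoder-decoder, only to illustrate the two-stage training."""
    def __init__(self, src_vocab, tgt_vocab, emb=256, hid=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, tgt_vocab)

    def forward(self, src, tgt_in):
        _, h = self.encoder(self.src_emb(src))               # encode the source sentence
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), h)   # teacher-forced decoding
        return self.out(dec_out)                             # logits over the target vocabulary

def toy_batches(n_batches, batch=32, src_len=12, tgt_len=13, vocab=8000):
    """Random stand-in batches; in practice these come from the parallel corpora."""
    data = []
    for _ in range(n_batches):
        src = torch.randint(1, vocab, (batch, src_len))
        tgt = torch.randint(1, vocab, (batch, tgt_len))
        data.append((src, tgt[:, :-1], tgt[:, 1:]))          # (source, decoder input, decoder target)
    return data

def run_stage(model, batches, lr, epochs):
    """One training stage: plain cross-entropy over the target tokens."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for src, tgt_in, tgt_out in batches:
            opt.zero_grad()
            logits = model(src, tgt_in)
            loss_fn(logits.reshape(-1, logits.size(-1)), tgt_out.reshape(-1)).backward()
            opt.step()

model = Seq2Seq(src_vocab=8000, tgt_vocab=8000)
general_batches = toy_batches(50)   # stands in for the large general-domain corpus
domain_batches = toy_batches(5)     # stands in for the small in-domain corpus

# Stage 1: learn general translation behaviour from the large out-of-domain corpus.
run_stage(model, general_batches, lr=1e-3, epochs=2)
# Stage 2: adapt to the domain; a lower learning rate (and fewer epochs) helps the
# model keep what it learned from the general data while fitting the domain data.
run_stage(model, domain_batches, lr=1e-4, epochs=2)
```

Other ways of combining the two corpora, such as simply concatenating them or oversampling the domain sentences, fit the same skeleton by changing what goes into each stage.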

Contributors:

  1. Arushi Singhal 201516178
  2. Simran Singhal 201516190

Presentation: https://docs.google.com/presentation/d/1UgQXnST6rxZpctD8Atuaus7-2tdmhHMxCvMiZXemXck/edit?usp=sharing

Interim Report: https://docs.google.com/document/d/1n1o2qPxLaCnB0E83i_ZiPZCA_8fN_uMCrQ-CQCzlql4/edit?usp=sharing

Report: https://docs.google.com/document/d/10rAypGzTKjiJOw9Xe0qi9jYFTlNq8AQitohGlploK44/edit?usp=sharing

References

  1. https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html (main)
  2. https://arxiv.org/abs/1409.3215 (Research Paper)
  3. http://www.manythings.org/anki/
  4. https://machinelearningmastery.com/encoder-decoder-recurrent-neural-network-models-neural-machine-translation/
  5. https://machinelearningmastery.com/encoder-decoder-long-short-term-memory-networks/
  6. https://machinelearningmastery.com/develop-neural-machine-translation-system-keras/
  7. http://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
  8. https://towardsdatascience.com/nlp-sequence-to-sequence-networks-part-1-processing-text-data-d141a5643b72
  9. https://towardsdatascience.com/nlp-sequence-to-sequence-networks-part-2-seq2seq-model-encoderdecoder-model-6c22e29fd7e1
  10. https://nlp.stanford.edu/~johnhew/public/14-seq2seq.pdf
  11. https://www.analyticsvidhya.com/blog/2018/03/essentials-of-deep-learning-sequence-to-sequence-modelling-with-attention-part-i/
  12. https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html
  13. https://www.coursera.org/learn/nlp-sequence-models/lecture/ftkzt/recurrent-neural-network-model
  14. https://machinelearningmastery.com/encoder-decoder-attention-sequence-to-sequence-prediction-keras/ (important)
  15. https://github.com/bentrevett/pytorch-seq2seq/blob/master/1%20-%20Sequence%20to%20Sequence%20Learning%20with%20Neural%20Networks.ipynb
  16. https://towardsdatascience.com/word-level-english-to-marathi-neural-machine-translation-using-seq2seq-encoder-decoder-lstm-model-1a913f2dc4a7
  17. https://discuss.pytorch.org/t/are-the-outputs-of-bidirectional-gru-concatenated/15103
  18. https://towardsdatascience.com/attention-seq2seq-with-pytorch-learning-to-invert-a-sequence-34faf4133e53
  19. https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation-batched.ipynb
  20. https://towardsdatascience.com/understanding-bidirectional-rnn-in-pytorch-5bd25a5dd66
  21. https://discuss.pytorch.org/t/cuda-changes-expected-lstm-hidden-dimensions/10765/6
  22. https://github.com/A-Jacobson/minimal-nmt/blob/master/nmt_tutorial.ipynb (important)
  23. https://medium.com/@martinpella/how-to-use-pre-trained-word-embeddings-in-pytorch-71ca59249f76 (GloVe in PyTorch)

Hindi Text Normalization

  1. http://talukdar.net/papers/KBCS04_HPL-1.pdf
  2. https://medium.com/lingvo-masino/do-you-know-about-text-normalization-a19fe3090694
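
The links above describe why normalization matters for Devanagari text. As a purely illustrative sketch (not the pipeline from those papers), a few standard-library steps already remove common inconsistencies in the Hindi side of the corpus:

```python
import re
import unicodedata

# Map Devanagari digits (U+0966..U+096F) to ASCII so numbers are written one way.
DEVANAGARI_DIGITS = {chr(0x0966 + i): str(i) for i in range(10)}

def normalize_hindi(text: str) -> str:
    # Canonical Unicode composition, so visually identical strings
    # (e.g. base character + nukta vs. precomposed form) share one representation.
    text = unicodedata.normalize("NFC", text)
    # Unify the Devanagari danda sentence terminator with an ASCII full stop,
    # which simplifies sentence splitting of the parallel corpus (a design choice,
    # not a requirement).
    text = text.replace("\u0964", ".")
    # Normalize digits.
    text = "".join(DEVANAGARI_DIGITS.get(ch, ch) for ch in text)
    # Collapse runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()

print(normalize_hindi("यह  वाक्य है।  इसमें १२३ अंक हैं।"))
```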

The IIT Bombay English-Hindi Parallel Corpus

https://www.cse.iitb.ac.in/~pb/papers/lrec18-iitbparallel.pdf

Document Link to the Errors found in the Dataset

https://docs.google.com/document/d/1zz67TTlVi0YuH7zUjD3up4O_7qKd8lCtElhxcH1bMWk/edit

Data Generator

https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
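
The post above explains how to build batches on the fly with keras.utils.Sequence instead of holding the whole dataset in memory. A minimal sketch of that idea for a parallel corpus is below; the class name and the assumption that sentences are already integer-encoded are illustrative, not taken from the post.

```python
import numpy as np
from tensorflow.keras.utils import Sequence

class ParallelCorpusGenerator(Sequence):
    """Yields one padded (source, target) batch at a time, built lazily in __getitem__."""
    def __init__(self, src_lines, tgt_lines, batch_size=64):
        self.src_lines = src_lines      # tokenized and integer-encoded source sentences
        self.tgt_lines = tgt_lines      # tokenized and integer-encoded target sentences
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch.
        return int(np.ceil(len(self.src_lines) / self.batch_size))

    def __getitem__(self, idx):
        # Build one padded batch on demand instead of precomputing everything.
        lo, hi = idx * self.batch_size, (idx + 1) * self.batch_size
        return self._pad(self.src_lines[lo:hi]), self._pad(self.tgt_lines[lo:hi])

    @staticmethod
    def _pad(seqs):
        maxlen = max(len(s) for s in seqs)
        out = np.zeros((len(seqs), maxlen), dtype="int64")
        for i, s in enumerate(seqs):
            out[i, :len(s)] = s
        return out
```

Such a generator can then be passed to model.fit (or fit_generator in older Keras versions) so only one batch is in memory at a time.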

PyTorch Neural Network Colab Notebook (shared via the flow group)

https://colab.research.google.com/drive/1DgkVmi6GksWOByhYVQpyUB4Rk3PUq0Cp?fbclid=IwAR076PTAKeD99mN-htpMxCY4FaJNadF_OfCNry02rBwwixadJ-n1rygnW7I#scrollTo=6Q1AhoIB-pkp

Anaconda installation

https://www.digitalocean.com/community/tutorials/how-to-install-anaconda-on-ubuntu-18-04-quickstart

Multiple GPUs

https://www.pyimagesearch.com/2017/10/30/how-to-multi-gpu-training-with-keras-python-and-deep-learning/
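
The post above covers multi-GPU training on the Keras side. For the PyTorch parts of the project, a minimal illustrative equivalent is to wrap the model in torch.nn.DataParallel so each batch is split across the visible GPUs (the Linear layer below is just a stand-in for the real NMT model):

```python
import torch
import torch.nn as nn

# Stand-in for the real NMT model (e.g. the Seq2Seq sketch earlier in this README).
model = nn.Linear(512, 512)

if torch.cuda.device_count() > 1:
    # Replicates the module on every visible GPU and splits each batch between them.
    model = nn.DataParallel(model)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
```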

  1. https://github.com/ZhenYangIACAS/NMT
  2. https://github.com/tuzhaopeng/nmt
  3. https://paperswithcode.com/paper/modeling-coverage-for-neural-machine#code

Thesis on translation between Hindi and English using a dataset of almost the same size

https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=4&cad=rja&uact=8&ved=2ahUKEwj72dn33fXhAhWNbn0KHYNnDNUQFjADegQIBBAC&url=http%3A%2F%2Fweb2py.iiit.ac.in%2Fresearch_centres%2Fpublications%2Fdownload%2Fmastersthesis.pdf.af2224b7bc18088c.4b756e616c2d5468657369732d46696e616c2e706466.pdf&usg=AOvVaw2PZO-pochZDvz7x-4t49pa

ResearchGate publication on Hindi-to-English machine translation

https://www.researchgate.net/publication/228783817_Machine_translation_of_bi-lingual_hindi-english_hinglish_text
