NMT-PSEUDO-SOURCE-DISCRIMINATOR

Neural Machine Translation with Pseudo-Source Discriminator.

This implementation is based on the Theano version of Nematus.

The machine translation model is trained on natural parallel data (the source is a human translation of the target) as well as pseudo-parallel data (the source is an automatic translation or a copy of the target). Each data type is processed by its own encoder, and the two encoders are used in a Generative Adversarial Network setting: their outputs are fed to a discriminator that is optimized to distinguish the encodings of natural and pseudo sources, and the discriminator's feedback is used to optimize the pseudo-source encoder.
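
The sketch below illustrates the adversarial objective described above. It is only an illustration: the real model uses the Nematus (Theano) RNN encoders, the toy `encode_natural`, `encode_pseudo` and `discriminator` functions here are hypothetical stand-ins, and the standard GAN-style binary cross-entropy losses are an assumption about the exact objective.

```python
# Minimal NumPy sketch of the adversarial objective (illustration only).
import numpy as np

rng = np.random.default_rng(0)
dim = 8

W_nat = rng.normal(size=(dim, dim))  # natural-source encoder parameters (fixed here)
W_pse = rng.normal(size=(dim, dim))  # pseudo-source encoder parameters (the "generator")
V = rng.normal(size=(dim,))          # discriminator parameters


def encode_natural(x):
    # Stand-in for the natural-source encoder.
    return np.tanh(x @ W_nat)


def encode_pseudo(x):
    # Stand-in for the pseudo-source encoder.
    return np.tanh(x @ W_pse)


def discriminator(h):
    # Logistic discriminator: probability that h encodes a natural source.
    return 1.0 / (1.0 + np.exp(-(h @ V)))


x_nat = rng.normal(size=(4, dim))  # batch of natural-source inputs
x_pse = rng.normal(size=(4, dim))  # batch of pseudo-source inputs

p_nat = discriminator(encode_natural(x_nat))
p_pse = discriminator(encode_pseudo(x_pse))

# Discriminator loss: tell natural (label 1) from pseudo (label 0) encodings.
d_loss = -np.mean(np.log(p_nat + 1e-9)) - np.mean(np.log(1.0 - p_pse + 1e-9))

# Generator loss (pseudo-source encoder): fool the discriminator into
# labelling pseudo encodings as natural.
g_loss = -np.mean(np.log(p_pse + 1e-9))

print(f"d_loss={d_loss:.3f}  g_loss={g_loss:.3f}")
```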

INSTALLATION

See the Nematus repository.

USAGE INSTRUCTIONS

Execute nmt.py to train a model.

Additional arguments

The arguments are the same as in Nematus, augmented with the following:

| parameter | description |
| --- | --- |
| `--pseudo_data` | parallel training corpus with pseudo-source |
| `--pretrain_pseudo_src` | pretrain the pseudo-source encoder before NMT training starts |
| `--generator_start_uidx` | update number at which generator training starts (default: 10000) |
| `--nmt_start_uidx` | update number at which NMT training starts (default: 10000) |
| `--pseudo_src_noise` | introduce noise into the pseudo source (drop words and make permutations) |
| `--d_lrate` | learning rate for the pseudo-source discriminator (default: 1e-05) |
| `--g_lrate` | learning rate for the pseudo-source generator (default: 1e-05) |
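
A hypothetical invocation is sketched below. The corpus file names are placeholders, the standard Nematus options are elided, and whether `--pseudo_data` takes one or two file arguments is an assumption; only the flags listed in the table above are specific to this fork.

```bash
# Hypothetical training call; only the flags documented above are new,
# everything else follows the upstream Nematus command line.
python nmt.py \
    [standard Nematus training options] \
    --pseudo_data corpus.pseudo-src corpus.trg \
    --pretrain_pseudo_src \
    --pseudo_src_noise \
    --generator_start_uidx 10000 \
    --nmt_start_uidx 10000 \
    --d_lrate 1e-05 \
    --g_lrate 1e-05
```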

Inference

Inference is run just like in Nematus.
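
For example, a decoding call might look like the sketch below. It assumes the upstream Nematus (Theano) `translate.py` script and its flag names, which may differ between Nematus versions; model and file names are placeholders.

```bash
# Hypothetical decoding call via the upstream Nematus translate script.
python translate.py -m model.npz -i test.src -o test.out -k 12 -n
```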

PUBLICATIONS

Franck Burlot and François Yvon, Using Monolingual Data in Neural Machine Translation: a Systematic Study. In Proceedings of the Third Conference on Machine Translation (WMT’18). Association for Computational Linguistics, Brussels, Belgium, 2018.
