Neural Machine Translation in PyTorch

Ken Sible | NLP Group | University of Notre Dame

Note that any option in config.toml can also be passed as a command-line argument,

$ python translate.py --model model.deen.pt --beam-size 10 "Guten Tag!"

and any output written to stdout can be redirected to a file using the shell's output redirection operator.

$ python translate.py --model model.deen.pt --file input.de > output.en

Train Model

usage: main.py [-h] --lang LANG LANG --data FILE --test FILE --vocab FILE --codes FILE --model FILE --config FILE --log FILE [--seed SEED] [--tqdm]

options:
  -h, --help        show this help message and exit
  --lang LANG LANG  source/target language
  --data FILE       training data
  --test FILE       validation data
  --vocab FILE      vocab file (shared)
  --codes FILE      codes file (shared)
  --model FILE      model file (.pt)
  --config FILE     config file (.toml)
  --log FILE        log file (.log)
  --seed SEED       random seed
  --tqdm            import tqdm
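
For example, a training run might look like the following, a minimal sketch in which the data, vocab, codes, and log file names are placeholders rather than files shipped with the repository:

$ python main.py --lang de en --data train.tok.bpe --test val.tok.bpe \
    --vocab vocab.deen --codes codes.deen --model model.deen.pt \
    --config config.toml --log train.log --seed 42 --tqdm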

Score Model

usage: score.py [-h] --data FILE --model FILE [--tqdm]

options:
  -h, --help    show this help message and exit
  --data FILE   testing data
  --model FILE  model file (.pt)
  --tqdm        import tqdm
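
For example, assuming a preprocessed test set and a trained model (file names below are placeholders):

$ python score.py --data test.tok.bpe --model model.deen.pt --tqdm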

Translate Input

usage: translate.py [-h] --model FILE (--string STRING | --file FILE)

options:
  -h, --help       show this help message and exit
  --model FILE     model file (.pt)
  --string STRING  input string
  --file FILE      input file

Model Configuration (Default)

embed_dim           = 512   # dimensions of embedding sublayers
ff_dim              = 2048  # dimensions of feed-forward sublayers
num_heads           = 8     # number of parallel attention heads
dropout             = 0.1   # dropout for emb/ff/attn sublayers
num_layers          = 6     # number of encoder/decoder layers
max_epochs          = 250   # maximum number of epochs before halting training
lr                  = 3e-4  # learning rate (step size of the optimizer)
patience            = 3     # number of epochs tolerated w/o improvement
decay_factor        = 0.8   # if patience reached, lr *= decay_factor
min_lr              = 5e-5  # minimum learning rate before halting training
label_smoothing     = 0.1   # label smoothing (regularization technique)
clip_grad           = 1.0   # maximum allowed value of gradients
batch_size          = 4096  # number of tokens per batch (source/target)
max_length          = 512   # maximum sentence length (during training)
beam_size           = 4     # beam width for beam search (with length normalization)
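
As noted above, any of these options can also be overridden on the command line. The sketch below assumes that multi-word options are spelled with hyphens (e.g. --max-epochs), following the --beam-size example above, and that the file names are placeholders:

$ python main.py --lang de en --data train.tok.bpe --test val.tok.bpe \
    --vocab vocab.deen --codes codes.deen --model model.deen.pt \
    --config config.toml --log train.log --num-layers 4 --max-epochs 100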