Machine translation

Machine translation is the task of translating a sentence in a source language to a different target language.

Results marked with a * report the mean test score over the best window, where the window is chosen by average dev-set BLEU score over 21 consecutive evaluations, as in Chen et al. (2018).
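
The windowed reporting scheme above can be sketched as follows. This is a minimal illustration, not the code used by Chen et al. (2018): the function name `best_window_test_score`, its parameters, and the example scores are all hypothetical, and it assumes dev and test BLEU were recorded at the same sequence of evaluation points.

```python
from typing import Sequence


def best_window_test_score(
    dev_bleu: Sequence[float],
    test_bleu: Sequence[float],
    window: int = 21,
) -> float:
    """Mean test BLEU over the window of consecutive evaluations
    whose mean dev BLEU is highest (hypothetical helper)."""
    if len(dev_bleu) != len(test_bleu):
        raise ValueError("dev and test scores must be aligned one-to-one")
    if window <= 0 or len(dev_bleu) < window:
        raise ValueError("need at least `window` evaluations")

    best_start = 0
    best_dev_mean = float("-inf")
    # Slide a fixed-size window over the evaluations and keep the
    # earliest window with the highest average dev-set score.
    for start in range(len(dev_bleu) - window + 1):
        dev_mean = sum(dev_bleu[start:start + window]) / window
        if dev_mean > best_dev_mean:
            best_dev_mean = dev_mean
            best_start = start

    # Report the test-set mean over that same window.
    return sum(test_bleu[best_start:best_start + window]) / window


# Invented scores, with a window of 2 for brevity (the text uses 21).
dev_scores = [24.0, 25.0, 26.0, 25.5]
test_scores = [23.0, 24.0, 25.0, 26.0]
print(best_window_test_score(dev_scores, test_scores, window=2))  # prints 25.5
```

Selecting the window on dev-set scores and only then reading off the test-set mean keeps the test set out of the model-selection step.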

WMT 2014 EN-DE

Models are evaluated on the English-German dataset of the Ninth Workshop on Statistical Machine Translation (WMT 2014) based on BLEU.

| Model | BLEU | Paper / Source |
| --- | :---: | --- |
| Transformer Big + BT (Edunov et al., 2018) | 35.0 | Understanding Back-Translation at Scale |
| DeepL | 33.3 | DeepL Press release |
| MUSE (Zhao et al., 2019) | 29.9 | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning |
| DynamicConv (Wu et al., 2019) | 29.7 | Pay Less Attention With Lightweight and Dynamic Convolutions |
| AdvSoft + Transformer Big (Wang et al., 2019) | 29.52 | Improving Neural Language Modeling via Adversarial Training |
| Transformer Big (Ott et al., 2018) | 29.3 | Scaling Neural Machine Translation |
| RNMT+ (Chen et al., 2018) | 28.5* | The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation |
| Transformer Big (Vaswani et al., 2017) | 28.4 | Attention Is All You Need |
| Transformer Base (Vaswani et al., 2017) | 27.3 | Attention Is All You Need |
| MoE (Shazeer et al., 2017) | 26.03 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer |
| ConvS2S (Gehring et al., 2017) | 25.16 | Convolutional Sequence to Sequence Learning |
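
For readers unfamiliar with the metric, the sketch below shows roughly how a corpus-level BLEU score is computed: clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and scaled by a brevity penalty. It is an illustration only and not the scorer used by any paper listed here. It assumes pre-tokenized input, a single reference per hypothesis, and no smoothing; the names `ngram_counts` and `corpus_bleu` are hypothetical. Published scores depend on tokenization and tooling, so numbers from this sketch are not comparable to the table.

```python
import math
from collections import Counter
from typing import Sequence


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count the n-grams of order n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(
    hypotheses: Sequence[Sequence[str]],
    references: Sequence[Sequence[str]],
    max_n: int = 4,
) -> float:
    """Unsmoothed corpus-level BLEU on a 0-100 scale (single reference)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = 0
    ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_ngrams = ngram_counts(hyp, n)
            ref_ngrams = ngram_counts(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
            totals[n - 1] += sum(hyp_ngrams.values())

    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions, with uniform weights.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output shorter than the reference.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * brevity * math.exp(log_precision)


hyp = [["the", "cat", "sat", "on", "the", "mat"]]
ref = [["the", "cat", "sat", "on", "the", "mat"]]
print(corpus_bleu(hyp, ref))  # identical output and reference
```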

WMT 2014 EN-FR

Similarly, models are evaluated on the English-French dataset of the Ninth Workshop on Statistical Machine Translation (WMT 2014) based on BLEU.

| Model | BLEU | Paper / Source |
| --- | :---: | --- |
| DeepL | 45.9 | DeepL Press release |
| Transformer Big + BT (Edunov et al., 2018) | 45.6 | Understanding Back-Translation at Scale |
| MUSE (Zhao et al., 2019) | 43.5 | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning |
| DynamicConv (Wu et al., 2019) | 43.2 | Pay Less Attention With Lightweight and Dynamic Convolutions |
| Transformer Big (Ott et al., 2018) | 43.2 | Scaling Neural Machine Translation |
| RNMT+ (Chen et al., 2018) | 41.0* | The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation |
| Transformer Big (Vaswani et al., 2017) | 41.0 | Attention Is All You Need |
| MoE (Shazeer et al., 2017) | 40.56 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer |
| ConvS2S (Gehring et al., 2017) | 40.46 | Convolutional Sequence to Sequence Learning |
| Transformer Base (Vaswani et al., 2017) | 38.1 | Attention Is All You Need |
Transformer Base (Vaswani et al., 2017) 38.1 Attention Is All You Need
