
[won't merge - v1 codebase] Bert #1543

Open · wants to merge 32 commits into base: master

Conversation

Zenglinxiao (Contributor) commented:

Add BERT into OpenNMT-py.

@francoishernandez (Member) left a comment:

Quick review of FAQ.md

Review comments (outdated, resolved) on:

- docs/source/FAQ.md (10 comments)
- bert_ckp_convert.py (1 comment)
- onmt/train_single.py (3 comments)
- onmt/trainer.py (1 comment)
- onmt/translate/predictor.py (3 comments)
@vince62s (Member) commented:

@Zenglinxiao don't you think it could be possible to embed "bert_build_model" within "build_model"?

@rajarsheem commented:

Do you have any comparison of NMT tasks with and without BERT?

@francoishernandez (Member) commented:

@rajarsheem this is not the aim of BERT or this PR. There are some examples of what it's for in FAQ.md.


```python
class BertLayerNorm(nn.Module):
    def __init__(self, hidden_size, eps=1e-12):
        """Layernorm module in the TF style (epsilon inside the square root)."""
```
(Contributor) commented:

Does PyTorch implement it differently?

Zenglinxiao (Contributor, Author) commented:

I adapted BertLayerNorm from huggingface/pytorch-transformers since it indicates "epsilon inside the square root", and the PyTorch doc used eps=1e-5, so I kept the custom version. Thanks to your reminder, I dug into the PyTorch C++ code and found it is actually the same. The original reimplementation was due to a typo in the PyTorch doc, which has already been fixed (huggingface/transformers#1089).
I'll switch that!
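
For reference, a minimal sketch (not code from this PR) showing that the TF-style layer norm and torch.nn.LayerNorm compute the same normalization when given the same eps:

```python
import torch
import torch.nn as nn

class BertLayerNorm(nn.Module):
    """TF-style layer norm: epsilon is added inside the square root."""
    def __init__(self, hidden_size, eps=1e-12):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.bias = nn.Parameter(torch.zeros(hidden_size))
        self.eps = eps

    def forward(self, x):
        mean = x.mean(-1, keepdim=True)
        var = (x - mean).pow(2).mean(-1, keepdim=True)  # biased variance
        return self.weight * (x - mean) / torch.sqrt(var + self.eps) + self.bias

# nn.LayerNorm does the same; the apparent difference came from the old
# PyTorch doc typo, which showed (x - mean) / (sqrt(var) + eps) instead of
# (x - mean) / sqrt(var + eps).
x = torch.randn(2, 8, 16)
assert torch.allclose(BertLayerNorm(16)(x), nn.LayerNorm(16, eps=1e-12)(x), atol=1e-6)
```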

Zenglinxiao (Contributor, Author) commented:

Please check:
- BertLayerNorm switched to PyTorch's original LayerNorm.
- BertAdam switched to AdamW as well, but with an added correct_bias option to not compensate for bias, as in the original BERT.
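
As a rough illustration (a sketch, not the PR's actual optimizer code), the correct_bias flag controls whether Adam's bias-correction terms are applied; the original BERT/TF implementation skips them:

```python
import math
import torch

def adam_update(p, grad, m, v, step, lr=1e-3, betas=(0.9, 0.999),
                eps=1e-6, correct_bias=True):
    """One Adam step on parameter p, in place (decoupled weight decay omitted)."""
    m.mul_(betas[0]).add_(grad, alpha=1 - betas[0])            # first moment
    v.mul_(betas[1]).addcmul_(grad, grad, value=1 - betas[1])  # second moment
    step_size = lr
    if correct_bias:  # standard Adam; original BERT uses correct_bias=False
        step_size *= math.sqrt(1 - betas[1] ** step) / (1 - betas[0] ** step)
    p.sub_(step_size * m / (v.sqrt() + eps))
```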

```python
@@ -191,40 +213,62 @@ def build_base_model(model_opt, fields, gpu, checkpoint=None, gpu_id=None):
        pad_idx = tgt_base_field.vocab.stoi[tgt_base_field.pad_token]
        generator = CopyGenerator(model_opt.dec_rnn_size, vocab_size, pad_idx)

    if model_opt.is_bert:
        model = encoder
```
@kagrze commented:

Why does the encoder become the whole model? If there is no decoder, then how can I use the model for machine translation?

(Member) commented:

BERT stands for Bidirectional Encoder Representations from Transformers. It is not a machine translation model.
https://arxiv.org/abs/1810.04805

@kagrze commented (Oct 21, 2019):

I know this :) But this is a PR to a machine translation system. If machine translation cannot benefit from this PR, then why bother? If you want to use BERT for basic downstream tasks (e.g. classification), you can use Hugging Face's implementation or the reference implementation.

(Member) commented:

OpenNMT-py is not only machine translation; there are other tasks such as Speech2Text, Image2Text, Vid2Text, etc.
This PR is meant to add BERT task(s) and components to the framework to allow users to experiment. Nothing stops you from trying to build a custom model using BERT pre-trained encoder layers and standard Transformer decoder layers, for instance, but I think it has been tried without great benefits.
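
To make that idea concrete, here is a hypothetical sketch (not code from this PR or OpenNMT-py; Bert2Seq and all names below are illustrative) of feeding pre-trained BERT encoder states into a standard Transformer decoder:

```python
import torch
import torch.nn as nn

class Bert2Seq(nn.Module):
    """Illustrative only: BERT encoder states as memory for a Transformer decoder."""
    def __init__(self, bert_encoder, d_model=768, nhead=8,
                 num_layers=6, vocab_size=32000):
        super().__init__()
        self.encoder = bert_encoder  # assumed to return (src_len, batch, d_model)
        layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead)
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)
        self.tgt_embed = nn.Embedding(vocab_size, d_model)
        self.generator = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        memory = self.encoder(src_ids)   # encoder "memory" for cross-attention
        tgt = self.tgt_embed(tgt_ids)    # (tgt_len, batch, d_model)
        # causal mask so each target position only attends to earlier positions
        sz = tgt_ids.size(0)
        mask = torch.triu(torch.full((sz, sz), float("-inf")), diagonal=1)
        out = self.decoder(tgt, memory, tgt_mask=mask)
        return self.generator(out)       # logits over the target vocabulary
```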

@kagrze commented:

Hmm, but AFAIK all the tasks you mentioned (i.e. Speech2Text, Image2Text, Video2Text) are formalised as transduction problems and therefore require some sort of a decoder.

(Member) commented:

Sure. All those things are not mutually exclusive.
Feel free to share any research on the forum, and possibly open a PR if you have anything interesting regarding using BERT in a seq2seq task.

@Simons2017 commented:

I would like to use BERT in a seq2seq task, but I'm confused about how to preprocess the data. Could you give me some advice?

@vince62s changed the title from Bert to [won't merge - v1 codebase] Bert on Jan 19, 2023
8 participants