
Cannot get the results from the paper when training the transformer from scratch. #30

Open
tomshalini opened this issue May 18, 2021 · 2 comments

Comments

@tomshalini

Hello,

I have trained the transformer from scratch on WMT en-fr, following the instructions in the guidelines. However, I cannot get good results compared to the pretrained model mentioned in the repository.

Result of model trained from scratch:
BLEU4 = 2.00, 19.9/2.8/0.8/0.3 (BP=1.000, ratio=0.965, syslen=79863, reflen=82793)

Result of Pretrained model:
BLEU4 = 35.70, 64.6/41.9/29.1/20.6 (BP=1.000, ratio=0.990, syslen=81934, reflen=82793)
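As a reference for reading these score lines: BLEU4 is the brevity penalty times the geometric mean of the four n-gram precisions. A minimal sketch of that standard combination (not code from this repository) reproduces the pretrained model's score from the printed precisions:

```python
import math

def bleu4(precisions, brevity_penalty):
    """Combine n-gram precisions (in percent) into BLEU4:
    100 * BP * exp(mean of log precisions)."""
    log_mean = sum(math.log(p / 100.0) for p in precisions) / len(precisions)
    return 100.0 * brevity_penalty * math.exp(log_mean)

# Pretrained model: 64.6/41.9/29.1/20.6 with BP=1.000
print(round(bleu4([64.6, 41.9, 29.1, 20.6], 1.0), 1))  # -> 35.7
```

The from-scratch run's 19.9/2.8/0.8/0.3 combines to roughly 1.9 by the same formula (the reported 2.00 differs only because the printed precisions are rounded); the collapse of the higher-order precisions is what drives the score down.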

Attached is the training log.
17may_train_transformers_adam_resume_epoch16.txt

Could you please have a look at the logs and help me regenerate the results as per the paper?

@tomshalini changed the title from "not getting good results when train the transformer from scratch." to "Can not get the result as the paper if train the transformer from scratch." on May 18, 2021
@Michaelvll
Collaborator

Hi,

Thank you for asking! I am not sure what causes the problem without seeing the training command you used. Maybe you could check the training and validation data to see whether anything went wrong during preprocessing. Also, please have a look at the predictions generated during validation and testing to see if there is a problem there.
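One generic, tool-agnostic sanity check along these lines (the file paths below are hypothetical; adapt them to your own preprocessing output) is to verify that the source and target sides of the raw parallel data still have matching line counts before binarization:

```python
def check_parallel(src_path, tgt_path):
    """Assert that a parallel corpus has the same number of source
    and target lines; returns the count on success."""
    with open(src_path, encoding="utf-8") as s, open(tgt_path, encoding="utf-8") as t:
        n_src = sum(1 for _ in s)
        n_tgt = sum(1 for _ in t)
    assert n_src == n_tgt, f"line-count mismatch: {n_src} src vs {n_tgt} tgt"
    return n_src

# Hypothetical paths for a WMT en-fr setup:
# check_parallel("data/train.en", "data/train.fr")
# check_parallel("data/valid.en", "data/valid.fr")
```

A mismatch here would mean the model is being trained on misaligned sentence pairs, which typically produces exactly this kind of near-zero BLEU.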

@tomshalini
Author

Hello,

I am using the command below to train the transformer.

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py data/binary/wmt14_en_fr --configs configs/wmt14.en-fr/attention/multibranch_v2/embed496.yml --update-freq 32
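One thing worth double-checking here (an assumption on my part, not something confirmed in this thread) is the effective batch size: with gradient accumulation, tokens per optimizer update scale as GPUs × max-tokens × update-freq, so if this 4-GPU, --update-freq 32 setup yields a different effective batch than the one the pretrained model was trained with, the learning-rate schedule may no longer match. A minimal sketch, using a hypothetical max-tokens of 4096:

```python
def effective_tokens_per_update(num_gpus, max_tokens, update_freq):
    # fairseq-style accumulation: each GPU processes up to `max_tokens`
    # tokens per forward pass, and gradients from `update_freq` passes
    # are accumulated before each optimizer step.
    return num_gpus * max_tokens * update_freq

# 4 GPUs, --update-freq 32, assumed max-tokens of 4096 (check your config)
print(effective_tokens_per_update(4, 4096, 32))  # -> 524288
```

If the config specifies a different max-tokens, plug that value in and compare against the setting used for the released checkpoint.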
