ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation

Implementation of our paper published at EMNLP 2022.

Brief Introduction

Figure: Framework of consistency-based transfer learning.


Transfer learning is a simple and powerful method for boosting the performance of low-resource neural machine translation (NMT). Existing transfer learning methods for NMT are static: they transfer knowledge from a parent model to a child model once and for all, via parameter initialization. In this paper, we instead propose ConsistTL, a novel transfer learning method for NMT that continuously transfers parent knowledge throughout the training of the child model. Specifically, for each training instance of the child model, ConsistTL constructs a semantically equivalent instance for the parent model and encourages prediction consistency between parent and child on this instance, so that the child model learns each instance under the guidance of the parent model.
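As a concrete illustration of this training signal, the consistency objective can be written as a cross-entropy term on the gold target plus a KL-divergence term that pulls the child's predictive distribution toward the frozen parent's on the shared English target. The PyTorch sketch below is our own minimal illustration, not the loss code shipped in this repo; the function name consistency_loss, the interpolation weight alpha, and the padding index are assumptions.

import torch
import torch.nn.functional as F

def consistency_loss(child_logits, parent_logits, target, alpha=0.9, pad_idx=1):
    """Sketch of a ConsistTL-style objective: label cross-entropy plus a
    KL term toward the (frozen) parent distribution on the shared target.
    child_logits, parent_logits: (batch, tgt_len, vocab); target: (batch, tgt_len).
    """
    # standard NMT cross-entropy on the gold target tokens
    ce = F.cross_entropy(child_logits.transpose(1, 2), target, ignore_index=pad_idx)
    # per-token KL(parent || child); the parent is detached so it only guides
    kl = F.kl_div(
        F.log_softmax(child_logits, dim=-1),
        F.softmax(parent_logits.detach(), dim=-1),
        reduction="none",
    ).sum(-1)
    mask = target.ne(pad_idx)
    kl = (kl * mask).sum() / mask.sum()
    return (1 - alpha) * ce + alpha * kl

During child training, parent_logits would come from running the frozen parent on the semantically equivalent parent-side instance built in Step 2 below.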

Preparation 1: Install fairseq

# Requires Python >= 3.7. PyTorch is installed automatically as a
# fairseq dependency, so there is no need to install it separately.
cd ConsisTL
pip install --editable .
cd ..

Preparation 2: Download and binarize data

# download and preprocess child (student) data
mkdir tr_en
cd tr_en
# download the preprocessed tr-en data from https://drive.google.com/file/d/1B23gkfQ3O430KSGVRCqTLyjPO01A5e6L/view?usp=sharing
# the raw tr-en corpus can be downloaded from https://opus.nlpl.eu/download.php?f=SETIMES/v2/moses/en-tr.txt.zip
cd ..
fairseq-preprocess -s tr -t en --trainpref tr_en/pack_clean/train --validpref tr_en/pack_clean/valid --testpref tr_en/pack_clean/test --srcdict tr_en/dict.tr.txt --tgtdict tr_en/dict.en.txt --workers 10 --destdir ${BIN_STUDENT_DATA}

# download and preprocess parent (teacher) data
mkdir de_en
cd de_en
# download the preprocessed de-en data from https://drive.google.com/file/d/15CXWVj0NIMjDjxEfPCw2WktoYADUuX8O/view?usp=sharing
cd ..
fairseq-preprocess -s de -t en --trainpref de_en/pack_clean/train --validpref de_en/pack_clean/valid --testpref de_en/pack_clean/test --joined-dictionary --destdir ${BIN_TEACHER_DATA} --workers 10

Step 1: Train two parent models

cd full_process_scripts

# train the two parent models
## train the en-de (reversed) parent
### path of binarized parent training data (from Preparation 2)
BIN_TEACHER_DATA=${BIN_TEACHER_DATA}
bash train_parent.sh en de $BIN_TEACHER_DATA
## train the de-en parent
bash train_parent.sh de en $BIN_TEACHER_DATA

Step 2: Generate semantically-equivalent sentences

# generate synthetic de-en data for tr-en
## English side of the child data
CHILD_EN=${CHILD_EN}
## path of the trained reversed (en-de) parent checkpoint
REVERSED_TEACHER_CHECKPOINT=${REVERSED_TEACHER_CHECKPOINT}
## path of the auxiliary source binaries
AUX_SRC_BIN=${AUX_SRC_BIN}
bash gen.sh $CHILD_EN $BIN_TEACHER_DATA $REVERSED_TEACHER_CHECKPOINT $AUX_SRC_BIN
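Conceptually, this step back-translates the English side of the child data with the reversed en-de parent, so that every child pair (tr, en) gains a semantically equivalent parent pair (de', en). The Python sketch below illustrates that idea only; it is not the repo's gen.sh, and translate_en_to_de is a hypothetical stand-in for beam decoding with $REVERSED_TEACHER_CHECKPOINT.

def build_parent_instances(child_pairs, translate_en_to_de):
    """For each child pair (tr, en), back-translate the shared English
    target into German to obtain a semantically equivalent parent pair."""
    parent_pairs = []
    for tr_src, en_tgt in child_pairs:
        de_synthetic = translate_en_to_de(en_tgt)  # reversed-parent decoding
        parent_pairs.append((de_synthetic, en_tgt))
    return parent_pairs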

Step 3: Exploit Token Matching (TM) for initialization

# build the initialization checkpoint by switching the source-side vocabulary
## path of the initialized checkpoint to be written
INIT_CHECKPOINT=${INIT_CHECKPOINT}
## path of binarized child (student) data
BIN_STUDENT_DATA=${BIN_STUDENT_DATA}
## path of the trained de-en parent checkpoint
TEACHER_CHECKPOINT=${TEACHER_CHECKPOINT}
python ../ConsisTL/preprocessing_scripts/TM.py --checkpoint $TEACHER_CHECKPOINT --output $INIT_CHECKPOINT --parent-dict $BIN_TEACHER_DATA/dict.de.txt --child-dict $BIN_STUDENT_DATA/dict.tr.txt --switch-dict src
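For reference, token matching rebuilds the parent checkpoint's source embedding table against the child dictionary: tokens shared by both vocabularies keep their parent embeddings, and child-only tokens are freshly initialized. The sketch below is a minimal illustration assuming plain fairseq dictionary files (one "token count" pair per line, ignoring fairseq's prepended special symbols for brevity); the function names are ours, not TM.py's interface.

import torch

def load_fairseq_dict(path):
    # fairseq dictionary files hold one "<token> <count>" pair per line
    with open(path, encoding="utf-8") as f:
        return [line.split()[0] for line in f]

def token_match_embeddings(parent_emb, parent_dict_path, child_dict_path):
    """Build a child-vocabulary embedding table: copy rows for tokens that
    also exist in the parent vocabulary, randomly initialize the rest.
    parent_emb: tensor of shape (parent_vocab_size, emb_dim)."""
    parent_index = {tok: i for i, tok in enumerate(load_fairseq_dict(parent_dict_path))}
    child_tokens = load_fairseq_dict(child_dict_path)
    dim = parent_emb.size(1)
    child_emb = torch.normal(0.0, dim ** -0.5, (len(child_tokens), dim))
    for j, tok in enumerate(child_tokens):
        if tok in parent_index:  # shared token: reuse the parent embedding
            child_emb[j] = parent_emb[parent_index[tok]]
    return child_emb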

Step 4: Train the child models

# train the TM-TL baseline (token matching transfer learning)
bash train.sh $BIN_STUDENT_DATA $INIT_CHECKPOINT

# train with ConsistTL
## $PREFIX-bin: binarized synthetic de-en data produced in Step 2
bash ConsisTL.sh $PREFIX-bin $TEACHER_CHECKPOINT $BIN_TEACHER_DATA $BIN_STUDENT_DATA $INIT_CHECKPOINT
