This is the repository for the Findings of NAACL 2022 paper "Por Qué Não Utilizar Alla Språk? Mixed Training with Gradient Optimization in Few-Shot Cross-Lingual Transfer".
Our code is modified from the XTREME benchmark.
To make the method easy to locate, we provide an example of how we implement the stochastic gradient surgery function in sgs_card.py.
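For orientation, the following is a minimal PyTorch sketch of PCGrad-style gradient surgery. It only illustrates the idea; the function name and details here are ours, not the exact code in sgs_card.py.

import random
import torch

def gradient_surgery(grads):
    # grads: list of flattened 1-D gradient tensors, one per language/task batch
    projected = []
    for i, g in enumerate(grads):
        g = g.clone()
        others = [g_j for j, g_j in enumerate(grads) if j != i]
        random.shuffle(others)  # the "stochastic" part: random projection order
        for g_j in others:
            dot = torch.dot(g, g_j)
            if dot < 0:  # conflicting gradients: drop the component of g along g_j
                g = g - dot / (g_j.norm() ** 2 + 1e-12) * g_j
        projected.append(g)
    # combine the de-conflicted gradients, e.g. by averaging
    return torch.stack(projected).mean(dim=0)

In mixed training, grads would hold, for example, the flattened gradients from an English batch and a target-language batch; the returned vector would then be written back into the model's parameter gradients before the optimizer step.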
The first step is to build the virtual environment:
conda create --name xtreme --file conda-env.txt
conda activate xtreme
bash install_tools.sh
There are two ways to download the data. The easier way is to download our zipped data directly from Google Drive:
gdown https://drive.google.com/uc?id=1uA6QHGQ9iBqhYe45A54EPBUwsFTmHxec
unzip download.zip
rm download.zip
All data will be located in the download/ folder.
Alternatively, run the following command to download the data:
bash scripts/download_data.sh
To conduct few-shot learning on the various tasks, run:
bash scripts/train.sh [MODEL] [TASK] [SEED] [FEW_SHOT] [GPU]
The model and results will be stored at outputs/seed-${SEED}/.
For example, to run 5-shot learning by fine-tuning xlm-roberta-large on the NER task with random seed 1 on GPU 0:
bash scripts/train.sh xlm-roberta-large panx 1 5 0
Note that our codebase only supports stochastic gradient surgery for the four tasks presented in the paper, i.e. panx, udpos, xnli, and tydiqa. The 5 seeds used in the paper are 1, 2, 3, 4, and 5 (see the sketch below for a driver that loops over them).
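If helpful, a simple driver along these lines (not part of the repo) can launch all five seeds sequentially on one GPU:

import subprocess

MODEL, TASK, FEW_SHOT, GPU = "xlm-roberta-large", "panx", "5", "0"
for seed in ("1", "2", "3", "4", "5"):
    # each call blocks until scripts/train.sh finishes for that seed
    subprocess.run(["bash", "scripts/train.sh", MODEL, TASK, seed, FEW_SHOT, GPU], check=True)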
Results on the test sets for all target languages are reported in the test_results.txt file under the outputs/ folder. For example, they are located at:
for panx: outputs/seed-1/panx/xlm-roberta-large-LR2e-5-epoch10-MaxLen128/test_results.txt
for udpos: outputs/seed-1/udpos/xlm-roberta-large-LR2e-5-epoch10-MaxLen128/test_results.txt
for tydiqa: outputs/seed-1/xlm-roberta-base_LR3e-5_EPOCH30_maxlen384_batchsize4_gradacc8/predictions/test_results.txt
for xnli: outputs/seed-1/xlm-roberta-base-LR2e-5-epoch5-MaxLen128/test_results.txt
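To collect these files across seeds, a small script along the following lines may help; the run-directory names simply mirror the paths above and will differ if you change the hyperparameters in the run name:

from pathlib import Path

RUN_DIRS = {
    "panx": "panx/xlm-roberta-large-LR2e-5-epoch10-MaxLen128",
    "udpos": "udpos/xlm-roberta-large-LR2e-5-epoch10-MaxLen128",
    "tydiqa": "xlm-roberta-base_LR3e-5_EPOCH30_maxlen384_batchsize4_gradacc8/predictions",
    "xnli": "xlm-roberta-base-LR2e-5-epoch5-MaxLen128",
}

for seed in range(1, 6):
    path = Path(f"outputs/seed-{seed}") / RUN_DIRS["panx"] / "test_results.txt"
    if path.exists():  # skip seeds that have not finished training yet
        print(f"--- seed {seed} ---")
        print(path.read_text())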