Semi-Supervised Bilingual Lexicon Induction with Two-Way Message Passing Mechanisms

In this repository, we present the implementation of our two proposed semi-supervised approaches, CSS and PSS, for bilingual lexicon induction (BLI).

Dependencies

Python 3.7
PyTorch
NumPy
Faiss
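
Before running anything, it can help to confirm that the three packages are importable. The snippet below is a minimal sketch, not part of the repository; it assumes the usual import names (torch, numpy, faiss) and only reports whether each one can be found in the current environment.

```python
import importlib.util

# Import names assumed for the listed dependencies (PyTorch, NumPy, Faiss).
REQUIRED_MODULES = ("torch", "numpy", "faiss")


def check_dependencies(modules=REQUIRED_MODULES):
    """Map each module name to True if it can be located, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


status = check_dependencies()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```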

How to get the datasets

You need to download the MUSE dataset from here to the ./muse_data directory.

You need to download the VecMap dataset from here to the ./vecmap_data directory.

How to run

You can run the following command to evaluate CSS on the MUSE dataset with the "5k all" annotated lexicon:

python main.py --config_file ./configs/config-CSS-muse-en-es-5kall.yaml

You can run the following command to evaluate PSS on the VecMap dataset with the "5k all" annotated lexicon:

python main.py --config_file ./configs/config-PSS-vecmap-en-es-5kall.yaml
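
To run both evaluations back to back, the two commands above can be wrapped in a small script. This is a sketch under stated assumptions: the helper names build_command and run_all are hypothetical and not part of the repository, and the script is assumed to be launched from the repository root so that main.py and ./configs resolve.

```python
import subprocess
import sys

# The two configuration files named in the commands above.
CONFIGS = [
    "./configs/config-CSS-muse-en-es-5kall.yaml",
    "./configs/config-PSS-vecmap-en-es-5kall.yaml",
]


def build_command(config_file):
    """Return the argument list equivalent to the documented command line."""
    return [sys.executable, "main.py", "--config_file", config_file]


def run_all(configs=CONFIGS):
    """Run each evaluation in turn; check=True stops at the first failure."""
    for config_file in configs:
        subprocess.run(build_command(config_file), check=True)
```

Calling run_all() starts the evaluations one after the other. Using sys.executable instead of a bare "python" keeps the child processes in the same interpreter and environment as the calling script.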

Configuration

We briefly describe some important fields in the configuration file:

"method" specifies the model to evaluate: "CSSBli" for CSS or "PSSBli" for PSS.
"src" and "tgt" indicate the source and target languages of the BLI task.
"data_params/data_dir" specifies which dataset to use: "./muse_data/" for MUSE or "./vecmap_data/" for VecMap.
"supervised/max_count" indicates the size of the annotated lexicon: "-1" for "5k all", "100" for "100 unique", and "5000" for "5000 unique".

Other fields specify the hyperparameters for CSS and PSS.
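
The fields described above can be pictured as a nested mapping. The sketch below is illustrative only. It assumes that the "a/b" notation denotes a key "b" nested under "a", and that languages are written as two-letter codes such as "en" and "es" (as the configuration file names suggest). The helpers make_config and get_field are hypothetical, and the real YAML files contain further hyperparameter fields not shown here.

```python
# Lexicon label -> "supervised/max_count" value, as listed in the field descriptions.
LEXICON_MAX_COUNT = {"5k all": -1, "100 unique": 100, "5000 unique": 5000}

# Dataset name -> "data_params/data_dir" value.
DATA_DIRS = {"muse": "./muse_data/", "vecmap": "./vecmap_data/"}


def make_config(method, src, tgt, dataset, lexicon):
    """Build a dict holding only the four documented fields (nesting is assumed)."""
    if method not in ("CSSBli", "PSSBli"):
        raise ValueError(f"unknown method: {method!r}")
    return {
        "method": method,
        "src": src,
        "tgt": tgt,
        "data_params": {"data_dir": DATA_DIRS[dataset]},
        "supervised": {"max_count": LEXICON_MAX_COUNT[lexicon]},
    }


def get_field(config, path):
    """Look up a field written in the slash notation, e.g. 'supervised/max_count'."""
    node = config
    for key in path.split("/"):
        node = node[key]
    return node


# CSS on MUSE, English to Spanish, with the "5k all" lexicon.
config = make_config("CSSBli", "en", "es", "muse", "5k all")
print(get_field(config, "supervised/max_count"))
print(get_field(config, "data_params/data_dir"))
```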
