GitHub - dair-iitd/DSRE: Resources for the paper "PARE: A Simple and Strong Baseline for Monolingual and Multilingual Distantly Supervised Relation Extraction"

Official Code for "PARE: A Simple and Strong Baseline for Monolingual and Multilingual Distantly Supervised Relation Extraction"

Follow the steps below, in order, to reproduce the results presented in our pre-print:

1. Environment Setup

Our codebase is tested on Python 3.6.13. We recommend creating a conda environment using the command given below.

conda create --name your_env_name python=3.6.13

Our codebase is tested on GPUs with CUDA version >= 10.2. Install all of the dependencies by running the command below in the topmost directory (which contains the requirements.txt file).

pip install -r requirements.txt
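Because the codebase is tested on Python 3.6.13, it can help to confirm which interpreter is active before installing the dependencies. The following is a minimal sketch, assuming a POSIX shell; `version_has_prefix` is a hypothetical helper and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): succeeds when a full
# version string such as "3.6.13" belongs to the given major.minor series.
version_has_prefix() {
    # $1: full version string, $2: expected major.minor (e.g. 3.6)
    case "$1" in
        "$2"|"$2".*) return 0 ;;   # exact match, or match followed by ".patch"
        *) return 1 ;;
    esac
}
```

For example, after `conda activate your_env_name`, running `version_has_prefix "$(python -c 'import platform; print(platform.python_version())')" 3.6 || echo "unexpected Python version"` prints a warning when the active interpreter is not a 3.6 release.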

2. Downloading Datasets

We report results on four open-source datasets: NYT-10d, NYT-10m, Wiki-20m and DiS-ReX. Scripts for downloading each of them are provided in the "benchmark" folder.
For downloading NYT-10d, use the following command inside the benchmark folder

sh download_nyt10.sh

For downloading NYT-10m, use the following command inside the benchmark folder

sh download_nyt10m.sh

For downloading Wiki-20m, use the following command inside the benchmark folder

sh download_wiki20m.sh

For downloading DiS-ReX, use the following command inside the benchmark folder

sh download_disrex.sh
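The four download commands above can also be run in one pass. The following is a sketch under stated assumptions: `download_all` is a hypothetical function, and skipping a script that is not found is a choice made here rather than repository behaviour. Only the script names listed above are used.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): run every dataset
# download script found in the given directory (default: benchmark).
download_all() {
    dir=${1:-benchmark}
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$dir" || exit 1
        for name in nyt10 nyt10m wiki20m disrex; do
            script="download_${name}.sh"
            if [ -f "$script" ]; then
                echo "running $script"
                sh "$script" || exit 1      # stop at the first failed download
            else
                echo "skipping $script (not found)" >&2
            fi
        done
    )
}
```

From the topmost directory, `download_all benchmark` then runs each script inside the benchmark folder, as the individual commands require.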

3. Training and testing models

Training scripts are provided in the topmost directory for each of the four datasets. Once training finishes, the best saved model is automatically evaluated on the test set, reporting AUC, Macro F1, Micro F1, and P@M.
To reproduce results on NYT-10d, run

sh train_nyt10d.sh

To reproduce results on NYT-10m, run

sh train_nyt10m.sh

To reproduce results on Wiki-20m, run

sh train_wiki20m.sh

To reproduce results on DiS-ReX, run

sh train_disrex.sh
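Since the test metrics are reported at the end of each training run, it may be convenient to keep the full output in a log file. The following is a minimal sketch: `run_and_log` and the `logs` directory are hypothetical names introduced for illustration, and the training scripts are invoked exactly as above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): run train_<dataset>.sh
# from the current directory and save its output to <log_dir>/<dataset>.log.
run_and_log() {
    dataset=$1
    log_dir=${2:-logs}
    mkdir -p "$log_dir" || return 1
    # Capture stdout and stderr so the reported metrics outlive the session.
    # The function's exit status is that of the training script.
    sh "train_${dataset}.sh" > "$log_dir/${dataset}.log" 2>&1
}
```

For example, `run_and_log nyt10d && tail logs/nyt10d.log` runs the NYT-10d script from the topmost directory and then shows the end of its log. Progress can be followed from another terminal with `tail -f logs/nyt10d.log`.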

Cite

@inproceedings{rathore2022pare,
  title={PARE: A Simple and Strong Baseline for Monolingual and Multilingual Distantly Supervised Relation Extraction},
  author={Rathore, Vipul and Badola, Kartikeya and Singla, Parag and others},
  booktitle={Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
  pages={340--354},
  year={2022}
}

Acknowledgements

Our codebase is built upon OpenNRE. For more details on the format of the datasets used, we refer the user to the OpenNRE repository.

For more details on the DiS-ReX dataset, we refer the user to the DiS-ReX pre-print and repository.

