This repository provides the implementation for the paper Distantly-Supervised Long-Tailed Relation Extraction Using Constraint Graphs.
Please cite our paper if our datasets or code is helpful to you ~ 😊
@article{liang2022distantly,
title={Distantly-Supervised Long-Tailed Relation Extraction Using Constraint Graphs},
author={Liang, Tianming and Liu, Yang and Liu, Xiaoyan and Sharma, Gaurav and Guo, Maozu},
journal={IEEE Transactions on Knowledge and Data Engineering},
year={2022},
publisher={IEEE}
}
- Python 3.6+
- PyTorch 1.4.0+
- PyTorch Geometric (see https://github.com/rusty1s/pytorch_geometric for details)
- Transformers
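Before training, it can help to confirm that the packages listed above are importable. The snippet below is a minimal sketch rather than part of this repository: the helper name `missing_modules` is hypothetical, and it only checks that the modules can be found, not that their versions meet the minimums listed above.

```python
import importlib.util
import sys

# Import names of the listed requirements (PyTorch, PyTorch Geometric, Transformers).
REQUIRED_MODULES = ("torch", "torch_geometric", "transformers")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 6):
        print("Python 3.6+ is required")
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```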
We provide three processed datasets: NYT-520K, NYT-570K and GDS. Download the datasets and pretrained word embeddings from here, and extract them into data/.
Vanilla CGRE consists of PCNN and GCN, but we also provide other choices of backbone models: CNN, PCNN and Bert for sentence encoding, and GCN, GAT and SAGE for graph encoding.
For example, you can train and evaluate CNN+GAT on NYT-520K with the following commands:
python train.py --config 520K_CNN_GAT.yaml
and
python eval.py --config 520K_CNN_GAT.yaml
Please see the configuration files in config/ for more options.
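If you want to run training and evaluation back to back, the two commands above can be chained from a small script. This is only a sketch: `build_commands` and `run_pipeline` are hypothetical helpers, not part of this repository, and the script assumes it is started from the repository root where train.py and eval.py live.

```python
import subprocess
import sys


def build_commands(config_name):
    """Return the train and eval commands for one config file, in order."""
    return [
        [sys.executable, "train.py", "--config", config_name],
        [sys.executable, "eval.py", "--config", config_name],
    ]


def run_pipeline(config_name, dry_run=True):
    """Print each command; when dry_run is False, also execute it."""
    for cmd in build_commands(config_name):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so evaluation never starts after a failed training run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the two commands.
    run_pipeline("520K_CNN_GAT.yaml", dry_run=True)
```

Pass `dry_run=False` to actually launch both steps.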
The PR curves from our paper are stored in Curves/.
[
{
"text": "he is a son of vera and william lichtenberg of belle_harbor , queens .",
"sub": {"id": "m.0ccvx", "name": "queens", "type": "GPE"},
"obj": {"id": "m.05gf08", "name": "belle_harbor", "type": "GPE"},
"rel": "/location/location/contains"
},
...
]
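To illustrate how an instance in the format above can be read, here is a minimal sketch that parses the sample and locates both entities in the sentence. It assumes whitespace tokenization and that each entity name appears verbatim in the text; the preprocessing in this repository may differ, and `entity_span` and `describe` are hypothetical helpers.

```python
import json

# The sample instance shown above, embedded as a JSON string.
SAMPLE = """
[
  {
    "text": "he is a son of vera and william lichtenberg of belle_harbor , queens .",
    "sub": {"id": "m.0ccvx", "name": "queens", "type": "GPE"},
    "obj": {"id": "m.05gf08", "name": "belle_harbor", "type": "GPE"},
    "rel": "/location/location/contains"
  }
]
"""


def entity_span(tokens, name):
    """Return (start, end) token indices of the first match of name, or None."""
    target = name.split()
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            return i, i + len(target)
    return None


def describe(instance):
    """Collect the tokens, entity spans, entity types and relation label."""
    tokens = instance["text"].split()  # assumption: whitespace tokenization
    return {
        "tokens": tokens,
        "sub_span": entity_span(tokens, instance["sub"]["name"]),
        "obj_span": entity_span(tokens, instance["obj"]["name"]),
        "types": (instance["sub"]["type"], instance["obj"]["type"]),
        "rel": instance["rel"],
    }


instances = json.loads(SAMPLE)
info = describe(instances[0])
print(info["sub_span"], info["obj_span"], info["rel"])
```

The end index of each span is exclusive, so `tokens[start:end]` recovers the entity mention.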
{
"NA": 0,
"/location/neighborhood/neighborhood_of": 1,
...
}
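A mapping like the one above is typically used to turn relation names into integer labels and back. The sketch below uses only the two entries shown (the elided ones are left out), and falling back to NA for an unknown relation is an assumption, not necessarily what this repository does.

```python
# Only the two entries shown above; the real mapping contains more relations.
REL2ID = {
    "NA": 0,
    "/location/neighborhood/neighborhood_of": 1,
}

# Inverse mapping, useful for turning predicted ids back into relation names.
ID2REL = {idx: rel for rel, idx in REL2ID.items()}


def encode_relation(rel, rel2id=REL2ID):
    """Map a relation name to its id; unknown names fall back to NA (assumption)."""
    return rel2id.get(rel, rel2id["NA"])


print(encode_relation("/location/neighborhood/neighborhood_of"))
print(ID2REL[0])
```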
{
"NONE": 0,
"CARDINAL": 1,
...
}
{
relation_1: [[head_type_1, tail_type_1], [head_type_2, tail_type_2], ...],
...
}
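The three mappings above can be combined into a graph over entity types and relations. The following is a rough sketch under stated assumptions, not the code used in this repository: it assumes each constraint contributes the directed edges head type → relation and relation → tail type, that relation nodes are numbered after the type nodes, and it uses made-up ids for GPE and /location/location/contains. `build_edges` is a hypothetical helper.

```python
def build_edges(constraints, rel2id, type2id):
    """Turn {relation: [[head_type, tail_type], ...]} into a sorted edge list.

    Type nodes keep their type ids; relation nodes are offset by len(type2id)
    so that both kinds of node share one id space (an assumption of this sketch).
    """
    offset = len(type2id)
    edges = set()  # a set removes duplicate edges shared by several constraints
    for rel, pairs in constraints.items():
        rel_node = offset + rel2id[rel]
        for head_type, tail_type in pairs:
            edges.add((type2id[head_type], rel_node))
            edges.add((rel_node, type2id[tail_type]))
    return sorted(edges)


# Toy inputs: the ids of "GPE" and "/location/location/contains" are hypothetical.
type2id = {"NONE": 0, "CARDINAL": 1, "GPE": 2}
rel2id = {
    "NA": 0,
    "/location/neighborhood/neighborhood_of": 1,
    "/location/location/contains": 2,
}
constraints = {"/location/location/contains": [["GPE", "GPE"]]}

edges = build_edges(constraints, rel2id, type2id)

# Transpose into two rows (sources, targets), the 2 x num_edges layout
# that PyTorch Geometric expects for edge_index.
edge_index = [list(row) for row in zip(*edges)]
print(edges)
print(edge_index)
```

Keeping the result as plain lists avoids a PyTorch dependency in the example; wrapping `edge_index` in a long tensor would be the natural next step before passing it to a graph encoder.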