# Rewarding Coreference Resolvers for Being Consistent with World Knowledge

In Proceedings of EMNLP-IJCNLP 2019.

## Datasets

For convenience, create a symlink: `cd e2e-coref && ln -s ../wiki ./wiki`
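For example, from the project root (a minimal sketch; the `-f` flag is only there to make the link creation idempotent):

```bash
cd e2e-coref
ln -sfn ../wiki ./wiki   # e2e-coref/wiki now points at the top-level wiki/ directory
cd ..
```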

For pre-training the coreference resolution system, OntoNotes 5.0 is required. [Download] [Create splits]

Data for training the reward models and fine-tuning the coreference resolver (place in `<PROJECT_HOME>/data`):

**Note:** If you want to make these files from scratch, follow the instructions in the `triples` folder.
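The two directories this README references need to exist under the project root; a minimal sketch in shell:

```bash
# Run from <PROJECT_HOME>. Creates the folders referenced in this README:
#   data/        reward-model and fine-tuning data (this section)
#   embeddings/  PyTorch-BigGraph embeddings (see Training below)
mkdir -p data embeddings
```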

## Pre-trained models

- Best-performing reward model (RE-Distill) [Download]
- Best-performing coreference resolver (Coref-Distill) [Download]

## Evaluation

Unzip Coref-Distill into the `e2e-coref/logs` folder and run `GPU=x python evaluate.py final`.
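For instance (the archive name here is hypothetical; use whatever the downloaded file is actually called):

```bash
unzip coref-distill.zip -d e2e-coref/logs/   # hypothetical archive name
cd e2e-coref
GPU=0 python evaluate.py final               # GPU=0 runs on the first GPU
```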

## Training

### Reward models

- Download the PyTorch-BigGraph embeddings (~40 GB; place in `<PROJECT_HOME>/embeddings`) [Download]
- Run `wiki/embs.py` to create an index of the embeddings (this only needs to be done once)
- Run the reward-model training with `cd wiki/reward && python train.py <dataset-name>`, as sketched below
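Putting the steps together (assuming the embeddings are already in `<PROJECT_HOME>/embeddings`, and `<dataset-name>` is one of the datasets placed in `data/`):

```bash
python wiki/embs.py              # one-time: build the embedding index
cd wiki/reward
python train.py <dataset-name>   # a dataset placed in <PROJECT_HOME>/data
```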

### Coreference resolver

#### Pre-training

- Follow `e2e-coref/README.md` to set up the environment, create the ELMo embeddings, etc.
- Run coreference pre-training with `cd e2e-coref && GPU=x python train.py <experiment>`; a minimal invocation follows
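A minimal pre-training invocation, assuming the `final` experiment from `e2e-coref/experiments.conf` (the same name used for evaluation above):

```bash
cd e2e-coref
GPU=0 python train.py final   # GPU=x picks the device; "final" is the experiment name
```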
#### Fine-tuning

- Start the SLING server with `python wiki/reward/sling_server.py`
- Change `SLING_IP` in `wiki/reward/reward.py` to the IP of the SLING server
- Run coreference fine-tuning with `cd e2e-coref && GPU=x python finetune.py <experiment>` (see `e2e-coref/experiments.conf` for the different configurations); a sketch of the full workflow follows
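A sketch of the fine-tuning workflow across two shells (`<experiment>` is a placeholder; pick a configuration from `e2e-coref/experiments.conf`):

```bash
# Shell 1: serve SLING annotations for the reward computation
python wiki/reward/sling_server.py

# Shell 2: fine-tune (SLING_IP in wiki/reward/reward.py must point at shell 1's host)
cd e2e-coref
GPU=0 python finetune.py <experiment>
```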

## Misc

- `wiki/reward/combine_models.py` can be used to distill the various reward models
- `e2e-coref/save_weights.py` can be used to save the weights of the fine-tuned coreference models so that they can be combined by setting the `distill` flag in the configuration file (see the sketch after this list)
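A rough sketch of the distillation round trip; both command-line interfaces below are assumptions, so check each script for its actual arguments:

```bash
# Both invocations are assumptions; check each script for its real CLI.
python wiki/reward/combine_models.py           # distill the reward models into one
python e2e-coref/save_weights.py <experiment>  # argument assumed; exports fine-tuned weights
# then set the distill flag in e2e-coref/experiments.conf and re-run training
```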

## Citation

```bibtex
@inproceedings{aralikatte-etal-2019-rewarding,
    title = "Rewarding Coreference Resolvers for Being Consistent with World Knowledge",
    author = "Aralikatte, Rahul  and
      Lent, Heather  and
      Gonzalez, Ana Valeria  and
      Hershcovich, Daniel  and
      Qiu, Chen  and
      Sandholm, Anders  and
      Ringgaard, Michael  and
      S{\o}gaard, Anders",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1118",
    doi = "10.18653/v1/D19-1118",
    pages = "1229--1235"
}
```