Improving Generalization in Coreference Resolution via Adversarial Training

This repository contains the code for reproducing the experiments in the paper "Improving Generalization in Coreference Resolution via Adversarial Training" by Sanjay Subramanian and Dan Roth, published at *SEM 2019.

Requirements

This code was tested using Python 2.7 and Ubuntu 16.04. The requirements.txt file lists the packages and corresponding versions of the Python environment used to run this code. Please follow the Getting Started instructions in https://github.com/kentonl/e2e-coref to download the necessary files (e.g. word embeddings). You will also need to download the checkpoint for the Lee et al. 2018 model and set the lee2018_log_root field in experiments_adv.conf to its path. The adv_checkpoint.zip file is stored with git-lfs, so you may need git-lfs to clone the repository.

Modify paths

Make sure to set the paths in experiments_adv.conf and replace_data.py to be correct for your system. The allCountries.txt and countryInfo.txt files can be downloaded from geonames.org, and the last_names.txt file contains the last names from the 1990 census, which can be downloaded from https://www2.census.gov/topics/genealogy/1990surnames/dist.all.last#.
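As a rough illustration of how a plain list of surnames could be derived from the census download, the sketch below keeps the first whitespace-delimited token of each line. It assumes that each line of the census file starts with the surname followed by numeric columns, and that a one-name-per-line list is what is wanted; neither assumption is confirmed by this README. The function name extract_surnames and the sample lines are hypothetical, and the numbers in the sample are placeholders.

```python
def extract_surnames(lines):
    """Return the first whitespace-delimited token of each non-empty line.

    Assumption (not confirmed by this repository): every line starts with
    the surname, followed by whitespace-separated numeric columns.
    """
    names = []
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            names.append(fields[0])
    return names


# Hypothetical sample in the assumed layout; the numbers are placeholders.
SAMPLE_LINES = [
    "SMITH          1.0  1.0      1",
    "JOHNSON        0.8  1.8      2",
    "",
]

surnames = extract_surnames(SAMPLE_LINES)
```

To apply this to a real download, pass an open file object instead of SAMPLE_LINES, since iterating over a file yields its lines.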

Reproducing Paper Results

First, unzip the adv_checkpoint.zip file to yield the adv_checkpoint directory. To reproduce the results in the paper, run prepare_data.sh and then run_experiments.sh, both with the repository as the working directory. Please note that by default the prepare_data.sh script loads the state of the random number generator that we used to generate replacement names, which enables exact reproducibility of our results. If you would like to generate replacement names at random instead, you need only comment out the relevant line in generate_noleakage.py. The results should match those in the paper: http://cogcomp.org/papers/SubramanianRo19.pdf.
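The idea of reloading a saved random number generator state can be sketched as follows. This is a minimal illustration, not the code in generate_noleakage.py: it assumes Python's built-in random module and a pickled state file, and the helper names, the NAMES list, and the demo file name are all hypothetical.

```python
import os
import pickle
import random
import tempfile

# Hypothetical candidate names, used only for this illustration.
NAMES = ["Garcia", "Nguyen", "Okafor", "Schmidt"]


def save_rng_state(path):
    """Pickle the current state of Python's built-in random module."""
    with open(path, "wb") as f:
        pickle.dump(random.getstate(), f, protocol=2)


def load_rng_state(path):
    """Restore a pickled state so that later draws repeat exactly."""
    with open(path, "rb") as f:
        random.setstate(pickle.load(f))


def draw_names(k):
    """Draw k names with replacement from NAMES."""
    return [random.choice(NAMES) for _ in range(k)]


def demo():
    """Save the state, draw, restore the state, and draw again."""
    state_path = os.path.join(tempfile.mkdtemp(), "rng_state_demo.pkl")
    save_rng_state(state_path)
    first = draw_names(5)
    load_rng_state(state_path)
    second = draw_names(5)
    return first, second  # the two lists are identical
```

Leaving out the load_rng_state call is the analogue of commenting out the state-loading line: the draws then depend on whatever state the generator happens to be in.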

Acknowledgements

Much of the code in this repository is from Kenton Lee's repository https://github.com/kentonl/e2e-coref or is adapted from code in that repository. That code was distributed under an Apache 2.0 license. The firstname-gender-score.txt gazetteer was provided by Sihao Chen.

Citation

If you use this work in your research, please cite our paper:

@inproceedings{SubramanianRo19,
    author = {Sanjay Subramanian and Dan Roth},
    title = {{Improving Generalization in Coreference Resolution via Adversarial Training}},
    booktitle = {Proc. of the Joint Conference on Lexical and Computational Semantics},
    month = {6},
    year = {2019},
    url = "http://cogcomp.org/papers/SubramanianRo19.pdf",
}
