Exploiting the Global WordNet Graph in Neural Word Sense Disambiguation by Integrating Personalized PageRank
This repo hosts the code necessary to reproduce the results of our EMNLP 2021 paper, Exploiting the Global WordNet Graph in Neural Word Sense Disambiguation by Integrating Personalized PageRank, by Ahmed ElSheikh, Michele Bevilacqua and Roberto Navigli, which you can read in the ACL Anthology.
This repository relies on Simone's code.
```bibtex
@inproceedings{elsheikh-bevilacqua-navigli-2020-breaking,
    title = "Exploiting the Global WordNet Graph in Neural Word Sense Disambiguation by Integrating Personalized PageRank",
    author = "ElSheikh, Ahmed and Bevilacqua, Michele and Navigli, Roberto",
    year = "2021",
    address = "Online",
    publisher = "Empirical Methods in Natural Language Processing",
}
```
Neural Word Sense Disambiguation (WSD) has recently been shown to benefit from the incorporation of pre-existing knowledge, such as that coming from the WordNet graph. However, state-of-the-art approaches have been successful in exploiting only the local structure of the graph, with only close neighbors of a given synset influencing the prediction. In this work, we improve a classification model by recomputing logits as a function of both the vanilla independently produced logits and the global WordNet graph. We achieve this by incorporating an online neural approximated PageRank, which enables us to refine edge weights as well. This method allows us to exploit the global graph structure while keeping space requirements linear in the number of edges. We obtain strong improvements, matching the current state of the art.
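The propagation mechanism described above builds on Personalized PageRank. As a rough illustration only (a dense power-iteration sketch on a toy graph, not the paper's online neural approximation, and with made-up edge weights):

```python
import numpy as np

def personalized_pagerank(adj, personalization, alpha=0.85, iters=50):
    """Power iteration for Personalized PageRank on a dense adjacency matrix.

    adj: (n, n) nonnegative edge weights, adj[i, j] = weight of edge i -> j.
    personalization: (n,) restart distribution (e.g. one-hot on target nodes).
    """
    # Row-normalize outgoing weights, then transpose to get a
    # column-stochastic transition matrix.
    out = adj.sum(axis=1, keepdims=True)
    out[out == 0] = 1.0  # guard against dangling nodes
    transition = (adj / out).T

    p = personalization / personalization.sum()
    scores = p.copy()
    for _ in range(iters):
        # With prob. alpha follow an edge, with prob. (1 - alpha) restart at p.
        scores = alpha * (transition @ scores) + (1 - alpha) * p
    return scores

# Toy 4-node graph: 0 -> 1, 1 -> {0, 2}, 2 -> 3, 3 -> 0.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 0, 0, 1],
                [1, 0, 0, 0]], dtype=float)
restart = np.array([1.0, 0.0, 0.0, 0.0])  # personalize on node 0
scores = personalized_pagerank(adj, restart)
print(scores)  # probability mass concentrates on node 0 and its neighborhood
```

In the paper's setting the graph is the full WordNet synset graph and the restart distribution comes from the classifier's logits, so the stationary scores let distant synsets influence the final prediction rather than only close neighbors.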
- Make sure miniconda is installed; if not, install it first.
- It is recommended to create a fresh `conda` env to use the repo:

```shell
conda create -n ewiser_ext python=3.6.9 pip
conda activate ewiser_ext
git clone https://github.com/elsheikh21/nlp_thesis.git
pip install -r requirements.txt
pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install torch-sparse torch-scatter -f https://pytorch-geometric.com/whl/torch-1.5.0+cu101.html
```
- If APEX needs to be installed:

```shell
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```
- SemCor
- SemCor + untagged glosses
- SemCor + tagged glosses + WordNet Examples
- WSD Evaluation Framework: contains the SemCor training corpus, along with the evaluation datasets from Senseval and SemEval.
Preprocessed SensEmBERT+LMMS embeddings are needed to train your model.
- Edit `predict_eval_script.sh` to add your `<ckpt_dir>/<best_mdl_path>`.
- Then run the following:

```shell
cd wsd_thesis
sh predict_eval_script.sh
# OR, to log the results:
nohup sh predict_eval_script.sh > eval_script.out
```
- All flags related to training & model params can be found in `train.py` & `wsd/models/model.py`.
- Run the following script:

```shell
cd yat_thesis
sh train.sh
```

- Or, to run it in the background:

```shell
nohup sh train.sh > experiment_name.out &
```
This project is released under the CC-BY-NC 4.0 license (see LICENSE.txt). If you use EWISER, please link to this repo.
The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 under the European Union's Horizon 2020 research and innovation programme.
This work was supported in part by the MIUR under the grant "Dipartimenti di eccellenza 2018-2022" of the Department of Computer Science of the Sapienza University of Rome.