TalSchuster/TokenMasker

TokenMasker

This repository contains the code for the masker module from the paper Automatic Fact-guided Sentence Modification (AAAI 2020).

The multiple-encoder pointer-generator is available here.

Description

The goal of the masker is to find the minimal group of tokens that can be removed from a sentence in order to change its relation to another sentence. For example, given a claim and an evidence sentence, it finds the words to delete from the evidence so that the evidence becomes neutral with respect to the claim. Neutrality is determined by a pretrained classifier.

An example ($ marks a masked token):

  • Claim: Eddie Vedder sings.
  • Evidence: He is known for his powerful baritone vocals.
  • Model's output: He is known for $ powerful $ $.
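The objective described above can be illustrated with a small sketch. It is not the repository's method: the repository trains a masker model, whereas this sketch searches token subsets exhaustively, which is only feasible for very short sentences. The functions `apply_mask`, `minimal_mask`, and `toy_is_neutral` are hypothetical names, and `toy_is_neutral` is a hand-written stand-in for the pretrained neutrality classifier, chosen so that it reproduces the example above.

```python
from itertools import combinations

MASK = "$"  # symbol used for a masked token in the example above


def apply_mask(tokens, masked_positions):
    """Replace the tokens at the given positions with the mask symbol."""
    return [MASK if i in masked_positions else tok for i, tok in enumerate(tokens)]


def minimal_mask(tokens, is_neutral):
    """Return the smallest set of positions whose masking satisfies is_neutral.

    Exhaustive search, smallest subsets first (exponential in sentence length).
    Returns None if no subset works.
    """
    for size in range(len(tokens) + 1):
        for positions in combinations(range(len(tokens)), size):
            if is_neutral(apply_mask(tokens, set(positions))):
                return set(positions)
    return None


# Hypothetical stand-in for the pretrained classifier: the evidence counts as
# neutral toward the claim "Eddie Vedder sings." once no cue word remains.
SINGING_CUES = {"his", "baritone", "vocals"}


def toy_is_neutral(tokens):
    return not any(tok.lower() in SINGING_CUES for tok in tokens)


evidence = "He is known for his powerful baritone vocals .".split()
positions = minimal_mask(evidence, toy_is_neutral)
masked_sentence = " ".join(apply_mask(evidence, positions))
print(masked_sentence)  # He is known for $ powerful $ $ .
```

The search tries smaller subsets first, so the first subset that satisfies the classifier is also a minimal one.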

Illustration of the model:

[Figure: mask gen]

Setup

You'll need allennlp (version 0.8.3):

pip install -r requirements.txt
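After installing, a quick check along the following lines can confirm that the expected allennlp version is the one in the environment. This is only a sketch: `installed_version` is a hypothetical helper, and it falls back to `pkg_resources` on Python versions that lack `importlib.metadata`.

```python
def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None

    if metadata is not None:
        try:
            return metadata.version(package)
        except metadata.PackageNotFoundError:
            return None

    # Fallback for older interpreters (requires setuptools).
    import pkg_resources
    try:
        return pkg_resources.get_distribution(package).version
    except pkg_resources.DistributionNotFound:
        return None


EXPECTED = "0.8.3"  # version named in this README
found = installed_version("allennlp")
if found == EXPECTED:
    print("allennlp", found, "is installed")
else:
    print("expected allennlp", EXPECTED, "but found", found)
```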

Training

Neutrality classifier

Note: to train a masker with our pretrained neutrality classifier, skip to the next step (the config file already contains the path to our trained model).

allennlp train allen_configs/esim_fever_wmask.jsonnet -s trained_neutrality_classifier --include-package masker_allen_pkg

Masker

allennlp train allen_configs/mask_generator.jsonnet -s trained_mask_generator --include-package masker_allen_pkg

Extracting masks

Trained model

To get the trained masker model and the preprocessed FEVER training data:

wget https://www.dropbox.com/s/do5jptwmgroencn/model.tar.gz
wget https://www.dropbox.com/s/o53i6urucny7q03/fever.train_no_nei.tokenized.jsonl

Command

To create masks for the data (add -c to use a GPU):

python model_predictions.py -f model.tar.gz \
-i fever.train_no_nei.tokenized.jsonl \
-out predictions/fever_train_no_nei.jsonl
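To look at the result, a small reader like the one below can print the first few records. It assumes the output is in JSON Lines format (one JSON value per line), as the .jsonl extension suggests; the field names are not documented here, so the sketch only lists whatever keys it finds. `read_jsonl` is a hypothetical helper, not part of this repository.

```python
import json
import os


def read_jsonl(path, limit=None):
    """Read up to `limit` records from a JSON Lines file (all records if None)."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            records.append(json.loads(line))
            if limit is not None and len(records) >= limit:
                break
    return records


if __name__ == "__main__":
    # Path taken from the command above; nothing is printed if it does not exist yet.
    path = "predictions/fever_train_no_nei.jsonl"
    if os.path.exists(path):
        for record in read_jsonl(path, limit=3):
            # Show the keys only, since the schema is an unknown here.
            print(sorted(record.keys()) if isinstance(record, dict) else record)
```

The same helper can be pointed at fever.train_no_nei.tokenized.jsonl to inspect the input records.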

Citation

If you find this repository helpful, please cite our paper:

@inproceedings{shah2020automatic,
  title={Automatic Fact-guided Sentence Modification},
  author={Darsh J Shah and Tal Schuster and Regina Barzilay},
  booktitle={Association for the Advancement of Artificial Intelligence ({AAAI})},
  year={2020},
  url={https://arxiv.org/pdf/1909.13838.pdf}
}
