Data Augmentation for Abstractive Query-Focused Multi-Document Summarization (AAAI 2021)

This is the implementation of the paper Data Augmentation for Abstractive Query-Focused Multi-Document Summarization.

Prerequisites

Python 3.6+
[PyTorch 1.0](http://pytorch.org/)
Install all the required packages from the requirements.txt file:
```
pip install -r requirements.txt
```
Download the processed datasets (in PyTorch format) and set up the required folders and repos by running the following command:
```
python setup.py
```
If you face any issues downloading the datasets with the above command (setup.py), download them directly from here: wikisum, wikisum-query, qmds-cnn, qmds-cnn-query. Then run the setup script with the following argument to set up everything except the datasets:
```
python setup.py --ignore_datasets
```
If you face any issues running the ROUGE evaluation, check out this link.
Some code is borrowed from hiersumm and ONMT.

Usage

To train the model:

DATASET=[CNNDM/WIKI] MODEL_TYPE=[hier/he/order/query/heq/heo/hero] bash run_experiments.sh

To test the model:

DATASET=[CNNDM/WIKI] MODEL_TYPE=[hier/he/order/query/heq/heo/hero] bash test.sh
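
The two commands above differ only in the script they call, so they can be wrapped in a small launcher. The following is a minimal sketch, not part of the repository: the helper names `build_launch` and `launch` are hypothetical, and it assumes the scripts read DATASET and MODEL_TYPE from the environment, as the command lines above suggest.

```python
import os
import subprocess

# Values listed in this README for the two variables.
DATASETS = {"CNNDM", "WIKI"}
MODEL_TYPES = {"hier", "he", "order", "query", "heq", "heo", "hero"}
# Script names as given in the Usage section.
SCRIPTS = {"train": "run_experiments.sh", "test": "test.sh"}


def build_launch(mode, dataset, model_type, extra_env=None):
    """Return (command, environment) for a run without executing it."""
    if mode not in SCRIPTS:
        raise ValueError(f"mode must be one of {sorted(SCRIPTS)}")
    if dataset not in DATASETS:
        raise ValueError(f"DATASET must be one of {sorted(DATASETS)}")
    if model_type not in MODEL_TYPES:
        raise ValueError(f"MODEL_TYPE must be one of {sorted(MODEL_TYPES)}")
    env = dict(os.environ)  # keep PATH etc. so bash can be found
    env.update({"DATASET": dataset, "MODEL_TYPE": model_type})
    env.update(extra_env or {})  # optional overrides, passed as strings
    return ["bash", SCRIPTS[mode]], env


def launch(mode, dataset, model_type, extra_env=None):
    """Run the chosen script from the repository root (hypothetical helper)."""
    cmd, env = build_launch(mode, dataset, model_type, extra_env)
    return subprocess.run(cmd, env=env, check=True)
```

For example, `launch("train", "WIKI", "hero")` would be equivalent to running `DATASET=WIKI MODEL_TYPE=hero bash run_experiments.sh` from the repository root.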

A few points to note:

Various model types (MODEL_TYPE):
- hier: Baseline model (Hierarchical Transformers)
- he: HS w/ Hierarchical Encodings
- order: HS w/ Ordering Component
- query: HS w/ Query Encoding
- heq: HS-Joint Model (Hierarchical Encodings + Query Encoding)
- heo: HS-Joint Model (Hierarchical Encodings + Ordering Component)
- hero: HS-Joint Model (all three components combined)
We tested our models on NVIDIA P100 GPUs (16 GB each). Each experiment uses 4 GPUs. If you have fewer GPUs or less memory, set BATCH_SIZE, VISIBLE_GPUS, and ACCUM_COUNT accordingly.
The data, vocab, and model paths are set to default locations. Set these variables if you want to use different paths.
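
When moving to fewer GPUs, one common approach is to raise ACCUM_COUNT so that the effective batch per optimizer step stays the same. The sketch below illustrates that arithmetic only. It assumes the usual gradient-accumulation convention (effective batch = BATCH_SIZE × number of GPUs × ACCUM_COUNT), which this README does not confirm for these scripts; the function name `rescale_accum_count` is hypothetical and the numbers in the usage note are placeholders, not the repository's defaults.

```python
import math


def rescale_accum_count(ref_gpus, ref_accum_count, available_gpus,
                        ref_batch_size=1, batch_size=1):
    """Return the accumulation count that preserves the effective batch.

    Assumes effective batch = batch_size * gpus * accum_count, which is an
    assumption about the training scripts, not a documented fact.
    """
    if min(ref_gpus, ref_accum_count, available_gpus,
           ref_batch_size, batch_size) < 1:
        raise ValueError("all arguments must be positive")
    ref_effective = ref_batch_size * ref_gpus * ref_accum_count
    per_step = batch_size * available_gpus
    # Round up so the effective batch never falls below the reference.
    return math.ceil(ref_effective / per_step)
```

For instance, if a 4-GPU run used a placeholder accumulation count of 4, then `rescale_accum_count(4, 4, 2)` gives the count to use on 2 GPUs with the same per-GPU batch size. If memory also forces a smaller BATCH_SIZE, pass both batch sizes so the count grows to compensate.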

Reference

If you find this code helpful, please consider citing the following paper:

@inproceedings{pasunuru2021data,
    title={Data Augmentation for Abstractive Query-Focused Multi-Document Summarization},
    author={Pasunuru, Ramakanth and Celikyilmaz, Asli and Galley, Michel and Xiong, Chenyan and Zhang, Yizhe and Bansal, Mohit and Gao, Jianfeng},
    booktitle={AAAI},
    year={2021}
}
