Human Guided Exploitation of Attention Patterns

This repository contains the code for running the experiments in our paper "Human Guided Exploitation of Interpretable Attention Patterns in Summarization and Topic Segmentation", presented at EMNLP 2022.

Human-in-the-loop Pipeline

To run the human-in-the-loop pattern extraction pipeline described in our paper, please check out the GitHub repository for our demo paper "T3-Vis: visual analytic for Training and fine-Tuning Transformers in NLP" (2021).

Data Preprocessing

We preprocess the datasets with the code released for the original papers.

Please refer to https://github.com/nlpyang/PreSumm for summarization and https://github.com/koomri/text-segmentation for topic segmentation.

Training

For example, the following command trains the extractive summarization model on CNN/Daily Mail with a Transformer encoder and BERT embeddings, with the patterns injected in all (12/12) heads (last row of Table 1):

python run_summarization.py -device 0 -encoder_type transformer -pretrained_embedding bert \
                            -dataset CNNDM -config_file ./models/configs/bert_config_6-12.json \
                            -result_dir ./results/cnndm_transformer_6-12_bert-embeddings_patterns_3/ \
                            -batch_size 18 -top_n_ckpts 3 -valid_metric rouge -save_checkpoint_steps 1000 \
                            -log_file log.txt -top_n 3 -heads_per_pattern 3 -train_steps 100000 \
                            -match_token_attn mask -inter_sent_attn mask -positional_attn synthesize

Similarly, to train the topic segmentation model with the same configuration on WikiSection:

python run_segmentation.py -device 0 -encoder transformer -pretrained_embedding bert-base-uncased \
                           -dataset wiki_section -config_file models/configs/bert_config_6-12.json \
                           -ckpt_dir checkpoints/cross_seg_wikisection_transformer_6-12_bert-embeddings_patterns_3 \
                           -results_dir results/cross_seg_wikisection_transformer_6-12_bert-embeddings_patterns_3 \
                           -num_workers 2 -num_epochs 5 -use_patterns -batch_size 32 -heads_per_pattern 3 -grad_accum_steps 2

Evaluation

To evaluate a trained model, run the same command used for training with the additional argument -mode test.
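
As a minimal sketch for the summarization example in the Training section, the snippet below reuses the training flags unchanged and appends -mode test. The shell variable names TRAIN_ARGS and EVAL_CMD are illustrative only and are not part of the repository. It also assumes that -result_dir still points at the directory written during training. The snippet only prints the assembled command so it can be inspected before running.

```shell
# Flags copied verbatim from the summarization training example.
# TRAIN_ARGS and EVAL_CMD are hypothetical names used only in this sketch.
TRAIN_ARGS="-device 0 -encoder_type transformer -pretrained_embedding bert \
-dataset CNNDM -config_file ./models/configs/bert_config_6-12.json \
-result_dir ./results/cnndm_transformer_6-12_bert-embeddings_patterns_3/ \
-batch_size 18 -top_n_ckpts 3 -valid_metric rouge -save_checkpoint_steps 1000 \
-log_file log.txt -top_n 3 -heads_per_pattern 3 -train_steps 100000 \
-match_token_attn mask -inter_sent_attn mask -positional_attn synthesize"

# Evaluation uses the same flags plus the additional argument -mode test.
EVAL_CMD="python run_summarization.py $TRAIN_ARGS -mode test"

# Print the full command; nothing is executed here.
echo "$EVAL_CMD"
```

Run the printed command from the repository root. The topic segmentation command can be adapted the same way by appending -mode test to its flags.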
