
Text Sum Uncertainty

Code for "Understanding Neural Abstractive Summarization Models via Uncertainty" (EMNLP20, short)

The arXiv preprint is available here.

Authors: Jiacheng Xu, Shrey Desai, and Greg Durrett (TAUR Lab, UT Austin)

Contact: jcxu at utexas.edu

About

In this work,

  • We analyze summarization decoders by studying the entropy, i.e., the uncertainty, of the model's token-level predictions.
  • Models examined: PEGASUS (paper, model) and BART (paper, model)
  • Datasets covered: CNN/DM and XSum
  • Quick start with models loaded directly from huggingface.co/transformers (see the sketch below this list)
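For a quick start, the checkpoints analyzed in the paper can be pulled straight from huggingface.co/transformers. Below is a minimal sketch (not part of this repo, and using the generic Auto* classes rather than the repo's own driver script):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/pegasus-xsum"  # or "facebook/bart-large-cnn", "facebook/bart-large-xsum"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "Replace this string with a source document."
inputs = tokenizer(article, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=30)  # 30 for XSum, 80 for CNN/DM
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```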

With the help of the methods we developed, we further investigate:

  • The correlation between prediction entropy and model behaviors such as COPY and GEN (Sec. 3); a minimal sketch of the entropy computation follows this list
  • How sentence position relates to prediction entropy (Sec. 3)
  • Model behavior in different syntactic environments (Sec. 4)
  • Coarse properties of attention and how they correlate with the model's predictions (Sec. 5)
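As a rough illustration of the quantity being studied (a generic sketch, not the repo's exact implementation), the token-level prediction entropy can be computed from decoder logits like this:

```python
import torch
import torch.nn.functional as F

def token_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy (in nats) of the next-token distribution at each decoding step.

    logits: (num_steps, vocab_size) decoder logits
    returns: (num_steps,) entropy per step
    """
    log_probs = F.log_softmax(logits, dim=-1)
    return -(log_probs.exp() * log_probs).sum(dim=-1)
```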

Configuration

Hyperparameters

In util.py, the function parse_arg defines all of the hyperparameters used in this project.

| Param | Usage |
| --- | --- |
| prob_meta_dir | Directory where the model outputs are saved. |
| max_len | Maximum decoding length. Set to 30 for XSum and 80 for CNN/DM. |
| device | Device name for PyTorch. |
| nuc_prob | Nucleus sampling probability threshold. Default: 0.95. |
| trunc_prob | Truncate the probability distribution to the nucleus (default; used in all of our experiments). See the sketch below this table. |
| full_prob | Use the original, untruncated probability distribution. |
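To make nuc_prob, trunc_prob, and full_prob concrete, here is a hedged sketch of nucleus truncation; the function name and details are illustrative, not the project's exact code (see util.py and the decoding code for that):

```python
import torch

def truncate_to_nucleus(probs: torch.Tensor, nuc_prob: float = 0.95) -> torch.Tensor:
    """Keep the smallest set of top tokens whose mass reaches nuc_prob, then renormalize.

    probs: (vocab_size,) next-token probability distribution
    """
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # keep every token whose preceding cumulative mass is still below the threshold,
    # so at least one token is always retained
    keep = (cumulative - sorted_probs) < nuc_prob
    mask = torch.zeros_like(probs, dtype=torch.bool)
    mask[sorted_idx[keep]] = True
    truncated = torch.where(mask, probs, torch.zeros_like(probs))
    return truncated / truncated.sum()
```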

Existing Configuration

To run the model, run python run_model_pegasus.py with one of the following parameter configurations (a rough stand-alone alternative using only transformers is sketched after the table).

| Config Name | Parameters |
| --- | --- |
| run_model_pegasus_cnndm | --full_data |
| run_model_pegasus_xsum | --full_data --model_name google/pegasus-xsum --data_name xsum |
| run_model_bart_cnndm | --full_data --model_name facebook/bart-large-cnn |
| run_model_bart_xsum | --full_data --model_name facebook/bart-large-xsum --data_name xsum |
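If you only want to peek at per-step token distributions for one of the checkpoints above, a rough stand-alone equivalent using the transformers generate API looks like this (illustrative only; not the repo's own decoding code):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "facebook/bart-large-xsum"  # any checkpoint from the table above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Replace this string with a source document.",
                   return_tensors="pt", truncation=True)
out = model.generate(
    **inputs,
    max_length=30,                 # max_len: 30 for XSum, 80 for CNN/DM
    return_dict_in_generate=True,  # return a structured generation output
    output_scores=True,            # keep the per-step scores
)
# out.scores is a tuple with one (batch * num_beams, vocab_size) tensor per decoded step
step_distributions = [torch.softmax(s, dim=-1) for s in out.scores]
```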

The class SumGen in run_model_pegasus.py implements the core decoding logic.
