
Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems

We annotated a dialogue dataset, User Satisfaction Simulation (USS), containing 6,800 dialogues. All user utterances in these dialogues, as well as the dialogues themselves, are labeled on a 5-level satisfaction scale. See dataset.

These resources are developed within the following paper:

Weiwei Sun, Shuo Zhang, Krisztian Balog, Zhaochun Ren, Pengjie Ren, Zhumin Chen, and Maarten de Rijke. "Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems". In Proceedings of SIGIR 2021. (Paper link)

Data

The dataset (see dataset) is provided in TXT format, where the fields within each line are separated by "\t":

  • speaker role (USER or SYSTEM),
  • text,
  • action,
  • satisfaction (repeated annotations are separated by ","),
  • explanation text (only for JDDC at the dialogue level; repeated annotations are separated by ";")

Sessions are separated by blank lines.
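For concreteness, here is a minimal Python sketch of a parser for this format. It is an illustration only: the padding of missing trailing fields and the handling of empty satisfaction fields are assumptions based on the description above, not guarantees about the released files.

```python
def load_sessions(path):
    """Parse a USS data file into a list of sessions.

    Assumed format (from the description above): tab-separated fields
    per line -- speaker role, text, action, satisfaction (repeated
    annotations separated by ","), and an optional explanation
    (JDDC, dialogue level only). Sessions are separated by blank lines.
    """
    sessions, turns = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line.strip():
                # Blank line ends the current session.
                if turns:
                    sessions.append(turns)
                    turns = []
                continue
            fields = line.split("\t")
            # Pad missing trailing fields (explanation is JDDC-only).
            fields += [""] * (5 - len(fields))
            role, text, action, satisfaction, explanation = fields[:5]
            ratings = [int(r) for r in satisfaction.split(",") if r.strip()]
            turns.append({
                "role": role,
                "text": text,
                "action": action,
                "ratings": ratings,
                "explanation": explanation,
            })
    if turns:
        sessions.append(turns)
    return sessions
```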

Since the original ReDial dataset does not provide actions, we use the action annotations provided by IARD and include them in ReDial-action.txt.

The JDDC dataset provides the action of each user utterance, with 234 categories. We compress these into 12 categories based on a manually defined classification method (see JDDC-ActionList.txt).

Data Statistics

The USS dataset is based on five benchmark task-oriented dialogue datasets: JDDC, Schema Guided Dialogue (SGD), MultiWOZ 2.1, Recommendation Dialogues (ReDial), and Coached Conversational Preference Elicitation (CCPE).

| Domain      | JDDC    | SGD     | MultiWOZ | ReDial  | CCPE    |
|-------------|---------|---------|----------|---------|---------|
| Language    | Chinese | English | English  | English | English |
| #Dialogues  | 3,300   | 1,000   | 1,000    | 1,000   | 500     |
| Avg. #Turns | 32.3    | 26.7    | 23.1     | 22.5    | 24.9    |
| #Utterances | 54,517  | 13,833  | 12,553   | 11,806  | 6,860   |
| Rating 1    | 120     | 5       | 12       | 20      | 10      |
| Rating 2    | 4,820   | 769     | 725      | 720     | 1,472   |
| Rating 3    | 45,005  | 11,515  | 11,141   | 9,623   | 5,315   |
| Rating 4    | 4,151   | 1,494   | 669      | 1,490   | 59      |
| Rating 5    | 421     | 50      | 6        | 34      | 4       |
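As an illustration of how these counts relate to the files, the sketch below tallies user turns by rating, reusing the `load_sessions` helper sketched in the Data section. The majority-vote aggregation of repeated annotations is an assumption for illustration; the repository does not specify the aggregation used for this table.

```python
from collections import Counter

def rating_distribution(path):
    """Count user turns per rating level.

    Repeated annotations for a turn are aggregated by majority vote
    (an assumed scheme; Counter.most_common breaks ties arbitrarily).
    """
    counts = Counter()
    for session in load_sessions(path):
        for turn in session:
            if turn["role"] == "USER" and turn["ratings"]:
                majority = Counter(turn["ratings"]).most_common(1)[0][0]
                counts[majority] += 1
    return counts

# Hypothetical usage; the actual file names under dataset/ may differ.
# print(rating_distribution("dataset/MultiWOZ.txt"))
```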

Baselines

The code for reproducing the baselines can be found in /baselines.

Performance for user satisfaction prediction. Bold face indicates the best result in terms of the corresponding metric. Underline indicates comparable results to the best one.

Performance for user action prediction. Bold face indicates the best result in terms of the corresponding metric. Underline indicates comparable results to the best one.

Cite

@inproceedings{Sun:2021:SUS,
  author =    {Sun, Weiwei and Zhang, Shuo and Balog, Krisztian and Ren, Zhaochun and Ren, Pengjie and Chen, Zhumin and de Rijke, Maarten},
  title =     {Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems},
  booktitle = {Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  series =    {SIGIR '21},
  year =      {2021},
  publisher = {ACM}
}

Contact

If you have any questions, please contact sunnweiwei@gmail.com.
