Conspiracy Theory Memorization

This repository contains the dataset used in our paper: Investigating Memorization of Conspiracy Theories in Text Generation.

When using our dataset, please cite our paper:

@article{conspiracymem,
  author    = {Sharon Levy and
               Michael Saxon and
               William Yang Wang},
  title     = {Investigating Memorization of Conspiracy Theories in Text Generation},
  journal   = {CoRR},
  volume    = {abs/2101.00379},
  year      = {2021},
  url       = {https://arxiv.org/abs/2101.00379},
}

Dataset

The file Wikipedia_topics.txt lists popular conspiracy theory topics drawn from Wikipedia's category list. Our process for selecting these topics is described in our paper.

The other three files contain conspiracy theories generated by GPT-2 Large from the text prompt "The conspiracy theory is that". Each file corresponds to one temperature setting used during generation (0.4, 0.7, or 1).
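For readers unfamiliar with the temperature setting, the sketch below illustrates the standard technique: the model's raw next-token scores are divided by the temperature before the softmax, so lower values concentrate probability on the top-scoring token and a value of 1 leaves the distribution unchanged. This is not the generation code used for the dataset; the function name `softmax_with_temperature` and the example scores are hypothetical and chosen only for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens (not taken from the dataset).
example_logits = [2.0, 1.0, 0.0]

# The three temperature settings named in this README.
for t in (0.4, 0.7, 1.0):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Under this scheme, the 0.4 file would be expected to contain the most repetitive, highest-likelihood generations and the 1 file the most varied ones.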