The data used for the experiments in "Facts2Story: Controlling Text Generation by Key Facts", presented at COLING 2020: https://www.aclweb.org/anthology/2020.coling-main.211/
Wikipedia Movie Plots: scraped from Wikipedia,
largely based on https://www.kaggle.com/jrobischon/wikipedia-movie-plots
The blind/blined_proofed data sets contain movies that were released after June 2019 and were scraped independently.
The top 5 salient facts are taken from the artifacts produced by applying the SalIE framework (https://github.com/mponza/SalIE): Marco Ponza, Luciano Del Corro, Gerhard Weikum. "Facts That Matter." In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018).
For further details, please refer to the original paper.