
Speaker2Credit

Credibility vectors for fake news detection

This dataset is part of a paper published at DSJM 2018 and an ongoing master's thesis.

Speaker2Credit is based on the publicly available data from PolitiFact.com and the benchmark dataset LIAR (Wang 2017). Given a speaker's name, job title, party affiliation, and home state, one can look up the corresponding credibility vector.

Content

The dataset consists of 10 tab-separated columns:

4 columns to identify the speaker:

  • speaker (lowercase, hyphenated full name of the speaker)
  • speakers_job (official job title)
  • state_info (home state of speaker)
  • party_affiliation

6 alphabetically sorted columns that make up the credibility vector:

  • barely_true_cnt
  • false_cnt
  • half_true_cnt
  • mostly_true_cnt
  • pants_on_fire_cnt
  • true_cnt
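
As a rough illustration of the lookup described above, the following Python sketch reads tab-separated rows and maps the four identifying columns to the six counts. Several things here are assumptions rather than facts about the repository: that the columns appear in the order listed above, that the counts are integers, and that the file has no header row (pass `has_header=True` otherwise). The function name `load_credit_vectors` and the sample rows are made up for the example and are not taken from the dataset.

```python
import csv
import io

# Column names as listed in this README; the on-disk order is an assumption.
ID_COLUMNS = ["speaker", "speakers_job", "state_info", "party_affiliation"]
CREDIT_COLUMNS = [
    "barely_true_cnt", "false_cnt", "half_true_cnt",
    "mostly_true_cnt", "pants_on_fire_cnt", "true_cnt",
]


def load_credit_vectors(handle, has_header=False):
    """Map (speaker, job, state, party) to a list of six integer counts."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:
        next(reader, None)  # discard the header row if the file has one
    table = {}
    for row in reader:
        if len(row) != len(ID_COLUMNS) + len(CREDIT_COLUMNS):
            continue  # skip blank or malformed lines
        key = tuple(row[:len(ID_COLUMNS)])
        table[key] = [int(value) for value in row[len(ID_COLUMNS):]]
    return table


# Made-up rows for illustration only; replace with the real file handle.
SAMPLE = (
    "jane-doe\tGovernor\tOhio\tindependent\t1\t2\t3\t4\t0\t5\n"
    "john-roe\tState senator\tTexas\tnone\t0\t1\t0\t2\t1\t0\n"
)

vectors = load_credit_vectors(io.StringIO(SAMPLE))
key = ("jane-doe", "Governor", "Ohio", "independent")
credit = dict(zip(CREDIT_COLUMNS, vectors[key]))
print(credit)
```

To use the real data, open the dataset file with `open(path, encoding="utf-8", newline="")` and pass the handle in place of the `io.StringIO` object. Keying on all four identifying columns follows the description above, since the name alone may not be unique.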

References

If you use this dataset, please cite the following paper:

@InProceedings{kirilin2018exploiting,
 author={Kirilin, Angelika and Strube, Michael},
 title={Exploiting a Speaker’s Credibility to Detect Fake News},
 booktitle={Proceedings of Data Science, Journalism \& Media workshop at KDD (DSJM’18)},
 year=2018
}
