ogozcelik/MiDe22

MiDe22

Official data repository of English and Turkish misinformation detection datasets from the LREC-COLING 2024 paper "MiDe22: An Annotated Multi-Event Tweet Dataset for Misinformation Detection".

Dataset

The dataset comprises 10,348 tweets: 5,284 in English and 5,064 in Turkish. The tweets cover four topics: the Russia-Ukraine war, the COVID-19 pandemic, refugees, and additional miscellaneous events. Each tweet carries one of three misinformation labels: True, False, or Other. To comply with Twitter's Terms and Conditions, we publish tweet IDs rather than the tweet content itself. The columns of the data file are as follows:

| Column Name | Description |
| --- | --- |
| Topic | Topic of the tweet: Ukraine, Covid, Refugees or Misc |
| Event | Event of the tweet: EN01-EN40 in English and TR01-TR40 in Turkish |
| Label | Label of the tweet: True, False, or Other |
| Tweet_id | Twitter ID of the tweet |
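As a rough illustration of the column layout, the sketch below parses rows and checks them against the value ranges listed above. It assumes the data is comma-separated text with a header row naming the four columns; the actual file names and delimiter in this repository are not stated here, and the tweet IDs in `SAMPLE` are made-up placeholders. The helper name `read_rows` is hypothetical.

```python
import csv
import io
import re

# Allowed values, taken from the column descriptions above.
TOPICS = {"Ukraine", "Covid", "Refugees", "Misc"}
LABELS = {"True", "False", "Other"}
# Events run EN01-EN40 and TR01-TR40.
EVENT_RE = re.compile(r"^(EN|TR)(0[1-9]|[1-3][0-9]|40)$")


def read_rows(text):
    """Parse comma-separated text (assumed format) into validated dicts.

    Tweet_id is kept as a string so that long IDs are never rounded
    by a numeric conversion.
    """
    rows = []
    # The header is line 1, so data rows start at line 2.
    for n, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        if row["Topic"] not in TOPICS:
            raise ValueError(f"line {n}: unexpected Topic {row['Topic']!r}")
        if not EVENT_RE.match(row["Event"]):
            raise ValueError(f"line {n}: unexpected Event {row['Event']!r}")
        if row["Label"] not in LABELS:
            raise ValueError(f"line {n}: unexpected Label {row['Label']!r}")
        if not row["Tweet_id"].isdigit():
            raise ValueError(f"line {n}: Tweet_id is not numeric")
        rows.append(row)
    return rows


# Placeholder rows for illustration only; these are not real tweet IDs.
SAMPLE = (
    "Topic,Event,Label,Tweet_id\n"
    "Ukraine,EN01,False,1234567890123456789\n"
    "Covid,TR40,Other,1234567890123456790\n"
)
rows = read_rows(SAMPLE)
```

To use this on a real file, read its contents into a string and pass it to `read_rows`; adjust the delimiter if the files turn out to be tab-separated.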

The distribution of tweet counts in the dataset is as follows:

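| Lang | Topic | True | False | Other | Total |
| --- | --- | --- | --- | --- | --- |
| EN | Ukraine | 320 | 393 | 618 | 1,331 |
| EN | Covid | 167 | 514 | 663 | 1,344 |
| EN | Refugees | 94 | 328 | 796 | 1,218 |
| EN | Misc | 146 | 494 | 751 | 1,391 |
| EN | Total | 727 | 1,729 | 2,828 | 5,284 |
| TR | Ukraine | 129 | 338 | 477 | 944 |
| TR | Covid | 190 | 558 | 816 | 1,564 |
| TR | Refugees | 61 | 202 | 298 | 561 |
| TR | Misc | 289 | 634 | 1,072 | 1,995 |
| TR | Total | 669 | 1,732 | 2,663 | 5,064 |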
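The counts above can be cross-checked with a few lines of arithmetic. The sketch below copies the per-topic counts from the table and recomputes the per-label and per-language totals; the names `COUNTS`, `column_totals`, and `language_total` are illustrative and not part of the repository.

```python
# Per-topic (True, False, Other) counts copied from the table above.
COUNTS = {
    "EN": {
        "Ukraine": (320, 393, 618),
        "Covid": (167, 514, 663),
        "Refugees": (94, 328, 796),
        "Misc": (146, 494, 751),
    },
    "TR": {
        "Ukraine": (129, 338, 477),
        "Covid": (190, 558, 816),
        "Refugees": (61, 202, 298),
        "Misc": (289, 634, 1072),
    },
}


def column_totals(lang):
    """Sum each label column (True, False, Other) over all topics."""
    return tuple(sum(col) for col in zip(*COUNTS[lang].values()))


def language_total(lang):
    """Total number of tweets for one language."""
    return sum(column_totals(lang))


print(column_totals("EN"))   # (727, 1729, 2828)
print(column_totals("TR"))   # (669, 1732, 2663)
print(language_total("EN") + language_total("TR"))  # 10348
```

The recomputed totals match the Total rows and the overall figure of 10,348 tweets. The same dictionary can be used to derive class proportions, for example when choosing class weights for a classifier.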

Citation

If you make use of the datasets or code, please cite the following paper:

@article{will be available soon.}
