Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues (NLP4IF, EMNLP-IJCNLP 2019)

UPDATE (11/23/2019): Coh-Metrix-based classification

After additional preprocessing of the input satire and fake news articles, in particular restoring some missing paragraphs, we achieved better performance with Coh-Metrix-based classification. The results of our best classifier, trained and tested in a 10-fold cross-validation setup, are as follows:

| Logistic Regression | Mean on test sets |
| --- | --- |
| Precision | 0.7314872063519257 |
| Recall | 0.7628654970760234 |
| F1 | 0.771716230451341 |
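The evaluation setup described above (logistic regression, 10-fold cross-validation, metrics averaged over the test folds) can be sketched as follows. This is a minimal, dependency-free illustration and not the repository's code: the helper names (`stratified_folds`, `train_logreg`, `cross_validate`), the stratified fold assignment, the choice of label 1 as the positive class, and the toy data are all assumptions made for the example.

```python
import math

def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def stratified_folds(y, k):
    """Assign indices to k folds round-robin within each class (an assumption;
    the source does not say whether its folds were stratified)."""
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        members = [i for i, yi in enumerate(y) if yi == label]
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds

def train_logreg(X, y, lr=0.1, epochs=200):
    """Binary logistic regression fitted with plain batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, X):
    w, b = model
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]

def precision_recall_f1(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def cross_validate(X, y, k=10):
    """Train on k-1 folds, score on the held-out fold, and average the scores."""
    totals = [0.0, 0.0, 0.0]
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = train_logreg([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = predict(model, [X[i] for i in test_idx])
        scores = precision_recall_f1([y[i] for i in test_idx], y_pred)
        totals = [t + s for t, s in zip(totals, scores)]
    return dict(zip(("precision", "recall", "f1"), (t / k for t in totals)))

# Hypothetical, well-separated one-feature toy data (not the Coh-Metrix features).
TOY_X = [[-1 - 0.01 * i] for i in range(20)] + [[1 + 0.01 * i] for i in range(20)]
TOY_Y = [0] * 20 + [1] * 20

if __name__ == "__main__":
    print(cross_validate(TOY_X, TOY_Y, k=10))
```

Each metric is computed per fold and then averaged, which is one common reading of "mean on test sets"; the source does not state which averaging convention produced the reported numbers.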

Description of files and classes:

classify_satire_fake.py: this code implements a Multinomial Naive Bayes text classifier, as described by Golbeck et al. (2018) in "Fake news vs satire: A dataset and analysis".
coh_metrix_experiments.ipynb: principal component analysis (PCA) in R on the features generated by Coh-Metrix.
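For readers unfamiliar with the technique named for classify_satire_fake.py, the following is a minimal sketch of a Multinomial Naive Bayes text classifier with add-one (Laplace) smoothing over whitespace-separated tokens. It is not the repository's implementation: the class name `TinyMultinomialNB`, the tokenization, the smoothing value, and the toy sentences are all assumptions made for illustration. Labels follow the README's convention of 0 for fake news and 1 for satire.

```python
import math
from collections import Counter

class TinyMultinomialNB:
    """Illustrative Multinomial Naive Bayes over lowercase whitespace tokens."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing strength (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in doc.lower().split()}
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc.lower().split())
        docs_per_class = Counter(labels)
        self.log_prior = {
            c: math.log(docs_per_class[c] / len(docs)) for c in self.classes
        }
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def _log_likelihood(self, token, c):
        # Smoothed estimate of P(token | class).
        numerator = self.counts[c][token] + self.alpha
        denominator = self.totals[c] + self.alpha * len(self.vocab)
        return math.log(numerator / denominator)

    def predict(self, doc):
        # Tokens never seen in training are ignored.
        tokens = [t for t in doc.lower().split() if t in self.vocab]
        scores = {
            c: self.log_prior[c] + sum(self._log_likelihood(t, c) for t in tokens)
            for c in self.classes
        }
        return max(scores, key=scores.get)

# Hypothetical toy sentences, invented for this example (not from the dataset).
TOY_DOCS = [
    "officials confirm secret plot leaked documents",
    "leaked documents reveal secret plot",
    "area man declares himself mayor of couch",
    "local man heroically finishes entire pizza",
]
TOY_LABELS = [0, 0, 1, 1]  # 0 = fake news, 1 = satire

if __name__ == "__main__":
    clf = TinyMultinomialNB().fit(TOY_DOCS, TOY_LABELS)
    print(clf.predict("secret documents leaked"))
    print(clf.predict("man finishes pizza on couch"))
```

Scores are summed in log space to avoid numerical underflow on long articles, which is the usual design choice for this family of classifiers.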

Files in data folder:

In all of the following files, 0 and 1 are the labels for fake news and satire articles, respectively.

data/satirefake_full.xlsx: this file contains all the Coh-Metrix indices and is the input to all of our experiments in R.
data/classification.csv: this file contains all the significant components from our regression analysis in R and is the input to the binary classification task.

Citation Information

If you find our work or any of the insights we report interesting, please cite our paper using the following information:

@inproceedings{levi-etal-2019-identifying,
    title = "Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues",
    author = "Levi, Or and Hosseini, Pedram and Diab, Mona and Broniatowski, David",
    booktitle = "Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-5004",
    pages = "31--35",
}
