On the Role of Images for Analyzing Claims in Social Media

This repository contains the source code and the dataset used in the paper:

Gullal S. Cheema, Sherzod Hakimov, Eric Müller-Budack, and Ralph Ewerth. “On the Role of Images for Analyzing Claims in Social Media”, Proceedings of the 2nd International Workshop on Cross-lingual Event-centric Open Analytics, co-located with the 30th The Web Conference (WWW 2021).

The paper is available here: http://ceur-ws.org/Vol-2829/paper3.pdf

The extended dataset can be downloaded here: https://zenodo.org/record/4592249

Setup

For SVM Training and BERT Fine-tuning

CUDA 10.2
conda env create -f environment.yml python=3.6.12
To install ThunderSVM on a Linux system:
pip install wheel https://github.com/Xtra-Computing/thundersvm/releases/download/v0.3.4/thundersvm_cuda10.1-0.3.4-cp36-cp36m-linux_x86_64.whl
Alternatively, a ThunderSVM release for Windows can be found here.
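
After installation, it can help to confirm that the package is visible from the active environment. The following is a minimal sketch, not part of the repository; the helper name `is_installed` is hypothetical, and it only checks that the `thundersvm` module can be located, not that its CUDA backend works.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the top-level module `module_name` can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "thundersvm" is the import name of the ThunderSVM Python package.
    name = "thundersvm"
    status = "found" if is_installed(name) else "MISSING"
    print(f"{name}: {status}")
```

Run it inside the conda environment created above; a "MISSING" result usually means the wheel was installed into a different environment.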

For VilBert Extraction and Fine-tuning

Install the dependencies and VilBert in a separate environment by following the instructions here.
Download the pretrained VilBert models from either here or here into a folder named vilbert-multi-task/data/pretrained/.
Download the Detectron model and its config file into vilbert-multi-task/data/:
wget https://dl.fbaipublicfiles.com/vilbert-multi-task/detectron_model.pth
wget https://dl.fbaipublicfiles.com/vilbert-multi-task/detectron_config.yaml
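
Before running feature extraction, one might verify that the downloads ended up where the later commands expect them. This is a hedged sketch: the helper `missing_entries` is hypothetical, the two file names come from the wget commands above, and the contents of `pretrained/` are not listed in this README, so only the folder's existence is checked.

```python
from pathlib import Path

# File names taken from the wget commands above; the pretrained/ folder
# contents are not specified in this README, so only its presence is checked.
EXPECTED = [
    "detectron_model.pth",
    "detectron_config.yaml",
    "pretrained",
]


def missing_entries(data_dir, expected=EXPECTED):
    """Return the expected entries that do not exist under `data_dir`."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries("vilbert-multi-task/data")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected VilBert files are in place.")
```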

Data and Feature Extraction

Download the data from the Zenodo repository.
Extract each zip file in data/.
Download the pretrained Places and Hybrid models.
Extract textual features:
- English: python feature_extraction/extract_bert_en.py -d clef_en -m bertbase -p clean
- Arabic: python feature_extraction/extract_bert_ar.py -m arabert
Extract visual features:
- Visual sentiment: python feature_extraction/extract_sent_feats.py -d clef_en -m vgg19_finetuned_all -b 32
- Visual scene: python feature_extraction/extract_visual_feats.py -v resnet152 -t imgnet -d clef_en
For VilBert:
- Extract bottom-up Faster R-CNN features from the images and convert them to LMDB:
  - python vilbert-multi-task/script/extract_features.py --model_file vilbert-multi-task/data/detectron_model.pth --config_file vilbert-multi-task/data/detectron_config.yaml --image_dir data/lesa/images/ --output_folder data/lesa/rcnn_feats/
  - python vilbert-multi-task/script/convert_to_lmdb.py --features_dir data/lesa/rcnn_feats/ --lmdb_file data/lesa/rcnn_lmdbs/
- Some images have no detectable objects. For those images, we take random crops and extract ResNet-152 last-layer features: python vilbert_code/extract_missing_feats.py --dset lesa
- Extract VilBert features: python vilbert_code/extract_features.py --model multi_task --dset lesa
Use -h with a script to see its options.
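
The archive-extraction step listed above can be scripted rather than done by hand. The sketch below is an illustration only: the helper `extract_archives` is hypothetical, and it assumes the downloaded zip files were placed directly in data/ and should be unpacked there.

```python
import zipfile
from pathlib import Path


def extract_archives(data_dir):
    """Extract every .zip file found directly in `data_dir` into `data_dir`.

    Returns the names of the extracted archives, in sorted order.
    """
    root = Path(data_dir)
    extracted = []
    for archive in sorted(root.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(root)
        extracted.append(archive.name)
    return extracted


if __name__ == "__main__":
    # Assumption: the Zenodo archives were saved into ./data beforehand.
    for name in extract_archives("data"):
        print("extracted", name)
```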

SVM Training and Evaluation

Text-based:
python svm_training/svm_textfeats.py --tfeat sumavg --tmodel bertbase --ttype clean --dset lesa --split 0
Image-based:
python svm_training/svm_imgfeats.py --vfeat feats --vmodel resnet152 --vtype imgnet --dset lesa --split 0
Image- and text-based:
python svm_training/svm_imgText.py --tfeat sumavg --tmodel bertbase --ttype clean --vfeat feats --vmodel resnet152 --vtype imgnet --split 0
VilBert-based:
python svm_training/svm_vilbertfeats.py --normalize 1 --feat pooled --model multi_task --dset lesa
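
The commands above train on a single split (`--split 0`). To repeat a run over several splits, a small driver such as the following could be used. It is a sketch under stated assumptions: the number of splits is not given in this README, so `NUM_SPLITS` is a placeholder, splits are assumed to be numbered consecutively from 0, and the helper names and the `dry_run` parameter are hypothetical.

```python
import subprocess

# Assumption: the split count is not stated in this README; set it to match
# the downloaded dataset. Splits are assumed to be numbered from 0.
NUM_SPLITS = 5


def text_svm_command(split, dset="lesa"):
    """Build the text-based SVM command shown above for one split."""
    return [
        "python", "svm_training/svm_textfeats.py",
        "--tfeat", "sumavg",
        "--tmodel", "bertbase",
        "--ttype", "clean",
        "--dset", dset,
        "--split", str(split),
    ]


def run_all_splits(num_splits=NUM_SPLITS, dry_run=True):
    """Run (or, with dry_run=True, only print) the command for each split."""
    for split in range(num_splits):
        cmd = text_svm_command(split)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all_splits()
```

By default the driver only prints the commands; pass `dry_run=False` to execute them from the repository root.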

BERT Fine-tuning

For CLEF_En: python finetune_bert.py --btype bertweet --dset clef_en --bs 4
For LESA: python finetune_bert.py --btype bertweet --dset lesa
For MediaEval: python finetune_bert.py --btype covid_twitter --dset mediaeval
For CLEF_Ar: python finetune_bert.py --btype arabert --dset clef_ar
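
The dataset-to-model pairing in the four commands above can be captured in a lookup table, which avoids mistyping a `--btype` value. This is a minimal sketch; the dictionary and the helper `finetune_command` are hypothetical conveniences, and the values are copied from the commands above (only clef_en is shown with an explicit batch size).

```python
# Dataset -> BERT variant pairs, copied from the commands above.
BTYPE_FOR_DATASET = {
    "clef_en": "bertweet",
    "lesa": "bertweet",
    "mediaeval": "covid_twitter",
    "clef_ar": "arabert",
}

# Only clef_en is listed with an explicit batch size above.
BATCH_SIZE_FOR_DATASET = {"clef_en": 4}


def finetune_command(dset):
    """Build the fine-tuning command for `dset`; raises KeyError if unknown."""
    cmd = [
        "python", "finetune_bert.py",
        "--btype", BTYPE_FOR_DATASET[dset],
        "--dset", dset,
    ]
    if dset in BATCH_SIZE_FOR_DATASET:
        cmd += ["--bs", str(BATCH_SIZE_FOR_DATASET[dset])]
    return cmd


if __name__ == "__main__":
    for dset in BTYPE_FOR_DATASET:
        print(" ".join(finetune_command(dset)))
```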

VilBert Fine-tuning

Using pooled token embeddings: python finetune_vilbert.py --model image_ret --dset lesa --split 0 --un_fr 2 --pool add
Using averaged token embeddings: python finetune_vilbert2.py --model image_ret --dset lesa --split 0 --un_fr 2
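
For readers unfamiliar with the two terms, the sketch below illustrates the generic operations they usually refer to: combining two pooled vectors by element-wise addition, and averaging a sequence of token vectors. It is an assumption that `--pool add` and the averaged variant correspond to these operations; the functions are hypothetical illustrations in plain Python and are not taken from finetune_vilbert.py or finetune_vilbert2.py.

```python
def add_fusion(text_vec, image_vec):
    """Element-wise sum of two equal-length pooled vectors."""
    if len(text_vec) != len(image_vec):
        raise ValueError("vectors must have the same length")
    return [t + v for t, v in zip(text_vec, image_vec)]


def mean_pool(token_vecs):
    """Average a non-empty list of equal-length token vectors."""
    if not token_vecs:
        raise ValueError("need at least one token vector")
    n = len(token_vecs)
    # zip(*token_vecs) groups the values of each dimension across tokens.
    return [sum(dim) / n for dim in zip(*token_vecs)]


if __name__ == "__main__":
    # Toy 2-dimensional vectors, purely illustrative.
    print(add_fusion([1.0, 2.0], [0.5, 0.5]))
    print(mean_pool([[1.0, 2.0], [3.0, 4.0]]))
```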

If you find the shared resources useful, please cite:

@inproceedings{DBLP:conf/www/CheemaHME21,
  author    = {Gullal S. Cheema and
               Sherzod Hakimov and
               Eric M{\"{u}}ller{-}Budack and
               Ralph Ewerth},
  title     = {On the Role of Images for Analyzing Claims in Social Media},
  booktitle = {Proceedings of the 2nd International Workshop on Cross-lingual Event-centric
               Open Analytics co-located with the 30th The Web Conference {(WWW}
               2021), Ljubljana, Slovenia, April 12, 2021 (online event due to {COVID-19}
               outbreak)},
  series    = {{CEUR} Workshop Proceedings},
  volume    = {2829},
  pages     = {32--46},
  publisher = {CEUR-WS.org},
  year      = {2021},
  url       = {http://ceur-ws.org/Vol-2829/paper3.pdf}
}
