AIRD-Datasets

This repository provides access to preprocessed datasets used in the paper A. Jaiswal, Y. Wu, W. AbdAlmageed, I. Masi, and P. Natarajan, "AIRD: Adversarial Learning Framework for Image Repurposing Detection" (Proceedings of CVPR, 2019). The paper presents an adversarial framework for image repurposing detection and offers the datasets used as a contribution to the community.

Image samples at a glance

Examples of evidence in the AIRD-Datasets. Samples above show supporting evidence in three different domains: (a) Places (Google Landmarks), (b) Faces (IJBC-IRD), and (c) Paintings (Painter by Numbers).

Examples of fake candidates in the AIRD-Datasets. Samples above show confusing fake candidates, useful for training and assessing image repurposing detection models, in three different domains: (a) Places (Google Landmarks), (b) Faces (IJBC-IRD), and (c) Paintings (Painter by Numbers).

Summary of Datasets

| Name | Source | Image Content | Label (# unique) | Training Size | Testing Size | Encoding |
| --- | --- | --- | --- | --- | --- | --- |
| Google Landmarks | Kaggle | Indoor/Outdoor Scenes | Landmark ID (13,885) | 977,624 | 238,965 | NetVLAD [1] + PCA + L2-norm |
| IJBC-IRD | NIST | Cropped & Aligned Faces | Subject ID (1,649) | 13,748 | 2,629 | Face-ResNet [2] + PCA + Signed-Square Rooting |
| Painter by Numbers | Kaggle | Paintings | Artist ID (1,000) | 58,701 | 14,162 | ConvNet + L2-norm |
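
The Encoding column names two post-processing steps, L2 normalization and signed-square rooting. The released encodings are already post-processed, so nothing below needs to be run on them; this is only a minimal sketch of what the two operations conventionally mean, on made-up vectors. The function names `l2_normalize` and `signed_sqrt` are illustrative and are not taken from the authors' code.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length."""
    x = np.asarray(x, dtype=np.float64)
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    # eps guards against division by zero for all-zero rows.
    return x / np.maximum(norms, eps)


def signed_sqrt(x):
    """Element-wise signed square root: sign(x) * sqrt(|x|)."""
    x = np.asarray(x, dtype=np.float64)
    return np.sign(x) * np.sqrt(np.abs(x))


# Toy vectors, not real encodings.
unit = l2_normalize([[3.0, 4.0]])
rooted = signed_sqrt([-4.0, 9.0])
```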

Download Links

Instructions

The download links provide access to compressed archives (.tar.gz files). Each of these can be uncompressed using:

$ tar xvzf <filename>.tar.gz

This creates a directory containing the encodings, the labels, and the precomputed similarity-based retrievals:

| Filename | Description | File-type | Loading in Python |
| --- | --- | --- | --- |
| encoding_<split>.h5 | Image Encodings | HDF5 | h5py |
| metadata_<split>.csv | Labels | CSV | pandas |
| precomputed_retrievals_<split>.npy | Retrieval Indices | NumPy Binary | NumPy |

where <split> is either train or test.
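
The three files for a split can be loaded with the libraries listed above. The following is a minimal sketch, not the authors' loading code. The helper name `load_split` is hypothetical, and because the name of the dataset stored inside the HDF5 file is not documented here, the sketch simply reads the first top-level entry it finds; inspect the file's keys to confirm this matches the actual layout.

```python
from pathlib import Path

import h5py
import numpy as np
import pandas as pd


def load_split(data_dir, split):
    """Load encodings, labels, and retrieval indices for one split."""
    if split not in ("train", "test"):
        raise ValueError("split must be 'train' or 'test'")
    data_dir = Path(data_dir)

    # Assumption: the encodings are the first top-level dataset in the file.
    with h5py.File(str(data_dir / f"encoding_{split}.h5"), "r") as f:
        key = next(iter(f.keys()))
        encodings = f[key][()]  # read the whole dataset into memory

    metadata = pd.read_csv(data_dir / f"metadata_{split}.csv")
    retrievals = np.load(data_dir / f"precomputed_retrievals_{split}.npy")
    return encodings, metadata, retrievals
```

For example, `load_split("<extracted-directory>", "train")` would return the training encodings as a NumPy array, the labels as a pandas DataFrame, and the retrieval indices as a NumPy array. The Google Landmarks training encodings are large, so reading slices from the open HDF5 dataset instead of calling `[()]` may be preferable on machines with limited memory.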

Precomputed Retrievals: In our experiments, we treated the training data as the reference world dataset as well. Hence, all retrieval entries are indices into the training encoding and metadata files.
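
Because retrieval entries index into the training files, the retrieved encodings for either split can be gathered with plain NumPy indexing. This is a small sketch on toy arrays; it assumes the retrieval file holds a 2-D integer array with one row of training indices per query, which is not stated above and should be checked against the actual files. The function name `gather_retrieved` is hypothetical.

```python
import numpy as np


def gather_retrieved(train_encodings, retrieval_indices):
    """Return the training encodings referenced by each row of indices."""
    train_encodings = np.asarray(train_encodings)
    retrieval_indices = np.asarray(retrieval_indices)
    if retrieval_indices.size and (
        retrieval_indices.min() < 0
        or retrieval_indices.max() >= len(train_encodings)
    ):
        raise IndexError("retrieval index outside the training set")
    # Fancy indexing: (num_queries, k) indices -> (num_queries, k, dim).
    return train_encodings[retrieval_indices]


# Toy data: 4 training encodings of dimension 2, 2 queries with 3 retrievals each.
toy_train = np.arange(8, dtype=np.float32).reshape(4, 2)
toy_retrievals = np.array([[0, 2, 3], [1, 1, 0]])
toy_result = gather_retrieved(toy_train, toy_retrievals)
```

The same indices can be applied to the training metadata, for instance with `DataFrame.iloc`, to look up the labels of the retrieved items.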

Citation

Please cite our paper using the following BibTeX entry if you use any of these datasets:

@InProceedings{jaiswal2019aird,
    author = {Jaiswal, Ayush and Wu, Yue and AbdAlmageed, Wael and Masi, Iacopo and Natarajan, Premkumar},
    title = {{AIRD: Adversarial Learning Framework for Image Repurposing Detection}},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2019}
} 

References

[1] R. Arandjelovic, P. Gronat, A. Torii, T. Pajdla, and J. Sivic. NetVLAD: CNN architecture for weakly supervised place recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016.

[2] I. Masi, A. T. Tran, T. Hassner, G. Sahin, and G. Medioni. Face-specific data augmentation for unconstrained face recognition. International Journal of Computer Vision (IJCV). 2019.

Disclaimer

We do not claim ownership of the original source data. We only provide encodings of the images and the relevant labels to further research in image repurposing detection and semantic integrity assessment of multimedia data.

Contact

If you have any questions, drop an email to ajaiswal@isi.edu or iacopo@isi.edu.
