MeCoQ: Contrastive Quantization with Code Memory for Unsupervised Image Retrieval

[toc]

1. Introduction

This repository provides the code for our paper at AAAI 2022 (Oral):

Contrastive Quantization with Code Memory for Unsupervised Image Retrieval. Jinpeng Wang, Ziyun Zeng, Bin Chen, Tao Dai, Shu-Tao Xia. [arXiv].

We propose MeCoQ, an unsupervised deep quantization method for image retrieval. Different from reconstruction-based methods that learn to preserve pairwise similarity information in continuous embeddings, MeCoQ learns quantized representations via contrastive learning. To boost contrastive learning, MeCoQ leverages a quantization code memory during training. Experiments on CIFAR-10 (under two evaluation protocols), Flickr-25K, and NUS-WIDE datasets demonstrate the effectiveness of MeCoQ.
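To make the idea concrete, below is a minimal, self-contained sketch of contrastive learning over softly quantized features, with a memory of past quantized codes serving as extra negatives. It is only an illustration of the concept under assumed shapes and defaults (number of codebooks M, codewords K, feature dimension, temperature); it is not the repository's implementation, and details such as the debiasing correction and regularization terms are omitted.

# Conceptual sketch only: contrastive learning on softly quantized features with a
# memory of past quantized codes as extra negatives. Shapes, defaults, and names
# are illustrative assumptions, not the repository's actual implementation.
import torch
import torch.nn.functional as F

M, K, D = 2, 256, 32                                  # assumed: M codebooks, K codewords each, D-dim features
codebooks = torch.nn.Parameter(torch.randn(M, K, D // M))

def soft_quantize(z, tau=0.4):
    """Differentiable quantization: softly assign each sub-vector to codewords."""
    subs = z.view(z.size(0), M, D // M)               # (B, M, D/M)
    logits = torch.einsum('bmd,mkd->bmk', subs, codebooks) / tau
    probs = logits.softmax(dim=-1)                    # soft code assignments
    return torch.einsum('bmk,mkd->bmd', probs, codebooks).reshape(z.size(0), D)

def contrastive_loss(z1, z2, code_memory, tau=0.4):
    """InfoNCE between two augmented views; the code memory adds negatives."""
    q1 = F.normalize(soft_quantize(z1), dim=1)
    q2 = F.normalize(soft_quantize(z2), dim=1)
    sim = q1 @ q2.t() / tau                           # (B, B), diagonal = positives
    sim_mem = q1 @ F.normalize(code_memory, dim=1).t() / tau   # (B, Q) memory negatives
    logits = torch.cat([sim, sim_mem], dim=1)
    labels = torch.arange(z1.size(0))                 # positive of row i sits at column i
    return F.cross_entropy(logits, labels)

In such a scheme, the quantized representations of each batch would be pushed into the code memory after the training step (and the oldest entries popped), so that old codes keep serving as negatives without re-encoding images; the debiased treatment of negatives (cf. the --mode debias option in the scripts below) is omitted here for brevity.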

In the following, we will guide you through using this repository step by step. 🤗

2. Preparation

git clone https://github.com/gimpong/AAAI22-MeCoQ.git
cd AAAI22-MeCoQ/

2.1 Requirements

  • python 3.7.9
  • numpy 1.19.1
  • pandas 1.0.5
  • pytorch 1.3.1
  • torchvision 0.4.2
  • pillow 8.0.0
  • python-opencv 3.4.2
  • tqdm 4.51.0

2.2 Download the image datasets and organize them properly

Before running the code, make sure everything needed is in place. First, the working directory is expected to be organized as below (a small sanity-check script for this layout is sketched after the notes that follow):

AAAI22-MeCoQ/
  • data/
    • Flickr25k/
      • img.txt
      • targets.txt
    • Nuswide/
      • database.txt
      • test.txt
      • train.txt
  • datasets/
    • CIFAR-10/
      • cifar-10-batches-py/
        • batches.meta
        • data_batch_1
        • ...
    • Flickr25K/
      • mirflickr/
        • im1.jpg
        • im2.jpg
        • ...
    • NUS-WIDE/
      • Flickr/
        • actor/
          • 0001_2124494179.jpg
          • 0002_174174086.jpg
          • ...
        • administrative_assistant/
          • ...
        • ...
  • scripts/
    • run0001.sh
    • run0002.sh
    • ...
  • main.py
  • engine.py
  • data.py
  • utils.py
  • loss.py

Notes

  • The data/ folder contains the data splits for the Flickr25K and NUS-WIDE datasets. The raw images of Flickr25K and NUS-WIDE should be downloaded separately and arranged in datasets/Flickr25K/ and datasets/NUS-WIDE/, respectively. We provide copies of these image datasets, which you can download via Google Drive or Baidu Wangpan (Web Drive, password: n307).

  • For experiments on the CIFAR-10 dataset, you can pass the --download_cifar10 option when running main.py.
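Before launching any experiment, a quick check of the layout can save a failed run. The snippet below is an optional helper (not part of the repository) that verifies the expected paths from the tree above exist; run it from the AAAI22-MeCoQ/ root.

# Optional sanity check (not part of the repository): verify the expected data
# layout from the directory tree above, relative to the AAAI22-MeCoQ/ root.
import os

EXPECTED_PATHS = [
    "data/Flickr25k/img.txt",
    "data/Flickr25k/targets.txt",
    "data/Nuswide/database.txt",
    "data/Nuswide/test.txt",
    "data/Nuswide/train.txt",
    "datasets/CIFAR-10/cifar-10-batches-py/batches.meta",
    "datasets/Flickr25K/mirflickr",
    "datasets/NUS-WIDE/Flickr",
]

missing = [p for p in EXPECTED_PATHS if not os.path.exists(p)]
if missing:
    print("Missing paths:\n  " + "\n  ".join(missing))
else:
    print("All expected dataset paths are in place.")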

3. Train and then evaluate

To facilitate reproducibility, we provide scripts with the configuration for each experiment. These scripts can be found under the scripts/ folder. For example, if you want to train and evaluate a 16-bit MeCoQ model on the Flickr25K dataset, you can run

cd scripts/
# '0' is the id of GPU
bash run0001.sh 0

The script run0001.sh contains the following commands:

#!/bin/bash
cd ..
python main.py \
    --notes Flickr16bits \
    --device cuda:$1 \
    --dataset Flickr25K \
    --trainable_layer_num 0 \
    --M 2 \
    --feat_dim 32 \
    --T 0.4 \
    --hp_beta 1e-1 \
    --hp_lambda 0.5 \
    --mode debias --pos_prior 0.15 \
    --queue_begin_epoch 5 \
    --topK 5000
cd -
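The code length is determined by the number of codebooks (--M) together with the number of codewords per codebook, which is not set in this script and presumably falls back to a default. Assuming the common choice of 256 codewords (8 bits per codebook), --M 2 yields the 16-bit configuration indicated by --notes Flickr16bits; the arithmetic below is purely illustrative under that assumption.

# Illustrative arithmetic only; the codeword count K is an assumption here
# (256 codewords, i.e. 8 bits per codebook, is a common default).
import math

def code_length_bits(M, K=256):
    """Total code length = M sub-codes of log2(K) bits each."""
    return int(M * math.log2(K))

print(code_length_bits(2))   # 16 -> matches the 'Flickr16bits' configuration
print(code_length_bits(4))   # 32
print(code_length_bits(8))   # 64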

After running a script, a series of folders and files will be saved under logs/ and checkpoints/, whose names match the --notes argument in run0001.sh (e.g., Flickr16bits).

Under logs/ , there will be a log file (e.g., Flickr16bits.log) and a folder of tensorboard files (e.g., Flickr16bits).

Under checkpoints/, there will be a folder (e.g., Flickr16bits/) containing information for the final checkpoint, including the quantization codes (db_codes.npy) and labels (db_targets.npy) for the database set, the model checkpoint (model.cpt), and performance records (P_at_topK_curve.txt and PR_curve.txt).
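If you want to inspect these artifacts afterwards, the saved .npy files can be loaded directly with NumPy. The snippet below is a hypothetical example (not shipped with the repository); the exact array shapes and dtypes depend on the dataset and code length.

# Hypothetical inspection of the saved checkpoint artifacts (not part of the
# repository); shapes and dtypes depend on the dataset and code length.
import numpy as np

run_id = "Flickr16bits"                      # must match the --notes argument
db_codes = np.load(f"checkpoints/{run_id}/db_codes.npy")
db_targets = np.load(f"checkpoints/{run_id}/db_targets.npy")

print("database codes:  ", db_codes.shape, db_codes.dtype)
print("database targets:", db_targets.shape, db_targets.dtype)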

⚠️ Warning: it is difficult to reproduce exactly the same results on different software and hardware configurations 🤔

Initially, we tuned different experiments (e.g., different datasets and different quantization code lengths) separately on different servers in the authors' lab. These servers are equipped with 3 kinds of GPUs: NVIDIA® GeForce® GTX 1080 Ti (11GB), NVIDIA® GeForce® GTX 2080 Ti (11GB) and NVIDIA® Tesla® V100 (32 GB).

While preparing the code release, we unexpectedly found that even with the same code and the same hyper-parameter configuration (including fixed random seeds), running experiments on different servers can still yield different results. Such differences may be caused by various factors, e.g., driver and library versions and hardware architectures.

Unfortunately, we were not aware of this phenomenon during our paper submission and the reported results were based on mixed architectures. 😩
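For single-machine runs, the standard PyTorch determinism switches below may reduce run-to-run variance; this is a general-purpose sketch rather than the repository's own configuration, and even with all of these settings, results can still differ across GPU models, driver versions, and library builds.

# General-purpose reproducibility settings for PyTorch (a sketch, not the
# repository's configuration).
import random
import numpy as np
import torch

def seed_everything(seed: int = 2021):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade speed for determinism in cuDNN kernel selection.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False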

Here we report the results of running the scripts on three kinds of GPUs in the following table. We have also uploaded the logs and checkpoint information for reference, which can be downloaded from Baidu Wangpan (Web Drive), password: ncw0.

| Script     | Dataset       | Code Length / bit | Distance Computation | GTX 1080 Ti MAP | GTX 2080 Ti MAP | V100 MAP | Log File               |
|------------|---------------|-------------------|----------------------|-----------------|-----------------|----------|------------------------|
| run0001.sh | Flickr25K     | 16                | Asymmetric           | 81.3137         | 81.2682         | 81.6233  | Flickr16bits.log       |
| run0002.sh | Flickr25K     | 16                | Symmetric            | 79.9250         | 80.0099         | 80.3065  | Flickr16bitsSymm.log   |
| run0003.sh | Flickr25K     | 32                | Asymmetric           | 82.3116         | 81.9112         | 81.0789  | Flickr32bits.log       |
| run0004.sh | Flickr25K     | 32                | Symmetric            | 81.5173         | 81.1909         | 80.4656  | Flickr32bitsSymm.log   |
| run0005.sh | Flickr25K     | 64                | Asymmetric           | 82.6785         | 81.7833         | 78.2403  | Flickr64bits.log       |
| run0006.sh | Flickr25K     | 64                | Symmetric            | 82.2351         | 81.2302         | 77.0577  | Flickr64bitsSymm.log   |
| run0007.sh | CIFAR-10 (I)  | 16                | Asymmetric           | 68.8245         | 68.3206         | 69.0129  | CifarI16bits.log       |
| run0008.sh | CIFAR-10 (I)  | 16                | Symmetric            | 65.9515         | 65.0148         | 66.1888  | CifarI16bitsSymm.log   |
| run0009.sh | CIFAR-10 (I)  | 32                | Asymmetric           | 70.2410         | 69.9876         | 70.3119  | CifarI32bits.log       |
| run0010.sh | CIFAR-10 (I)  | 32                | Symmetric            | 69.1810         | 68.7357         | 69.1754  | CifarI32bitsSymm.log   |
| run0011.sh | CIFAR-10 (I)  | 64                | Asymmetric           | 70.2445         | 70.2884         | 70.2405  | CifarI64bits.log       |
| run0012.sh | CIFAR-10 (I)  | 64                | Symmetric            | 69.4085         | 69.3631         | 69.3487  | CifarI64bitsSymm.log   |
| run0013.sh | CIFAR-10 (II) | 16                | Asymmetric           | 62.8279         | 61.8231         | 62.5369  | CifarII16bits.log      |
| run0014.sh | CIFAR-10 (II) | 16                | Symmetric            | 60.3927         | 59.5196         | 60.0741  | CifarII16bitsSymm.log  |
| run0015.sh | CIFAR-10 (II) | 32                | Asymmetric           | 64.0929         | 64.1100         | 63.1728  | CifarII32bits.log      |
| run0016.sh | CIFAR-10 (II) | 32                | Symmetric            | 62.1983         | 62.4287         | 61.4763  | CifarII32bitsSymm.log  |
| run0017.sh | CIFAR-10 (II) | 64                | Asymmetric           | 65.0706         | 63.8214         | 64.6805  | CifarII64bits.log      |
| run0018.sh | CIFAR-10 (II) | 64                | Symmetric            | 63.8469         | 62.8956         | 63.2863  | CifarII64bitsSymm.log  |
| run0019.sh | NUS-WIDE      | 16                | Asymmetric           | 76.3282         | 78.1548         | 78.8492  | Nuswide16bits.log      |
| run0020.sh | NUS-WIDE      | 16                | Symmetric            | 75.8496         | 77.0711         | 78.0642  | Nuswide16bitsSymm.log  |
| run0021.sh | NUS-WIDE      | 32                | Asymmetric           | 82.1629         | 82.1288         | 82.3119  | Nuswide32bits.log      |
| run0022.sh | NUS-WIDE      | 32                | Symmetric            | 81.1774         | 81.1331         | 81.2273  | Nuswide32bitsSymm.log  |
| run0023.sh | NUS-WIDE      | 64                | Asymmetric           | 83.0987         | 83.0466         | 83.0686  | Nuswide64bits.log      |
| run0024.sh | NUS-WIDE      | 64                | Symmetric            | 82.0026         | 82.2323         | 82.2421  | Nuswide64bitsSymm.log  |

(A log file with the listed name is provided for each GPU in the downloaded archive.)
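Here, asymmetric distance computation compares the raw (unquantized) query feature against the quantized database items via per-codebook lookup tables, while symmetric computation quantizes the query as well and compares codewords against codewords; asymmetric scoring is usually more accurate at the same code length. The sketch below is a generic illustration of the two modes using inner-product similarity in NumPy; it assumes the database codes are stored as an (N, M) array of codeword indices (which may differ from the actual db_codes.npy format) and is not the repository's evaluation code.

# Generic illustration of asymmetric vs. symmetric scoring for M codebooks of K
# codewords, using inner-product similarity; not the repository's evaluation code.
# Assumes `db_codes` is an (N, M) integer array of codeword indices.
import numpy as np

def asymmetric_scores(query, codebooks, db_codes):
    """Raw query feature vs. quantized database items, via lookup tables."""
    M, K, d = codebooks.shape                              # d = sub-vector dimension
    sub_queries = query.reshape(M, d)
    lut = np.einsum('md,mkd->mk', sub_queries, codebooks)  # (M, K) query-to-codeword sims
    return lut[np.arange(M), db_codes].sum(axis=1)         # gather and sum per database item

def symmetric_scores(query, codebooks, db_codes):
    """Quantize the query too, then compare codewords against codewords."""
    M, K, d = codebooks.shape
    sub_queries = query.reshape(M, d)
    q_codes = np.einsum('md,mkd->mk', sub_queries, codebooks).argmax(axis=1)
    q_words = codebooks[np.arange(M), q_codes]             # (M, d) selected query codewords
    lut = np.einsum('md,mkd->mk', q_words, codebooks)      # codeword-to-codeword sims
    return lut[np.arange(M), db_codes].sum(axis=1)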

4. References

If you find this code useful or use the toolkit in your work, please consider citing:

@inproceedings{wang22mecoq,
  author={Wang, Jinpeng and Zeng, Ziyun and Chen, Bin and Dai, Tao and Xia, Shu-Tao},
  title={Contrastive Quantization with Code Memory for Unsupervised Image Retrieval},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2022}
}

5. Acknowledgements

Our code is based on the implementations of PyTorch SimCLR, MoCo, DCL, Deep-Unsupervised-Image-Hashing, and CIBHash.

6. Contact

If you have any questions, you can raise an issue or email Jinpeng Wang (wjp20@mails.tsinghua.edu.cn). We will reply as soon as possible.
