This is the official implementation of the paper "Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge" (BMVC'20, oral).
Please install the following packages via pip or conda:
- Python 3
- PyTorch > 1.0
- torchvision > 0.3
- pycocotools
- mlflow
- numpy
- pickle (part of the Python standard library, no separate install needed)
- tqdm
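To confirm that an environment satisfies the version constraints listed above, a quick check along the following lines may help. This is a minimal sketch and not part of the repository: the helper names `parse_version`, `is_newer`, and `check_package` are hypothetical, and the parser only handles plain numeric versions with an optional local suffix such as `+cu101`.

```python
import importlib
import re


def parse_version(version):
    """Turn '1.7.1+cu101' into (1, 7, 1); anything after '+' is ignored."""
    core = version.split("+")[0]
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def is_newer(installed, minimum):
    """True if `installed` is strictly greater than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so '1.0' and '1.0.0' compare equal
    b += (0,) * (width - len(b))
    return a > b


def check_package(name, minimum):
    """Import `name` and compare its __version__; None if it is not installed."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return is_newer(getattr(module, "__version__", "0"), minimum)
```

For example, `check_package("torch", "1.0")` and `check_package("torchvision", "0.3")` would each return `True`, `False`, or `None` (not installed) for the current environment.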
Remember to change the data paths in the corresponding YAML files.
- Download the MS COCO 2014 data from:
- Extract the MS COCO data and make sure the files are in the following structure:
mscoco/
|-- annotations/
|   |-- instances_train2014.json
|   |-- instances_val2014.json
|-- train2014/
|   |-- xxxx.jpg
|-- val2014/
|   |-- xxxx.jpg
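Before launching training, it can be useful to verify that the extracted COCO data matches this layout. The following is a minimal sketch under stated assumptions: the helper `missing_coco_paths` is hypothetical (not from the repository), and the annotation folder is assumed to be spelled `annotations`, as in the official COCO archives.

```python
import os

# Assumption: standard COCO 2014 layout, with the annotation folder
# named "annotations" as produced by the official archives.
EXPECTED_COCO_PATHS = [
    os.path.join("annotations", "instances_train2014.json"),
    os.path.join("annotations", "instances_val2014.json"),
    "train2014",
    "val2014",
]


def missing_coco_paths(root):
    """Return the expected entries that do not exist under `root`."""
    return [
        rel_path
        for rel_path in EXPECTED_COCO_PATHS
        if not os.path.exists(os.path.join(root, rel_path))
    ]
```

Calling `missing_coco_paths("/path/to/mscoco")` returns an empty list when everything is in place, and otherwise lists the relative paths that still need to be created or extracted.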
- Download data from the official NUS-WIDE website:
- Extract the downloaded files into the following structure, and put data/nus_wide/crawl_nuswide_image.py under the data root:
nus_wide/
|-- annotations/
|-- Concepts81.txt
|-- ImageList/
|-- NUS_WID_Tags/
|-- images/
|-- crawl_nuswide_image.py
|-- NUS-WIDE-urls.txt
Then download the images by running: python crawl_nuswide_image.py
- Download the annotations from here, and extract them under ./data/visual_genome
- Download the images from part1 and part2, and extract them into a single directory (e.g., ./data/visual_genome/VG_100K)
- Update the YAML files; remember that vg_root points to the root of visual_genome, not VG_100K
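Since pointing vg_root at the image folder instead of its parent is an easy mistake, a small sanity check can catch it early. This is a hedged sketch: `looks_like_vg_root` is a hypothetical helper, not part of the repository, and it only checks that the given path contains the image directory rather than being the image directory itself.

```python
import os


def looks_like_vg_root(path, image_dir="VG_100K"):
    """Heuristic: `path` should contain `image_dir`, not be `image_dir`."""
    path = os.path.normpath(path)
    if os.path.basename(path) == image_dir:
        return False  # the path points at the image folder itself
    return os.path.isdir(os.path.join(path, image_dir))
```

With the layout described above, `looks_like_vg_root("./data/visual_genome")` should hold, while passing `./data/visual_genome/VG_100K` should not.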
- Run sh start_mlflow.sh
- Run ./scripts/run_mlzsl_posVAE_coco.sh for the COCO dataset
- Open a browser and go to localhost:5000 to see the results
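If the results page does not load, one thing to check is whether anything is listening on port 5000 at all. The snippet below is a generic sketch using only the standard library; `port_is_open` is a hypothetical helper, and it only tests that a TCP connection succeeds, not that the listener is actually the mlflow server.

```python
import socket


def port_is_open(host="localhost", port=5000, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `port_is_open()` after running start_mlflow.sh gives a quick yes/no answer before opening the browser.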
- All runnable scripts are under ./scripts; to change the hyper-parameters, check which YAML file each script uses and edit it.
- For Fast0Tag, simply run python main_fast0tag_nus81.py; remember to set the GPU to use in the Python file.
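How main_fast0tag_nus81.py selects its GPU is not shown here, so the following is only a generic sketch of one common approach: restricting the visible devices through the `CUDA_VISIBLE_DEVICES` environment variable. The helper `select_gpu` is hypothetical and not taken from the repository.

```python
import os


def select_gpu(gpu_id):
    """Expose a single GPU to CUDA and return the value that was set.

    This must run before CUDA is initialized; the simplest way to
    guarantee that is to call it before importing torch.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return os.environ["CUDA_VISIBLE_DEVICES"]


select_gpu(0)
# import torch  # import afterwards so the setting takes effect
```

With this setting, the chosen physical GPU appears to the process as device 0.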
- Modify scripts/fast0tag.yaml, in particular set loss: rank.
- Run ./scripts/run_fast0tag.sh 0, where 0 is the GPU to use.
- Modify scripts/fast0tag.yaml, in particular set loss: bce.
- Run ./scripts/run_fast0tag.sh 0, where 0 is the GPU to use.
- Multi-label zero-shot learning with structured knowledge graphs (CVPR’2018)
- Modify scripts/skg.yaml, then run ./scripts/run_skg.sh 0, where 0 is the GPU to use.
You can try different loss functions, either bce or rank, where rank is a contrasting loss used in Fast0Tag.
./scripts/run_gcn_posVAE_coco.sh
./scripts/run_gcn_posVAE_nus.sh
./scripts/run_gcn_posVAE_vg.sh
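The difference between the bce and rank loss options mentioned above can be illustrated in plain Python. This sketch is an assumption made for illustration and is not the repository's code: it uses a RankNet-style pairwise formulation as one common form of ranking loss, treats raw scores as logits, and the function names are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bce_loss(scores, labels):
    """Mean binary cross-entropy; each label is scored independently."""
    total = sum(softplus(s) - y * s for s, y in zip(scores, labels))
    return total / len(scores)


def rank_loss(scores, labels):
    """Mean of log(1 + exp(s_neg - s_pos)) over (positive, negative) pairs.

    Only the relative order of positive and negative labels matters.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.0  # no pairs to compare
    total = sum(softplus(n - p) for p in pos for n in neg)
    return total / (len(pos) * len(neg))
```

In this formulation, bce penalizes each label's score on its own, whereas the pairwise loss only penalizes a negative label being scored close to or above a positive one.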
@article{huang2020multi,
title={Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge},
author={Huang, He and Chen, Yuanwei and Tang, Wei and Zheng, Wenhao and Chen, Qing-Guo and Hu, Yao and Yu, Philip},
journal={BMVC},
year={2020}
}