Grounding Referring Expressions in Images by Variational Context

This repository contains the code for the following paper:

  • Hanwang Zhang, Yulei Niu, Shih-Fu Chang, Grounding Referring Expressions in Images by Variational Context. In CVPR, 2018. (PDF)
@inproceedings{zhang2018grounding,
  title={Grounding Referring Expressions in Images by Variational Context},
  author={Zhang, Hanwang and Niu, Yulei and Chang, Shih-Fu},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Note: parts of this repository are built upon cmn, speaker_listener_reinforcer, and refer.

Requirements and Dependencies

# Make sure to clone with --recursive
git clone --recursive https://github.com/yuleiniu/vc.git

The --recursive flag also clones the refer and cmn API repositories as submodules.
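If the repository was already cloned without --recursive, the submodules can be fetched afterwards with a standard git command:

  # Fetch the refer and cmn submodules into an existing clone
  git submodule update --init --recursive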

  • Install the other dependencies by running:
  pip install -r requirements.txt
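The refer preprocessing step below is implemented in Python 2, so it may be easiest to install everything into a Python 2 environment. A minimal sketch, assuming virtualenv and a python2.7 interpreter are available (the environment name venv is arbitrary):

  # Create and activate a Python 2 virtual environment (hypothetical name: venv)
  virtualenv --python=python2.7 venv
  source venv/bin/activate
  pip install -r requirements.txt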

Preprocessing

  • Download the weights of the Faster R-CNN VGG-16 network, converted from the Caffe model:
  ./data/models/download_vgg_params.sh
  • Download the GloVe word embedding matrix:
  ./data/word_embedding/download_embed_matrix.sh
  • Rebuild the NMS library and the ROIPooling operation following cmn. Simply run:
  ./submodule/cmn.sh
  • Preprocess the data for referring expression comprehension following speaker_listener_reinforcer and refer (both implemented in Python 2), and save the results into data/raw. Simply run the following (a quick sanity check appears after this list):
  ./submodule/refer.sh
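After the steps above, the downloaded weights, embedding matrix, and preprocessed annotations should sit under the data directories used by the scripts. A quick sanity check (directory names taken from the commands above; exact file names may vary):

  # Each directory should be non-empty after preprocessing
  ls data/models data/word_embedding data/raw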

Extract features

  • To extract region features for RefCOCO/RefCOCO+/RefCOCOg, run one of the following (or loop over all three, as sketched below):
  python prepare_data.py --dataset refcoco  #(for RefCOCO)
  python prepare_data.py --dataset refcoco+ #(for RefCOCO+)
  python prepare_data.py --dataset refcocog #(for RefCOCOg)
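Equivalently, all three datasets can be processed with one shell loop; this is just the three commands above combined:

  # Extract region features for each dataset in turn
  for ds in refcoco refcoco+ refcocog; do
    python prepare_data.py --dataset "$ds"
  done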

Train

  • To train the model under the supervised setting, run:
  python train.py --dataset refcoco  #(for RefCOCO)
  python train.py --dataset refcoco+ #(for RefCOCO+)
  python train.py --dataset refcocog #(for RefCOCOg)
  • To train the model under the unsupervised setting, run the following (or use the loop sketched after this list):
  python train.py --dataset refcoco  --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO)
  python train.py --dataset refcoco+ --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO+)
  python train.py --dataset refcocog --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCOg)
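As with feature extraction, the unsupervised runs can be driven by a single loop; the flags are copied verbatim from the commands above:

  # Unsupervised training over all three datasets
  for ds in refcoco refcoco+ refcocog; do
    python train.py --dataset "$ds" --supervised False \
      --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000
  done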

Evaluation

  • To test the model, point --checkpoint at a saved snapshot and run (a worked example follows this list):
  python test.py --dataset refcoco  --checkpoint /path/to/checkpoint #(for RefCOCO)
  python test.py --dataset refcoco+ --checkpoint /path/to/checkpoint #(for RefCOCO+)
  python test.py --dataset refcocog --checkpoint /path/to/checkpoint #(for RefCOCOg)
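For example, to evaluate a supervised RefCOCO model (the checkpoint path here is hypothetical; substitute whatever snapshot train.py produced):

  # The path below is a placeholder, not a file shipped with the repo
  python test.py --dataset refcoco --checkpoint ./checkpoints/refcoco/model.ckpt-100000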
