Grounding Referring Expressions in Images by Variational Context

This repository contains the code for the following paper:

  • Hanwang Zhang, Yulei Niu, Shih-Fu Chang, Grounding Referring Expressions in Images by Variational Context. In CVPR, 2018. (PDF)
@inproceedings{zhang2018grounding,
  title={Grounding Referring Expressions in Images by Variational Context},
  author={Zhang, Hanwang and Niu, Yulei and Chang, Shih-Fu},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Note: parts of this repository are built upon cmn, speaker_listener_reinforcer, and refer.

Requirements and Dependencies

# Make sure to clone with --recursive
git clone --recursive https://github.com/yuleiniu/vc.git

The --recursive flag also clones the refer and cmn API repositories as submodules.
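If the repository was already cloned without --recursive, the submodules can be fetched afterwards with a standard git command:

  # Fetch the refer and cmn submodules into an existing clone
  git submodule update --init --recursive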

  • Install the other dependencies by running:
  pip install -r requirements.txt
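The refer preprocessing step below is implemented in Python 2, so it may be easiest to install everything into a Python 2 environment. A minimal sketch, assuming virtualenv and a python2.7 interpreter are available (the environment name venv is arbitrary):

  # Create and activate a Python 2 virtual environment (hypothetical name: venv)
  virtualenv --python=python2.7 venv
  source venv/bin/activate
  pip install -r requirements.txt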

Preprocessing

  • Download the weights of the Faster R-CNN VGG-16 network, converted from the Caffe model:
  ./data/models/download_vgg_params.sh
  • Download the GloVe word embedding matrix:
  ./data/word_embedding/download_embed_matrix.sh
  • Rebuild the NMS library and the ROIPooling operation following cmn. Simply run:
  ./submodule/cmn.sh
  • Preprocess the data for referring expression comprehension following speaker_listener_reinforcer and refer (both implemented in Python 2), and save the results into data/raw. Simply run the following (a quick sanity check appears after this list):
  ./submodule/refer.sh
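After the steps above, the downloaded weights, embedding matrix, and preprocessed annotations should sit under the data directories used by the scripts. A quick sanity check (directory names taken from the commands above; exact file names may vary):

  # Each directory should be non-empty after preprocessing
  ls data/models data/word_embedding data/raw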

Extract features

  • To extract region features for RefCOCO/RefCOCO+/RefCOCOg, run one of the following (or loop over all three, as sketched below):
  python prepare_data.py --dataset refcoco  #(for RefCOCO)
  python prepare_data.py --dataset refcoco+ #(for RefCOCO+)
  python prepare_data.py --dataset refcocog #(for RefCOCOg)
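Equivalently, all three datasets can be processed with one shell loop; this is just the three commands above combined:

  # Extract region features for each dataset in turn
  for ds in refcoco refcoco+ refcocog; do
    python prepare_data.py --dataset "$ds"
  done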

Train

  • To train the model under the supervised setting, run:
  python train.py --dataset refcoco  #(for RefCOCO)
  python train.py --dataset refcoco+ #(for RefCOCO+)
  python train.py --dataset refcocog #(for RefCOCOg)
  • To train the model under the unsupervised setting, run the following (or use the loop sketched after this list):
  python train.py --dataset refcoco  --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO)
  python train.py --dataset refcoco+ --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO+)
  python train.py --dataset refcocog --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCOg)
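As with feature extraction, the unsupervised runs can be driven by a single loop; the flags are copied verbatim from the commands above:

  # Unsupervised training over all three datasets
  for ds in refcoco refcoco+ refcocog; do
    python train.py --dataset "$ds" --supervised False \
      --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000
  done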

Evaluation

  • To test the model, point --checkpoint at a saved snapshot and run (a worked example follows this list):
  python test.py --dataset refcoco  --checkpoint /path/to/checkpoint #(for RefCOCO)
  python test.py --dataset refcoco+ --checkpoint /path/to/checkpoint #(for RefCOCO+)
  python test.py --dataset refcocog --checkpoint /path/to/checkpoint #(for RefCOCOg)
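For example, to evaluate a supervised RefCOCO model (the checkpoint path here is hypothetical; substitute whatever snapshot train.py produced):

  # The path below is a placeholder, not a file shipped with the repo
  python test.py --dataset refcoco --checkpoint ./checkpoints/refcoco/model.ckpt-100000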
