ap229997/Conditional-Batch-Norm

Conditional Batch Normalization

PyTorch implementation of the NIPS 2017 paper "Modulating early visual processing by language" (https://arxiv.org/abs/1707.00683)

Introduction

The authors present a novel approach for incorporating language information into visual feature extraction by conditioning the Batch Normalization parameters on the language input. They apply Conditional Batch Normalization (CBN) to a pre-trained ResNet and show that this significantly improves performance on visual question answering tasks.
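The idea can be sketched in a few lines. The snippet below is a minimal, dependency-free illustration and not the code in this repository: each channel is normalized with batch statistics, and the scale and shift applied to each example are the frozen pre-trained values plus a change predicted from that example's language embedding. All names (`linear`, `conditional_batch_norm`, `w_dg`, `b_dg`, `w_db`, `b_db`) are hypothetical, and a single affine layer stands in for the small predictor network.

```python
import math


def linear(x, weight, bias):
    """Affine map weight @ x + bias (assumed stand-in for the predictor network)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def conditional_batch_norm(feats, lang, gamma, beta,
                           w_dg, b_dg, w_db, b_db, eps=1e-5):
    """Apply batch norm with language-conditioned scale and shift.

    feats[n][c] is a flat list of spatial activations for example n, channel c.
    lang[n] is the language embedding of example n.
    gamma[c] and beta[c] are the frozen, pre-trained BN parameters.
    """
    # Per-example, per-channel changes predicted from the language embedding.
    d_gamma = [linear(e, w_dg, b_dg) for e in lang]
    d_beta = [linear(e, w_db, b_db) for e in lang]

    out = [[None] * len(gamma) for _ in feats]
    for c in range(len(gamma)):
        # Batch statistics for channel c, pooled over examples and positions.
        pooled = [v for example in feats for v in example[c]]
        mean = sum(pooled) / len(pooled)
        var = sum((v - mean) ** 2 for v in pooled) / len(pooled)
        inv_std = 1.0 / math.sqrt(var + eps)
        for n, example in enumerate(feats):
            scale = gamma[c] + d_gamma[n][c]
            shift = beta[c] + d_beta[n][c]
            out[n][c] = [scale * (v - mean) * inv_std + shift
                         for v in example[c]]
    return out


# Two examples, one channel, two spatial positions, 1-d language embedding.
feats = [[[1.0, 3.0]], [[5.0, 7.0]]]
lang = [[0.0], [1.0]]
out = conditional_batch_norm(feats, lang, gamma=[1.0], beta=[0.0],
                             w_dg=[[1.0]], b_dg=[0.0],
                             w_db=[[0.0]], b_db=[0.0])
```

With zero predictor weights and biases, the function reduces to ordinary batch normalization, so the pre-trained network's behavior is unchanged.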

Setup

This repository is compatible with Python 2.

  • Follow the instructions on the PyTorch homepage to install PyTorch for Python 2.
  • The required Python packages, nltk and tqdm, can be installed using pip.

Data

To download the VQA dataset, use the script scripts/vqa_download.sh:

scripts/vqa_download.sh `pwd`/data

Process Data

Detailed instructions for processing data are provided by GuessWhatGame/vqa.

Create dictionary

To create the VQA dictionary, use the script preprocess_data/create_dictionary.py.

python preprocess_data/create_dictionary.py --data_dir data --year 2014 --dict_file dict.json

Create GloVe dictionary

To create the GloVe dictionary, download the original GloVe file and run the script preprocess_data/create_gloves.py.

wget http://nlp.stanford.edu/data/glove.42B.300d.zip -P data/
unzip data/glove.42B.300d.zip -d data/
python preprocess_data/create_gloves.py --data_dir data --glove_in data/glove.42B.300d.txt --glove_out data/glove_dict.pkl --year 2014
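For orientation, the sketch below shows what a conversion of this kind could look like. It is an assumption about the general shape of the step, not the contents of create_gloves.py. It relies only on the GloVe text format (one word per line followed by space-separated floats); the function name `load_glove` and the optional `vocab` filter are hypothetical.

```python
import pickle


def load_glove(lines, vocab=None):
    """Parse GloVe text lines ("word v1 v2 ...") into a {word: [floats]} dict.

    If vocab is given, only words in it are kept (assumed filtering step).
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        if vocab is not None and word not in vocab:
            continue
        vectors[word] = [float(v) for v in values]
    return vectors


# Tiny in-memory stand-in for the GloVe text file (3-d vectors for brevity).
sample = ["cat 0.1 0.2 0.3", "dog 0.4 0.5 0.6"]
glove = load_glove(sample, vocab={"cat"})

# Serialize the dictionary; a script would write these bytes to a .pkl file.
blob = pickle.dumps(glove)
```

Restricting the vectors to the words in the VQA dictionary, as the `vocab` argument illustrates, would keep the pickled file far smaller than the full GloVe download.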

Train Model

To train the network, set the required parameters in config.json and run the script main.py.

python main.py --gpu gpu_id --data_dir data --img_dir images --config config.json --exp_dir exp --year 2014

Citation

If you find this code useful, please consider citing the original work by the authors:

@inproceedings{de2017modulating,
author = {Harm de Vries and Florian Strub and J\'er\'emie Mary and Hugo Larochelle and Olivier Pietquin and Aaron C. Courville},
title = {Modulating early visual processing by language},
booktitle = {Advances in Neural Information Processing Systems 30},
year = {2017},
url = {https://arxiv.org/abs/1707.00683}
}
