seungjun45/image-captioning-bottom-up-top-down

Image captioning using Bottom-up, Top-down Attention

This is a Python 3 version (used for generating the HDF5 files) of https://github.com/poojahira/image-captioning-bottom-up-top-down

Requirements

python 3.6
torch 0.4.1
h5py 2.8
tqdm 4.26
nltk 3.3

Data preparation

Download the MSCOCO Training (13GB) and Validation (6GB) images.

Also download Andrej Karpathy's training, validation, and test splits. This zip file contains the captions.

Unzip all files and place the folders in any directory you want (you need to change the directory names in 'address_server_caption.py' to match).
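
Before moving on, it can help to check that the caption file unzipped correctly. The following is a minimal sketch, assuming the Karpathy zip contains a JSON file (commonly named `dataset_coco.json`) with a top-level `images` list whose entries each carry a `split` field. The function name `count_images_per_split` is only illustrative and is not part of this repository.

```python
import json
from collections import Counter


def count_images_per_split(json_path):
    """Count how many images the split file assigns to each split.

    Assumes a top-level "images" list whose entries have a "split" key,
    which is the layout of the Karpathy split files as far as we know.
    """
    with open(json_path, "r") as f:
        data = json.load(f)
    return Counter(img["split"] for img in data["images"])


if __name__ == "__main__":
    # Placeholder path: point this at wherever you unzipped the captions.
    print(count_images_per_split("dataset_coco.json"))
```

If the counts look reasonable for every split, the captions are in place and the path can be entered in 'address_server_caption.py'.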


Next, download the bottom-up image features: train_val and test.

Unzip the download and place the unzipped folder in any directory (you need to change the directory names in 'address_server_caption.py' to match).


Next, run these commands in a Python 3 environment:

python bottom-up_features/tsv.py
python bottom-up_features/tsv_test.py

These commands will create the following files:

  • An HDF5 file containing the bottom-up image features for the train, val, and test splits (36 per image in each split), stored as an (I, 36, 2048) tensor, where I is the number of images in the split.
  • PKL files that map training and validation image IDs to their indices in the HDF5 dataset created above.
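
To illustrate how the two kinds of files fit together, here is a minimal sketch of looking up the features of one image. It assumes the PKL file holds a dict-like mapping from image ID to row index, and that the HDF5 file stores the (I, 36, 2048) tensor under a single dataset name. The helper names, file names, and dataset name below are placeholders, so check 'bottom-up_features/tsv.py' for the actual ones.

```python
import pickle


def load_id_to_index(pkl_path):
    """Load the image-ID -> row-index mapping stored in a PKL file."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)


def load_image_features(hdf5_path, dataset_name, index):
    """Return the features stored at row `index`, expected shape (36, 2048)."""
    import h5py  # imported here so the mapping helper works without h5py

    with h5py.File(hdf5_path, "r") as h5:
        return h5[dataset_name][index]


if __name__ == "__main__":
    # All names below are hypothetical placeholders, not confirmed by the repo.
    id_to_index = load_id_to_index("train_imgid2idx.pkl")
    some_image_id = next(iter(id_to_index))
    feats = load_image_features(
        "train.hdf5", "image_features", id_to_index[some_image_id]
    )
    print(some_image_id, feats.shape)
```

Reading a single row this way avoids loading the whole tensor into memory, which matters because the feature files are large.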

Move these files to any directory you want, and update the paths in 'address_server_caption.py' to match.
