WuJie1010/Fine-Grained-Image-Captioning

Fine-Grained-Image-Captioning

The PyTorch implementation of "Fine-Grained Image Captioning with Global-Local Discriminative Objective".

Requirements:

Download MSCOCO dataset

  • Download the COCO images from http://cocodataset.org/#download. Download the 2014 Train images and 2014 Val images, and put them into train2014/ and val2014/ under ./image. Download the 2014 Test images and put them into test2014/.
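Before running the later steps, it can help to confirm that the image folders are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_image_dirs` is hypothetical, and it assumes test2014/ also lives under ./image, which the instructions above do not state explicitly.

```python
from pathlib import Path

# Assumption: all three splits sit under the same root (./image).
EXPECTED_SPLITS = ("train2014", "val2014", "test2014")

def missing_image_dirs(root="image", splits=EXPECTED_SPLITS):
    """Return the split folders that do not exist under `root`."""
    root_path = Path(root)
    return [name for name in splits if not (root_path / name).is_dir()]

if __name__ == "__main__":
    missing = missing_image_dirs("image")
    if missing:
        print("Missing image folders:", ", ".join(missing))
    else:
        print("All image folders found.")
```

If the test images are stored elsewhere in your setup, pass a different `splits` tuple or check that folder separately.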

Download COCO captions and preprocess them

Pre-extract the image features

  • python scripts/prepro_feats.py --input_json data/dataset_coco.json --images_root image

Prepare for Reinforcement Learning

  • Download CIDEr from https://github.com/vrama91/cider and put "ciderD_token.py" and "ciderD_scorer_token4.py" in "cider/pyciderevalcap/ciderD/", then run:
  • python scripts/prepro_ngrams.py --input_json data/dataset_coco.json --dict_json data/cocotalk.json --output_pkl data/coco-train --split train
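The internals of scripts/prepro_ngrams.py are not described here. A step like this typically precomputes n-gram document frequencies over the training captions, which CIDEr-style rewards use for TF-IDF weighting. The sketch below illustrates that idea only, under that assumption; the function names and the whitespace tokenization are hypothetical and are not taken from the script.

```python
from collections import Counter

def ngram_counts(tokens, max_n=4):
    """Count all n-grams of length 1..max_n in one tokenized caption."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def document_frequency(refs_per_image, max_n=4):
    """For each n-gram, count how many images have it in any reference caption.

    `refs_per_image` is a list with one entry per image; each entry is a
    list of caption strings (assumed to be whitespace-tokenizable).
    """
    df = Counter()
    for refs in refs_per_image:
        seen = set()
        for caption in refs:
            seen.update(ngram_counts(caption.split(), max_n))
        for gram in seen:
            df[gram] += 1  # each image contributes at most once per n-gram
    return df
```

For example, `document_frequency([["a dog runs", "a dog"], ["a cat"]])` counts the unigram `("a",)` in both images and the bigram `("a", "dog")` in one.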

Prepare for training

Start training

Train with the MLE criterion for the initial 20 epochs

  • python MLE_trainpro.py --id TDA --caption_model TDA --checkpoint_path RL_TDA
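For readers unfamiliar with the MLE criterion, it amounts to minimizing the negative log-likelihood of the ground-truth caption tokens, ignoring padded positions. The sketch below shows that computation in plain Python for a single caption. It is an illustration only, not the code in MLE_trainpro.py; the function name and argument layout are hypothetical.

```python
import math

def masked_nll(log_probs, targets, mask):
    """Average negative log-likelihood over unmasked time steps.

    log_probs: list of per-step lists of log-probabilities over the vocabulary
    targets:   list of ground-truth token indices, one per step
    mask:      list of 0/1 flags; 0 marks padding that should not count
    """
    total = 0.0
    count = 0
    for step_log_probs, target, keep in zip(log_probs, targets, mask):
        if keep:
            total -= step_log_probs[target]
            count += 1
    return total / max(count, 1)  # avoid dividing by zero on empty captions

if __name__ == "__main__":
    lp = [[math.log(0.5), math.log(0.5)],
          [math.log(0.25), math.log(0.75)],
          [math.log(0.9), math.log(0.1)]]
    print(masked_nll(lp, targets=[0, 1, 0], mask=[1, 1, 0]))
```

In a PyTorch training loop the same quantity is usually computed on batched tensors, but the masking and averaging logic is the same.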

Training by Global-Local Discriminative Objective

Evaluation

  • python evalpro.py --caption_model TDA --checkpoint_path RL_TDA

Self-retrieval Experiment

  • python generate_random_5000.py --caption_model TDA --checkpoint_path RL_TDA
  • python self_retrieval.py --id TDA --caption_model TDA --checkpoint_path RL_TDA
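A self-retrieval experiment is usually summarized by recall@K: the fraction of generated captions whose own image appears among the top K retrieved images. The sketch below computes that metric from a caption-to-image similarity matrix. It is an assumption about how such scores are typically reported, not the logic of self_retrieval.py; the function name and the matrix layout are hypothetical.

```python
def recall_at_k(sim, k):
    """Fraction of captions whose matching image ranks in the top k.

    sim[i][j] is the similarity between caption i and image j;
    caption i is assumed to belong to image i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank of the correct image = number of images scoring strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return hits / len(sim)

if __name__ == "__main__":
    sim = [[0.9, 0.1, 0.0],
           [0.8, 0.5, 0.1],
           [0.2, 0.3, 0.1]]
    for k in (1, 2, 3):
        print(k, recall_at_k(sim, k))
```

Ties are resolved in favor of the correct image here, which is one possible convention among several.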
