Neural Caption Generator

TensorFlow implementation of the model from the paper "Show and Tell: A Neural Image Caption Generator": http://arxiv.org/abs/1411.4555. The Show and Tell model is a deep neural network that learns to describe the content of images.
Code and ideas are borrowed from jazzsaxmafia's show_and_tell.tensorflow: https://github.com/jazzsaxmafia/show_and_tell.tensorflow. model.py contains some modifications; see the Code section for details.
You need the flickr30k data (images and annotations). Put the annotations in the ImageCaption/data folder and the images in the ImageCaption/images folder.

Install Required Packages

First ensure that you have installed the following required packages:

TensorFlow 0.10.0rc0 (instructions)
Caffe (instructions)
Keras 1.2.1 (instructions)
Natural Language Toolkit (NLTK):
- First install NLTK (instructions)
- Then install the NLTK data (instructions)

See requirements.txt for details.

Code

make_flickr_dataset.py : Extracts features from the flickr30k images and saves them to './data/feats.npy'.
- First, download the caffemodel and deploy.prototxt of VGG19. You can download those from here.
model.py : TensorFlow version of the model, with the following modifications:
- Added command-line arguments to make the script more convenient to run.
- test_single() handles a single image. If use_flickr=False, it only generates a caption for the image. If use_flickr=True, it randomly picks an image and its five reference captions from the flickr30k dataset, generates a caption, and calculates the BLEU score.
- test_multiple() handles multiple images. If use_flickr=False, it only generates captions for the images. If use_flickr=True, it randomly picks several images and their five reference captions each from the flickr30k dataset, generates the captions, and calculates the BLEU scores.
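To make the BLEU step concrete, here is a minimal sketch of sentence-level BLEU computed against several reference captions. It is not the code in model.py, which presumably relies on NLTK. The function names `ngrams` and `sentence_bleu` are illustrative, the example captions are made up, no smoothing is applied, and Python 3 is assumed.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(references, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU against several tokenized references."""
    if not hypothesis:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hypothesis, n)
        # Clip each hypothesis n-gram count by its maximum count in any reference.
        max_ref_counts = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        clipped = sum(min(count, max_ref_counts[gram])
                      for gram, count in hyp_counts.items())
        total = sum(hyp_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # no smoothing: an empty precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty uses the reference length closest to the hypothesis length.
    hyp_len = len(hypothesis)
    ref_len = min((len(r) for r in references),
                  key=lambda length: (abs(length - hyp_len), length))
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(sum(log_precisions) / max_n)


# Made-up captions standing in for flickr30k references and a generated caption.
references = [
    "a dog runs across the grass".split(),
    "a brown dog is running on the grass".split(),
]
hypothesis = "a dog runs across the grass".split()
score = sentence_bleu(references, hypothesis)
print(score)
```

Because the generated caption matches one reference exactly, every n-gram precision is 1 and the brevity penalty is 1, so the score is 1. Short captions with no matching 4-gram score 0 without smoothing, which is why library implementations usually offer a smoothing option.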

Getting Started

Training a Model

Run the training script.

python model.py --phase train

Checkpoints are saved periodically to the models/tensorflow folder.

Generating Captions and Optionally Calculating BLEU Scores

Your trained Show and Tell model can generate captions for any JPEG/PNG image. The following commands generate a caption for a single image or for several images.

python model.py --phase test_single --use_flickr False
python model.py --phase test_single --use_flickr True
# Generates the caption; with --use_flickr True, also calculates the BLEU score.
python model.py --phase test_multiple --use_flickr False
python model.py --phase test_multiple --use_flickr True
# Generates the captions; with --use_flickr True, also calculates the BLEU scores.
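The commands above pass `--use_flickr` as the literal strings `True` and `False`. The sketch below shows one way such flags can be parsed with the standard library. It is an illustration under assumptions, not the parser in model.py: the `str2bool` helper, the `build_parser` function, and the default values are hypothetical, and the real script may use a different mechanism.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False'-style strings to bool.

    argparse's type=bool would treat any non-empty string, including
    'False', as True, so an explicit converter is needed.
    """
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_parser():
    # Flag names follow the commands above; the defaults are assumptions.
    parser = argparse.ArgumentParser(description="Caption generator (illustrative)")
    parser.add_argument("--phase", default="train",
                        choices=["train", "test_single", "test_multiple"])
    parser.add_argument("--use_flickr", type=str2bool, default=False)
    return parser


args = build_parser().parse_args(["--phase", "test_single", "--use_flickr", "False"])
print(args.phase, args.use_flickr)  # prints: test_single False
```

The explicit converter matters because `bool("False")` is `True` in Python, so a naive `type=bool` would silently enable the flickr30k path for both commands.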

Downloading data/trained model

You might want to download the flickr30k dataset (images and annotations) from here.
Extracted FC7 features: download. These are used by the train() function in model.py. With them you can skip the feature extraction step.
Pretrained model: download. This is used by test_single() and test_multiple() in model.py. If you just want to try out captioning, download the model and run the tests.
TensorFlow VGG net: download. This file is used by test_single() and test_multiple() in model.py.

License

BSD license
