chiral-carbon/ImageCaptioning


An implementation of image captioning, with and without a soft attention model, on the Flickr8k dataset.

This is my implementation of the Show, Attend and Tell paper.

I took assistance from this blog post: https://machinelearningmastery.com/develop-a-deep-learning-caption-generation-model-in-python/

You can see my implementation in this Kaggle kernel. The attention model was not successfully implemented, so I trained my model without it.
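For reference, the soft attention mechanism from Show, Attend and Tell computes a context vector as a softmax-weighted sum of the CNN's annotation vectors. Below is a minimal NumPy sketch of that idea; the function name, parameter names, and shapes are my own illustrative choices, not taken from the repository's code.

```python
import numpy as np

def soft_attention(features, hidden, W_f, W_h, v):
    """Bahdanau-style soft attention, as used in Show, Attend and Tell.

    features: (L, D) annotation vectors from the CNN, one per image region
    hidden:   (H,)  current decoder hidden state
    W_f, W_h, v: projection parameters (hypothetical shapes, for illustration)
    Returns the context vector (D,) and the attention weights (L,).
    """
    # score each image region against the current hidden state
    scores = np.tanh(features @ W_f + hidden @ W_h) @ v   # (L,)
    # softmax over regions, so the weights sum to 1
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # expected context vector: weighted sum of the annotation vectors
    context = weights @ features                          # (D,)
    return context, weights

# toy shapes: 49 regions (a 7x7 feature map), 512-d features
rng = np.random.default_rng(0)
L, D, H, A = 49, 512, 256, 64
features = rng.standard_normal((L, D))
hidden = rng.standard_normal(H)
W_f = rng.standard_normal((D, A)) * 0.01
W_h = rng.standard_normal((H, A)) * 0.01
v = rng.standard_normal(A)

context, weights = soft_attention(features, hidden, W_f, W_h, v)
print(context.shape)  # (512,)
```

The decoder would concatenate this context vector with the word embedding at each time step; the trained model skips this step and conditions on a single global image feature instead.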

The highest BLEU scores after 20 epochs were:

- BLEU-1: 53.0076%
- BLEU-2: 28.6551%
- BLEU-3: 19.7607%
- BLEU-4: 9.4241%
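BLEU-1 through BLEU-4 scores like these can be computed with NLTK's `corpus_bleu`, where the weights select how much each n-gram order contributes. A small sketch with toy captions (the reference and candidate sentences here are made up for illustration):

```python
from nltk.translate.bleu_score import corpus_bleu

# references: a list of reference captions (tokenized) per image;
# candidates: one generated caption per image. Toy data, not Flickr8k.
references = [[['a', 'dog', 'runs', 'on', 'the', 'beach']]]
candidates = [['a', 'dog', 'runs', 'on', 'sand']]

print('BLEU-1: %f' % corpus_bleu(references, candidates, weights=(1.0, 0, 0, 0)))
print('BLEU-2: %f' % corpus_bleu(references, candidates, weights=(0.5, 0.5, 0, 0)))
print('BLEU-3: %f' % corpus_bleu(references, candidates, weights=(0.33, 0.33, 0.33, 0)))
print('BLEU-4: %f' % corpus_bleu(references, candidates, weights=(0.25, 0.25, 0.25, 0.25)))
```

Scores drop as the n-gram order rises because longer exact matches are rarer, which is the same pattern as the numbers above.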

This is a first implementation and will be optimized further.
