Deep Learning Code
This notebook trains a model that generates captions for images. A CNN extracts features from each image, and an RNN learns from the corresponding caption text. The image and text features are then merged and passed through dense layers, which are trained to produce the caption.
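The merge step can be illustrated with a small forward pass in plain Python. This is only a sketch under stated assumptions, not the model built later in the notebook: the layer sizes, the random weights, the simple tanh RNN, the element-wise sum used for merging, and the name `next_word_probs` are all hypothetical choices made for illustration.

```python
import math
import random

random.seed(0)

# Hypothetical, deliberately tiny sizes chosen for illustration only.
PHOTO_DIM, VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 8, 6, 4, 5


def rand_matrix(rows, cols):
    """Small random weights standing in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


W_photo = rand_matrix(HIDDEN_DIM, PHOTO_DIM)    # photo branch: dense layer
embedding = rand_matrix(VOCAB_SIZE, EMBED_DIM)  # text branch: word embeddings
W_in = rand_matrix(HIDDEN_DIM, EMBED_DIM)       # text branch: RNN input weights
W_rec = rand_matrix(HIDDEN_DIM, HIDDEN_DIM)     # text branch: RNN recurrent weights
W_dec = rand_matrix(HIDDEN_DIM, HIDDEN_DIM)     # decoder: hidden dense layer
W_out = rand_matrix(VOCAB_SIZE, HIDDEN_DIM)     # decoder: output layer


def next_word_probs(photo_features, word_ids):
    """Return a probability distribution over the next caption word."""
    # Photo branch: project the CNN feature vector with a ReLU dense layer.
    photo_vec = [max(0.0, v) for v in matvec(W_photo, photo_features)]

    # Text branch: run a simple tanh RNN over the caption words so far.
    state = [0.0] * HIDDEN_DIM
    for word_id in word_ids:
        from_input = matvec(W_in, embedding[word_id])
        from_state = matvec(W_rec, state)
        state = [math.tanh(a + b) for a, b in zip(from_input, from_state)]

    # Merge: element-wise sum of the two encodings (an assumed choice),
    # followed by the dense decoder layers.
    merged = [p + s for p, s in zip(photo_vec, state)]
    hidden = [max(0.0, v) for v in matvec(W_dec, merged)]
    return softmax(matvec(W_out, hidden))


photo_features = [random.random() for _ in range(PHOTO_DIM)]
probs = next_word_probs(photo_features, [1, 3, 2])
```

Here `probs` holds one probability per vocabulary word. The point of the sketch is the data flow: the image never passes through the RNN, and the two branches only meet at the merge step.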
This notebook is divided into six parts:
- Download photo and caption dataset
- Prepare photo data
- Prepare text data
- Develop deep learning model with progressive data loading
- Evaluate model
- Generate new captions
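Progressive data loading means producing training samples for one photo at a time instead of holding the whole dataset in memory. A minimal sketch of the idea in plain Python follows. The names `expand_caption` and `data_generator`, the dictionary layout, and the use of `0` as the padding ID are assumptions made for illustration, and `max_length` is assumed to be at least as long as the longest caption.

```python
def expand_caption(word_ids, max_length):
    """Turn one encoded caption into (padded prefix, next word) pairs."""
    pairs = []
    for i in range(1, len(word_ids)):
        prefix, target = word_ids[:i], word_ids[i]
        # Left-pad with 0, which is assumed to be reserved for padding.
        padded = [0] * (max_length - len(prefix)) + prefix
        pairs.append((padded, target))
    return pairs


def data_generator(captions, photo_features, max_length):
    """Yield the training samples for one photo at a time, looping forever."""
    while True:
        for photo_id, encoded_captions in captions.items():
            features = photo_features[photo_id]
            batch = []
            for word_ids in encoded_captions:
                for padded, target in expand_caption(word_ids, max_length):
                    batch.append((features, padded, target))
            yield batch


# Toy data: one photo with one caption already encoded as integer word IDs.
captions = {"photo_1": [[1, 4, 5, 2]]}
photo_features = {"photo_1": [0.1, 0.9]}

generator = data_generator(captions, photo_features, max_length=4)
batch = next(generator)
```

Each caption of n words expands into n - 1 samples, so the toy caption above yields three samples. Because the generator builds them on demand, only one photo's samples exist in memory at any moment.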
The following research papers are provided for reference:
- Where to put the Image in an Image Caption Generator (https://arxiv.org/abs/1703.09137)
- Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics (https://www.ijcai.org/Proceedings/15/Papers/593.pdf)