pha123661/Image-Captioning-and-Attention-Visualization

Image Captioning and Attention Visualization

Image captioning with a pretrained DeiT v3 encoder on a subset of the MSCOCO dataset

  • CIDEr score: 0.9413
  • CLIP score: 0.7310

Attention map visualization for image captioning:

[Image: girl]
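As a rough illustration of how a patch-level attention map can be turned into an image-sized heatmap, here is a minimal sketch. It is not taken from this repository: the function name `attention_to_heatmap`, the assumption that there is one attention weight per image patch (class token already dropped), and the nearest-neighbor upsampling are all hypothetical choices made for the example.

```python
def attention_to_heatmap(weights, grid_h, grid_w, scale):
    """Turn flat per-patch attention weights into an upsampled 2D heatmap.

    Hypothetical helper for illustration only; assumes `weights` holds one
    value per image patch in row-major order (no class token).
    """
    if len(weights) != grid_h * grid_w:
        raise ValueError("expected one weight per patch")

    # Min-max normalize to [0, 1] so the map can be used as an overlay alpha.
    lo, hi = min(weights), max(weights)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant map
    norm = [(w - lo) / span for w in weights]

    # Reshape the flat list into grid_h rows of grid_w values.
    grid = [norm[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]

    # Nearest-neighbor upsampling: repeat each value `scale` times per axis.
    heatmap = []
    for row in grid:
        wide = [v for v in row for _ in range(scale)]
        heatmap.extend(list(wide) for _ in range(scale))
    return heatmap


# Example: a 2x2 patch grid upsampled by a factor of 2 gives a 4x4 heatmap.
heat = attention_to_heatmap([0.1, 0.3, 0.2, 0.4], grid_h=2, grid_w=2, scale=2)
```

One such heatmap per generated word could then be blended over the input image. A smoother interpolation (for example bilinear) would likely look better than the nearest-neighbor repeat used here.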

See problems 2 and 3 in Report.pdf and Spec.pdf for more details.
