viddesc

Real-time video description chatbot for visual and cognitive assistance.

Description

Uses a PyTorch video description model trained on the COCO, YouTube8M, and MPII Movie Description datasets. The model is based on the PyTorch advanced tutorial on image description. Dialog management is handled by [nlpia](http://github.com/totalgood/nlpia).
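
This README does not show how an image description model is applied to video. One common approach, sketched here as an assumption rather than as what viddesc actually does, is to sample a fixed number of frames evenly across a clip, encode each frame with the image model, and average the per-frame features before decoding a caption. The plain-Python example below covers only the sampling and pooling steps; `sample_frame_indices` and `mean_pool` are hypothetical helper names, not part of this repository.

```python
from typing import List, Sequence


def sample_frame_indices(num_frames: int, num_samples: int) -> List[int]:
    """Pick up to `num_samples` frame indices spread evenly over a clip."""
    if num_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= num_frames:
        # Short clip: use every frame.
        return list(range(num_frames))
    # Split the clip into equal-width segments and take each midpoint.
    step = num_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


def mean_pool(features: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors into one clip-level vector."""
    if not features:
        return []
    dim = len(features[0])
    return [sum(vec[d] for vec in features) / len(features) for d in range(dim)]
```

In a real pipeline the per-frame vectors would come from a CNN encoder and the pooled vector would feed a recurrent decoder. Mean pooling discards frame order, so a model that needs temporal structure would replace it with attention over frames or a recurrent encoder.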

References

  • <a href="https://arxiv.org/pdf/1611.07810.pdf">"A dataset and exploration of models for understanding video data through fill-in-the-blank question-answering"</a> by Maharaj et al. at <a href="https://www.mpi-inf.mpg.de">MPII</a>.
  • <a href="https://arxiv.org/pdf/1502.08029.pdf">"Describing Videos by Exploiting Temporal Structure"</a> by Yao et al. at the University of Montreal.

About

Video Description Dataset downloaders and preprocessors
