A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

anandroid/pythia

 
 


This is an implementation for vision and language multimodal research developed on top of Pythia

Pythia

Model Implementation

We propose a few major improvements over Pythia, which provides an implementation of the LoRRA model. We expand the dynamic answer space by adding an instance segmentation module. We also replace the existing OCR with a spell-correcting OCR and add the spatial features of the OCR output. Finally, we add n-grams built from the modified OCR output.
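To illustrate the n-gram and spatial-feature ideas, here is a minimal sketch, not the code used in this repository. It assumes OCR output arrives as a list of tokens with pixel bounding boxes in `(x_min, y_min, x_max, y_max)` form. The names `ocr_ngrams` and `normalize_box` and the box layout are hypothetical.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def ocr_ngrams(tokens: List[str], boxes: List[Box], max_n: int = 2) -> List[Tuple[str, Box]]:
    """Return (text, box) pairs for every run of 1..max_n consecutive OCR tokens.

    The box of an n-gram is the smallest box enclosing its tokens' boxes.
    """
    if len(tokens) != len(boxes):
        raise ValueError("tokens and boxes must have the same length")
    out = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            span = boxes[i:i + n]
            merged = (
                min(b[0] for b in span),
                min(b[1] for b in span),
                max(b[2] for b in span),
                max(b[3] for b in span),
            )
            out.append((" ".join(tokens[i:i + n]), merged))
    return out


def normalize_box(box: Box, width: float, height: float) -> Box:
    """Scale a pixel box by the image size to get a simple spatial feature."""
    x0, y0, x1, y1 = box
    return (x0 / width, y0 / height, x1 / width, y1 / height)
```

For example, calling `ocr_ngrams(["coca", "cola"], [(0, 0, 10, 5), (12, 0, 20, 5)])` yields the two single tokens plus the bigram `"coca cola"` with a box enclosing both words, so a multi-word answer can be added to the answer space as one candidate.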


Test

To test the model, run our notebook.
