GQA: Visual Reasoning in the Real World

Data structure

├── Question Number
    ├── annotations
    │   ├── answer
    │   ├── fullAnswer
    │   └── question
    ├── answer
    ├── entailed
    ├── equivalent
    ├── fullAnswer
    ├── groups
    ├── imageId
    ├── isBalanced
    ├── question
    ├── semantic
    ├── semanticStr
    └── types
        ├── detailed
        ├── semantic
        └── structural

answer
imageId
question
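
The tree above describes a JSON object keyed by question number, and the three fields listed after it (answer, imageId, question) are enough to form a training triple. A minimal sketch of pulling them out is shown below. The sample record, its id `q0001`, its values, and the helper name `extract_triples` are hypothetical and exist only for illustration; they are not taken from the dataset or from this repository's code.

```python
import json

# Hypothetical miniature of a questions file: one record keyed by its
# question number, holding only a few of the fields from the tree above.
sample_json = """
{
  "q0001": {
    "imageId": "img_1",
    "question": "What color is the car?",
    "answer": "red",
    "fullAnswer": "The car is red.",
    "isBalanced": true
  }
}
"""


def extract_triples(questions):
    """Return one (imageId, question, answer) tuple per question entry."""
    return [
        (entry["imageId"], entry["question"], entry["answer"])
        for entry in questions.values()
    ]


questions = json.loads(sample_json)
triples = extract_triples(questions)
print(triples)
```

For a real questions file, `json.load` on the opened file would replace `json.loads(sample_json)`; the rest of the sketch stays the same.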

Network Architecture

Image-Question Aggregator

Pretrained image model
- Tensornets (GitHub)
Pretrained question model
- ELMo, loaded through tensorflow-hub
Attention model (we use only the attention module)
- Self-Attention Generative Adversarial Networks (paper)
- Attention (GitHub)
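
To make the attention module concrete, here is a minimal NumPy sketch of a self-attention layer in the style of the Self-Attention Generative Adversarial Networks paper. It is not the repository's TensorFlow code: the function names, the weight shapes, and the choice to treat the input as a flattened `(positions, channels)` matrix are assumptions made for illustration, and the final 1x1 convolution from the paper is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention(x, w_f, w_g, w_h, gamma=0.0):
    """SAGAN-style self-attention over a flattened feature map.

    x:    (N, C) features, one row per spatial position
    w_f:  (C, C_k) projection producing keys
    w_g:  (C, C_k) projection producing queries
    w_h:  (C, C) projection producing values
    gamma: scalar weight on the attention output (0 gives the identity)
    """
    f = x @ w_f                      # keys,    (N, C_k)
    g = x @ w_g                      # queries, (N, C_k)
    h = x @ w_h                      # values,  (N, C)
    scores = g @ f.T                 # scores[j, i]: query j against key i
    beta = softmax(scores, axis=-1)  # each row is a distribution over positions
    o = beta @ h                     # attention-weighted sum of values, (N, C)
    return gamma * o + x, beta       # residual connection scaled by gamma


# Usage with random (hypothetical) sizes and weights.
rng = np.random.RandomState(0)
N, C, C_k = 6, 8, 2
x = rng.randn(N, C)
w_f, w_g, w_h = rng.randn(C, C_k), rng.randn(C, C_k), rng.randn(C, C)
y, beta = self_attention(x, w_f, w_g, w_h, gamma=0.5)
print(y.shape, beta.shape)
```

With `gamma` set to 0 the layer returns its input unchanged, which is how the paper initializes it so that the network can rely on local features first and learn to weight the attention output gradually.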

Requirements

tensorflow-gpu==1.13.1
numpy==1.16.2
tensorflow-hub==0.4.0
python==3.7.3
cv2==4.0.0
tqdm==4.31.1

leaderj1001/Vision-Language

Folders and files

Latest commit

History

Repository files navigation

GQA: Visual Reasoning in the Real World

Data structure

Network Architecture

Image-Question Aggregator

Requirements

About

Topics

Resources

Stars

Watchers

Forks

Languages