Text-to-Image

A model for synthesising photo-realistic images from textual descriptions.
Related research paper: StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

Model Architecture

The Stacked Generative Adversarial Network, or StackGAN, is an architecture that generates 256x256 photo-realistic images conditioned on their textual descriptions.
The complete architecture is composed of 2 GAN models:

Stage-I GAN

Given the encoded representation of the textual description of the image we want to generate, the Stage-I GAN generates a primitive, low-resolution 64x64 image.

Stage-II GAN

The Stage-II GAN takes the output of the Stage-I GAN and the textual description as input and generates a 256x256 image with photo-realistic details.
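The two-stage flow can be sketched as a toy pipeline in plain NumPy. This is only a shape-level illustration of the conditioning and upsampling path, not the actual StackGAN layers: the embedding/noise dimensions, the nearest-neighbour upsampling (standing in for learned deconvolutions), and the crude "conditioning" arithmetic are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def upsample2x(img):
    """Nearest-neighbour 2x upsampling: a stand-in for a learned deconv layer."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def stage1_generator(text_embedding, noise):
    """Toy Stage-I: map the conditioning vector + noise to a 64x64x3 'image'.
    The real Stage-I GAN uses conditioning augmentation and deconv layers."""
    seed = np.outer(text_embedding[:4], noise[:4]).reshape(4, 4, 1)
    img = np.tile(seed, (1, 1, 3))      # 4x4x3 feature seed
    for _ in range(4):                  # 4 -> 8 -> 16 -> 32 -> 64
        img = upsample2x(img)
    return np.tanh(img)                 # images in [-1, 1]

def stage2_generator(low_res, text_embedding):
    """Toy Stage-II: refine the 64x64 result to 256x256,
    conditioned on the same text embedding."""
    img = low_res + 0.01 * text_embedding[:3]  # crude 'conditioning' broadcast
    for _ in range(2):                         # 64 -> 128 -> 256
        img = upsample2x(img)
    return np.tanh(img)

emb = rng.standard_normal(128)   # pretend sentence embedding of the description
z = rng.standard_normal(100)     # noise vector
low = stage1_generator(emb, z)
high = stage2_generator(low, emb)
print(low.shape, high.shape)     # (64, 64, 3) (256, 256, 3)
```

The point of the sketch is the data flow: Stage-II consumes both the Stage-I output and the same text conditioning, which is what lets it correct defects in the low-resolution image while adding detail.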

StackGAN Architecture

Dataset

The dataset used is the CUB-200-2011 dataset, which contains 11,788 images of 200 bird species.

References