Text-to-Image

A model for synthesising photo-realistic images given their textual descriptions.
Related research paper: StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

Model Architecture

Stacked Generative Adversarial Network (StackGAN) is an architecture for generating 256x256 photo-realistic images conditioned on textual descriptions.
The complete architecture is composed of two GANs:

Stage-I GAN

Given an encoded representation of the textual description of the image we want to generate, the Stage-I GAN sketches a primitive, low-resolution 64x64 image.
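Below is a minimal sketch of what a Stage-I generator can look like, written in PyTorch with illustrative layer sizes; the 1024-d text embedding and the conditioning-augmentation dimensions are assumptions for the example, not necessarily the exact values used in this repository.

import torch
import torch.nn as nn

class StageIGenerator(nn.Module):
    def __init__(self, embed_dim=1024, cond_dim=128, z_dim=100, ngf=64):
        super().__init__()
        self.ngf = ngf
        # Conditioning augmentation: predict a mean and log-variance from the
        # text embedding and sample a smoothed conditioning vector from them.
        self.ca = nn.Linear(embed_dim, cond_dim * 2)
        self.fc = nn.Sequential(
            nn.Linear(cond_dim + z_dim, ngf * 8 * 4 * 4),
            nn.BatchNorm1d(ngf * 8 * 4 * 4),
            nn.ReLU(inplace=True),
        )

        # Four upsampling blocks: 4x4 -> 8x8 -> 16x16 -> 32x32 -> 64x64
        def up(in_c, out_c):
            return nn.Sequential(
                nn.Upsample(scale_factor=2, mode="nearest"),
                nn.Conv2d(in_c, out_c, 3, padding=1),
                nn.BatchNorm2d(out_c),
                nn.ReLU(inplace=True),
            )

        self.upsample = nn.Sequential(
            up(ngf * 8, ngf * 4),
            up(ngf * 4, ngf * 2),
            up(ngf * 2, ngf),
            up(ngf, ngf),
            nn.Conv2d(ngf, 3, 3, padding=1),
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, text_embedding, z):
        mu, logvar = self.ca(text_embedding).chunk(2, dim=1)
        c = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterisation
        h = self.fc(torch.cat([c, z], dim=1)).view(-1, self.ngf * 8, 4, 4)
        return self.upsample(h), mu, logvar  # mu/logvar feed a KL regularisation term

Sampling the conditioning vector instead of using the raw embedding is the conditioning augmentation described in the StackGAN paper; the returned mu and logvar are used to add a KL-divergence penalty that keeps the conditioning distribution close to a standard Gaussian.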

Stage-II GAN

The Stage-II GAN takes the Stage-I output and the textual description as input and generates a 256x256 image with photo-realistic details.
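A corresponding minimal sketch of a Stage-II generator is below, again in PyTorch with illustrative layer sizes: the 64x64 Stage-I result is encoded down to a 16x16 feature map, fused with the spatially replicated text conditioning vector, refined, and upsampled to 256x256. The chaining example at the end assumes the StageIGenerator sketch above.

import torch
import torch.nn as nn

class StageIIGenerator(nn.Module):
    def __init__(self, cond_dim=128, ngf=64, n_res=2):
        super().__init__()
        # Encode the low-resolution image: 64x64 -> 16x16
        self.encode = nn.Sequential(
            nn.Conv2d(3, ngf, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ngf, ngf * 2, 4, stride=2, padding=1),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(inplace=True),
            nn.Conv2d(ngf * 2, ngf * 4, 4, stride=2, padding=1),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(inplace=True),
        )
        # Fuse image features with the replicated text conditioning vector
        self.fuse = nn.Sequential(
            nn.Conv2d(ngf * 4 + cond_dim, ngf * 4, 3, padding=1),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(inplace=True),
        )
        # A stack of conv blocks wrapped by a single skip connection
        # (a simplification of the per-block residuals used in StackGAN)
        self.res = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(ngf * 4, ngf * 4, 3, padding=1),
                nn.BatchNorm2d(ngf * 4),
                nn.ReLU(inplace=True),
                nn.Conv2d(ngf * 4, ngf * 4, 3, padding=1),
                nn.BatchNorm2d(ngf * 4),
            )
            for _ in range(n_res)
        ])

        # Upsample 16x16 -> 256x256 (four doublings)
        def up(in_c, out_c):
            return nn.Sequential(
                nn.Upsample(scale_factor=2, mode="nearest"),
                nn.Conv2d(in_c, out_c, 3, padding=1),
                nn.BatchNorm2d(out_c),
                nn.ReLU(inplace=True),
            )

        self.decode = nn.Sequential(
            up(ngf * 4, ngf * 2),
            up(ngf * 2, ngf),
            up(ngf, ngf),
            up(ngf, ngf),
            nn.Conv2d(ngf, 3, 3, padding=1),
            nn.Tanh(),
        )

    def forward(self, stage1_img, cond):
        h = self.encode(stage1_img)                              # (B, 4*ngf, 16, 16)
        c = cond[:, :, None, None].expand(-1, -1, h.size(2), h.size(3))
        h = self.fuse(torch.cat([h, c], dim=1))
        h = h + self.res(h)                                      # skip connection
        return self.decode(h)                                    # (B, 3, 256, 256)

Chaining the two sketches end to end with a batch of 4 random embeddings and noise vectors:

g1, g2 = StageIGenerator(), StageIIGenerator()
emb, z = torch.randn(4, 1024), torch.randn(4, 100)
low_res, mu, logvar = g1(emb, z)   # (4, 3, 64, 64)
high_res = g2(low_res, mu)         # (4, 3, 256, 256)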

Figure: StackGAN architecture.

Dataset

The dataset used is the CUB-200-2011 birds dataset, which contains 11,788 images of 200 bird species.
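As an illustration of how image-text pairs might be fed to the two stages, the sketch below pairs CUB images with pre-computed text embeddings. The directory layout and the pickle mapping image paths to embeddings are hypothetical; the actual preprocessing used by this repository may differ.

import os
import pickle

import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class CUBTextDataset(Dataset):
    def __init__(self, image_dir, embedding_path, image_size=64):
        # embedding_path: pickle mapping a relative image path to its text embedding
        with open(embedding_path, "rb") as f:
            self.embeddings = pickle.load(f)
        self.image_dir = image_dir
        self.keys = sorted(self.embeddings)
        self.transform = transforms.Compose([
            transforms.Resize(image_size),
            transforms.CenterCrop(image_size),
            transforms.ToTensor(),
            transforms.Normalize([0.5] * 3, [0.5] * 3),  # match the generator's Tanh range
        ])

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, idx):
        key = self.keys[idx]
        image = Image.open(os.path.join(self.image_dir, key)).convert("RGB")
        return self.transform(image), torch.as_tensor(self.embeddings[key])

Setting image_size=64 yields real samples for the Stage-I discriminator; a second instance with image_size=256 would serve Stage-II.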

References

Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas. StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks. ICCV 2017.
