
MNIST-GAN


MNIST dataset

The MNIST database is available at http://yann.lecun.com/exdb/mnist/

The MNIST database is a dataset of handwritten digits. It has 60,000 training samples and 10,000 test samples. Each image is 28x28 pixels, and each pixel holds a grayscale value between 0 and 255.

It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.

Thanks to Yann LeCun, Corinna Cortes, and Christopher J.C. Burges.

The Discriminator architecture

  • The discriminator is going to be a typical linear classifier.
  • The activation function we will be using is Leaky ReLU.

Why leaky ReLU?

  • We should use a leaky ReLU to allow gradients to flow backward through the layer unhindered. A leaky ReLU is like a normal ReLU, except that there is a small non-zero output for negative input values.
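
A minimal sketch of such a discriminator in PyTorch. The layer sizes and the negative slope of 0.2 are illustrative assumptions, not necessarily the exact values used in this repository:

    import torch.nn as nn
    import torch.nn.functional as F

    class Discriminator(nn.Module):
        def __init__(self, input_size=784, hidden_dim=128, output_size=1):
            super().__init__()
            # fully connected layers that shrink toward a single logit
            self.fc1 = nn.Linear(input_size, hidden_dim * 4)
            self.fc2 = nn.Linear(hidden_dim * 4, hidden_dim * 2)
            self.fc3 = nn.Linear(hidden_dim * 2, hidden_dim)
            self.fc4 = nn.Linear(hidden_dim, output_size)
            self.dropout = nn.Dropout(0.3)  # dropout to help generalization (see below)

        def forward(self, x):
            x = x.view(-1, 784)  # flatten the 28x28 input images
            x = self.dropout(F.leaky_relu(self.fc1(x), 0.2))
            x = self.dropout(F.leaky_relu(self.fc2(x), 0.2))
            x = self.dropout(F.leaky_relu(self.fc3(x), 0.2))
            return self.fc4(x)  # raw logit; the sigmoid lives inside BCEWithLogitsLoss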

The Generator architecture

  • The generator uses latent samples to make fake images. These latent samples are vectors which are mapped to the fake images.
  • The activation function for the hidden layers remains the same, except that we will be using Tanh at the output.

Why Tanh at the output?

  • The generator has been found to perform best with tanh for the generator output, which scales the output to be between -1 and 1 instead of 0 and 1.
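
A matching generator sketch under the same assumptions (hidden sizes are illustrative), with Leaky ReLU in the hidden layers and Tanh at the output:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Generator(nn.Module):
        def __init__(self, z_size=100, hidden_dim=32, output_size=784):
            super().__init__()
            # fully connected layers that grow from the latent vector to a flattened image
            self.fc1 = nn.Linear(z_size, hidden_dim)
            self.fc2 = nn.Linear(hidden_dim, hidden_dim * 2)
            self.fc3 = nn.Linear(hidden_dim * 2, hidden_dim * 4)
            self.fc4 = nn.Linear(hidden_dim * 4, output_size)
            self.dropout = nn.Dropout(0.3)

        def forward(self, x):
            x = self.dropout(F.leaky_relu(self.fc1(x), 0.2))
            x = self.dropout(F.leaky_relu(self.fc2(x), 0.2))
            x = self.dropout(F.leaky_relu(self.fc3(x), 0.2))
            return torch.tanh(self.fc4(x))  # outputs in [-1, 1], matching the scaled real images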

Scaling images

  • We want the output of the generator to be comparable to the real images' pixel values, which are normalized to values between 0 and 1. Since the generator's Tanh output lies between -1 and 1, we will also scale our real input images to have pixel values between -1 and 1 when we train the discriminator. This will be done during the training phase.
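
A small helper for this rescaling, assuming the real images arrive with values in [0, 1] (e.g. from torchvision's ToTensor):

    def scale(x, feature_range=(-1, 1)):
        # map x from [0, 1] to the feature range, by default [-1, 1],
        # so real images match the generator's Tanh output
        lo, hi = feature_range
        return x * (hi - lo) + lo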

Generalization

  • To help the discriminator generalize better, the real labels are reduced slightly from 1.0 to 0.9 (label smoothing). For this, we'll use the parameter smooth; if True, we smooth our labels. In PyTorch, this looks like: labels = torch.ones(size) * 0.9
  • We also made use of dropout layers to avoid overfitting.

Loss calculation

  • The discriminator's goal is to output a 1 for real and 0 for fake images. On the other hand, the generator wants to make fake images that closely resemble the real ones.
  • Thus, if D(x) denotes the discriminator's output for an image x, the two goals can be stated as follows:

The goal of the discriminator: D(real_images) = 1 and D(fake_images) = 0

The goal of the generator: D(fake_images) = 1, i.e. the generator wants its fake images to be classified as real

  • We will use BCEWithLogitsLoss, which combines a sigmoid activation function (we want the discriminator to output a value 0–1 indicating whether an image is real or fake) and binary cross-entropy loss.
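
A sketch of the two loss helpers built on BCEWithLogitsLoss, with the optional label smoothing described above (the helper names real_loss and fake_loss are illustrative):

    import torch
    import torch.nn as nn

    criterion = nn.BCEWithLogitsLoss()  # sigmoid + binary cross-entropy in one call

    def real_loss(d_out, smooth=False):
        batch_size = d_out.size(0)
        # targets are 1 for real images, or 0.9 when label smoothing is enabled
        labels = torch.ones(batch_size) * (0.9 if smooth else 1.0)
        return criterion(d_out.squeeze(), labels)

    def fake_loss(d_out):
        batch_size = d_out.size(0)
        labels = torch.zeros(batch_size)  # targets are 0 for fake images
        return criterion(d_out.squeeze(), labels)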

Training

  • As mentioned earlier, Adam is a suitable optimizer.
  • The generator takes in a vector z and outputs fake images. The discriminator alternates between training on the real images and on the fake images produced by the generator.
  • Steps involved in discriminator training (a training-loop sketch combining both lists follows below):
  1. We first compute the loss on real images
  2. Generate fake images
  3. Compute loss on fake images
  4. Add the loss of the real and fake images
  5. Perform backpropagation and update weights of the discriminator
  • Steps involved in generator training:
  1. Generate fake images
  2. Compute loss on fake images with flipped labels (the generator wants the discriminator to classify its fakes as real)
  3. Perform backpropagation and update the weights of the generator.
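
Putting the two lists together, a training-loop sketch. It assumes the D, G, scale, real_loss, and fake_loss pieces from the sketches above, plus a train_loader over MNIST and a chosen num_epochs; the learning rate and latent size are assumed values:

    import torch
    import torch.optim as optim

    d_optimizer = optim.Adam(D.parameters(), lr=0.002)  # Adam for both networks
    g_optimizer = optim.Adam(G.parameters(), lr=0.002)
    z_size = 100

    for epoch in range(num_epochs):
        for real_images, _ in train_loader:
            batch_size = real_images.size(0)
            real_images = scale(real_images)  # rescale real images to [-1, 1]

            # --- discriminator step ---
            d_optimizer.zero_grad()
            r_loss = real_loss(D(real_images), smooth=True)   # 1. loss on real images
            z = torch.randn(batch_size, z_size)
            fake_images = G(z)                                # 2. generate fake images
            f_loss = fake_loss(D(fake_images.detach()))       # 3. loss on fake images
            d_loss = r_loss + f_loss                          # 4. add real and fake losses
            d_loss.backward()                                 # 5. backprop and update D
            d_optimizer.step()

            # --- generator step ---
            g_optimizer.zero_grad()
            z = torch.randn(batch_size, z_size)
            fake_images = G(z)                                # 1. generate fake images
            g_loss = real_loss(D(fake_images))                # 2. flipped labels: G wants D to output 1
            g_loss.backward()                                 # 3. backprop and update G
            g_optimizer.step()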

Training loss

  • We shall plot generator and discriminator losses against the number of epochs.
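
A minimal plotting sketch, assuming the per-epoch losses were collected into two lists named g_losses and d_losses during training:

    import matplotlib.pyplot as plt

    plt.plot(g_losses, label='Generator loss')
    plt.plot(d_losses, label='Discriminator loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()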

Samples generated by the generator

At the start

Over time

  • This way the generator starts out with noisy images and learns over time.

Conclusions

  • Since Ian Goodfellow and his colleagues at the University of Montreal introduced GANs, they have exploded in popularity, and the range of applications is remarkable. GANs have since been improved by many variants, such as CycleGAN, Conditional GAN, and Progressive GAN.