This repository contains implementations of DCGANs on two datasets (CelebA and MNIST) for generating completely new-looking images of faces and digits, respectively.
To understand GANs, here is a short summary of the original paper (https://arxiv.org/pdf/1406.2661.pdf).
GANs take a game-theoretic approach: they learn to generate samples through a two-player game, which sidesteps the problem of an intractable density function. Unlike VAEs, this model does not assume an explicit density function.
Here, we have two networks:
1. Generator network: tries to fool the discriminator by generating real-looking images.
To learn the generator's distribution p_g over data x, we define a prior on input noise variables p_z(z), then represent a mapping to data space as G(z; θ_g), where G is a differentiable function represented by a multilayer perceptron with parameters θ_g.
2. Discriminator network: tries to distinguish between real and fake images.
It is a second multilayer perceptron D(x; θ_d) that outputs a single scalar. D(x) represents the probability that x came from the data rather than from p_g.
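The two networks can be sketched in PyTorch as follows. This is a minimal DCGAN-style sketch, not the exact notebook code: the layer widths, `z_dim=100`, and the 64×64 single-channel output size are illustrative assumptions.

```python
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Maps a noise vector z ~ p_z(z) to an image G(z; theta_g)."""

    def __init__(self, z_dim=100, channels=1, base=64):
        super().__init__()
        self.net = nn.Sequential(
            # z_dim x 1 x 1 -> (base*8) x 4 x 4
            nn.ConvTranspose2d(z_dim, base * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(base * 8),
            nn.ReLU(True),
            # -> (base*4) x 8 x 8
            nn.ConvTranspose2d(base * 8, base * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base * 4),
            nn.ReLU(True),
            # -> (base*2) x 16 x 16
            nn.ConvTranspose2d(base * 4, base * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base * 2),
            nn.ReLU(True),
            # -> base x 32 x 32
            nn.ConvTranspose2d(base * 2, base, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base),
            nn.ReLU(True),
            # -> channels x 64 x 64; tanh squashes pixels to [-1, 1]
            nn.ConvTranspose2d(base, channels, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)


class Discriminator(nn.Module):
    """Outputs a single scalar D(x; theta_d): probability that x is real."""

    def __init__(self, channels=1, base=64):
        super().__init__()
        self.net = nn.Sequential(
            # channels x 64 x 64 -> base x 32 x 32
            nn.Conv2d(channels, base, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base, base * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 2, base * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 4, base * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # (base*8) x 4 x 4 -> 1 x 1 x 1; sigmoid gives a probability
            nn.Conv2d(base * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x).view(-1)
```

Note the DCGAN conventions: strided (transposed) convolutions instead of pooling, batch norm in both networks, ReLU in the generator and LeakyReLU in the discriminator.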
Both players optimize the minimax value function

min_G max_D V(D, G) = E_{x∼p_data}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]

1. The discriminator (θ_d) tries to maximize the objective so that D(x) is close to 1 (real) and D(G(z)) is close to 0 (fake).
D is trained to maximize log D(x), which internally means labeling dataset images as real; D also maximizes log(1 − D(G(z))), which internally means labeling generated images as fake.
2. The generator (θ_g) tries to minimize the objective so that D(G(z)) is close to 1.
It tries to minimize log(1 − D(G(z))), adjusting θ_g (through gradient descent) so that it can fool the discriminator. In practice, rather than training G to minimize log(1 − D(G(z))), we train G to maximize log D(G(z)), because the latter provides strong gradients early in training, when the former's gradients are nearly flat.
Note: the objective attains its global minimum exactly when p_g = p_data.
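In practice, both objectives are usually implemented with binary cross-entropy. A sketch of the two losses (assuming PyTorch, with `d_real = D(x)` and `d_fake = D(G(z))` as probabilities; the function names are illustrative):

```python
import torch
import torch.nn.functional as F


def discriminator_loss(d_real, d_fake):
    """Maximizing log D(x) + log(1 - D(G(z))) is the same as
    minimizing BCE with targets 1 for real and 0 for fake."""
    real_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real))
    fake_loss = F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    return real_loss + fake_loss


def generator_loss(d_fake):
    """Non-saturating trick: maximize log D(G(z)) instead of minimizing
    log(1 - D(G(z))), i.e. BCE with target 1 on the fake scores.
    This gives much stronger gradients early in training."""
    return F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
```

Both losses go down as each player succeeds: `discriminator_loss` is small when real samples score near 1 and fakes near 0, and `generator_loss` is small when the fakes fool D into scoring near 1.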
Two disadvantages of this approach:
1. There is no explicit representation of p_g(x).
2. D must be kept well synchronized with G during training; otherwise training can fail (for example, G may collapse many values of z to the same output).
This concludes the paper summary.
The notebook you see here implements this idea with a convolutional architecture popularly known as DCGAN.
Here is the basic architecture of DCGANs:
Here I implemented DCGANs on the MNIST dataset and on CelebA (limited to 50,000 images due to lack of computational power).
Generator loss and discriminator loss are as follows:
Training goes as follows:
Note: the model is not yet getting good results, as it is facing mode collapse; I am still improving this.
Discriminator and generator loss functions are as follows:
Training goes as follows:
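The alternating training procedure can be sketched as a single PyTorch step. This is a sketch under assumptions, not the exact notebook code: `G` and `D` are modules where D outputs probabilities of shape `(batch,)`, and the names and `z_dim` are illustrative.

```python
import torch
import torch.nn.functional as F


def train_step(G, D, opt_g, opt_d, real, z_dim=100):
    """One alternating update: first the discriminator, then the generator."""
    batch = real.size(0)

    # --- discriminator step: maximize log D(x) + log(1 - D(G(z))) ---
    z = torch.randn(batch, z_dim, 1, 1)
    fake = G(z).detach()  # detach: do not backprop into G on D's step
    d_loss = (F.binary_cross_entropy(D(real), torch.ones(batch)) +
              F.binary_cross_entropy(D(fake), torch.zeros(batch)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # --- generator step: maximize log D(G(z)) (non-saturating loss) ---
    z = torch.randn(batch, z_dim, 1, 1)
    g_loss = F.binary_cross_entropy(D(G(z)), torch.ones(batch))
    opt_g.zero_grad()
    g_loss.backward()  # D also accumulates grads here; opt_d.zero_grad()
    opt_g.step()       # clears them at the start of the next step

    return d_loss.item(), g_loss.item()
```

Each epoch simply loops `train_step` over minibatches of real images; keeping the two updates balanced is exactly the "D must stay in sync with G" requirement from the paper summary above.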
CelebA result after the 100th epoch:
More advanced methods achieve much better results on human faces; for that, have a look at StyleGAN: https://arxiv.org/pdf/1812.04948.pdf