# DCGAN Implementation using PyTorch

A DCGAN implementation, written for learning.

Contents:
- How To Run
- Results
- Resources Used
- Notes
- Other Useful Resources
- TODO
## How To Run

- Clone this repo:

  ```sh
  git clone https://github.com/ashantanu/DCGAN.git
  ```

- Download and unzip the CelebA dataset into a folder named `celeba`. I used the snippet below from a Udacity Google Colab notebook:

  ```sh
  mkdir celeba && wget https://s3-us-west-1.amazonaws.com/udacity-dlnfd/datasets/celeba.zip
  ```

- Set the control parameters in `config.yml` (a hypothetical sketch of loading it follows this list).
- Run `main.py`.
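A hypothetical sketch of loading `config.yml`, assuming PyYAML is installed; the key names and default values below are illustrative guesses at typical DCGAN hyperparameters, not this repo's actual parameters:

```python
import yaml

with open("config.yml") as f:
    cfg = yaml.safe_load(f)

# Assumed key names -- check config.yml for the real ones.
batch_size = cfg.get("batch_size", 128)  # images per training batch
latent_dim = cfg.get("latent_dim", 100)  # size of the generator's input noise vector
num_epochs = cfg.get("num_epochs", 3)    # the Results section below used 3 epochs
lr = cfg.get("lr", 2e-4)                 # Adam learning rate from the DCGAN paper
beta1 = cfg.get("beta1", 0.5)            # Adam beta1 from the DCGAN paper
```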
## Results

Trained for 3 epochs on a GPU.

GPU training animation:

Loss curves (the x-axis unit is 100 iterations):
## Resources Used

- Dataloader tutorial
- Transforms
- ImageFolder
- DCGAN Paper
- GAN Paper
- PyTorch Tutorial
- PyTorch Layers
- Convolutions: Guide to Convolutions or This Blog
- Weight initialization and the PyTorch functions for it
- Weights in BatchNorm, and what `affine` means in BatchNorm
- Why `detach()` is used in this code, and why it is not used in the generator step: 1 and 2 (see the sketch at the end of this list)
- For reproducibility, manually set the random seeds of PyTorch and the other Python libraries. Refer to this for PyTorch reproducibility when using CUDA.
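As a companion to the last two items, here is a minimal sketch, assuming a standard DCGAN training loop with `BCELoss` and a discriminator that outputs one probability per image (as in the PyTorch DCGAN tutorial); the variable and network names are mine, not necessarily this repo's. It seeds the relevant RNGs and shows why `detach()` appears only in the discriminator step:

```python
import random

import numpy as np
import torch
import torch.nn as nn

# --- Reproducibility: seed every RNG in play; CUDA needs extra flags ---
seed = 42
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)                    # seeds the CPU and all CUDA devices
torch.backends.cudnn.deterministic = True  # pick deterministic cuDNN kernels
torch.backends.cudnn.benchmark = False     # disable non-deterministic autotuning

# --- Why detach() in the discriminator step but not the generator step ---
criterion = nn.BCELoss()

def train_step(netD, netG, optD, optG, real, latent_dim=100):
    b = real.size(0)
    noise = torch.randn(b, latent_dim, 1, 1, device=real.device)
    fake = netG(noise)

    # Discriminator step: detach() cuts the graph at `fake`, so this backward
    # pass does not waste time computing gradients through the generator, and
    # the graph through G stays intact for reuse in the generator step below.
    optD.zero_grad()
    loss_d = criterion(netD(real), torch.ones(b, device=real.device)) + \
             criterion(netD(fake.detach()), torch.zeros(b, device=real.device))
    loss_d.backward()
    optD.step()

    # Generator step: no detach() here, because gradients must flow back
    # through D and into G's parameters for G to learn.
    optG.zero_grad()
    loss_g = criterion(netD(fake), torch.ones(b, device=real.device))
    loss_g.backward()
    optG.step()
    return loss_d.item(), loss_g.item()
```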
## Notes

- GAN notes here
- Transpose convolution: roughly the opposite of a convolution; it increases spatial size, e.g. mapping a 1x1 input to a 3x3 output (see the sketch after this list).
- Upsampling: the opposite of pooling. It fills in new pixels by copying existing pixel values, using nearest-neighbor or some other interpolation method.
- To track the generator's learning progress, generate a fixed batch of latent vectors. Passing the same batch through the generator periodically visualizes how the generator improves.
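A small sketch demonstrating all three notes above; the layer parameters and the latent size of 100 are illustrative assumptions, not necessarily the repo's:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 1, 1)  # (batch, channels, height, width)

# Transpose convolution: a 3x3 kernel maps the 1x1 input up to a 3x3 output.
tconv = nn.ConvTranspose2d(in_channels=8, out_channels=4, kernel_size=3)
print(tconv(x).shape)  # torch.Size([1, 4, 3, 3])

# Upsampling: copies pixel values (here nearest-neighbor); no learned weights.
up = nn.Upsample(scale_factor=3, mode="nearest")
print(up(x).shape)     # torch.Size([1, 8, 3, 3])

# Fixed latent batch: sample once, reuse it to snapshot the generator's progress.
fixed_noise = torch.randn(64, 100, 1, 1)  # 64 vectors of an assumed latent size 100
# every N iterations:  with torch.no_grad(): imgs = netG(fixed_noise)
```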
## Other Useful Resources

- GAN Hacks
- PyTorch autograd tutorials
- How PyTorch autograd works
- Google Colab: keeping the session connected, adding data, saving the model
- Why GANs are hard to train
Snippet to keep Colab running:

```js
function ClickConnect(){
    console.log("Clicked on connect button");
    document.querySelector("colab-toolbar-button").click() // change the selector here if needed
}
setInterval(ClickConnect, 60000)
```
## TODO

- Check what the `dilation` parameter of a `Conv2d` layer does
- Check how back-propagation works in transpose convolution
- Weight initialization should use values from the config (see the sketch after this list)
- Understand weight initialization in BatchNorm: how does it work, what is `affine` used for, and how should it be initialized properly?
- Is there a choice to be made when sampling the latent variable?
- Check why the 1024-channel layer is skipped in the tutorial
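For the two weight-initialization items above, a minimal sketch of a DCGAN-paper-style init (normal with std 0.02) that pulls its constants from a config dict instead of hard-coding them; the config key names are assumptions:

```python
import torch.nn as nn

def make_weights_init(cfg):
    """Build a weights_init function whose constants come from the config
    (the key names here are assumed, not necessarily this repo's)."""
    std = cfg.get("init_std", 0.02)         # DCGAN paper uses N(0, 0.02)
    bn_mean = cfg.get("bn_init_mean", 1.0)  # BatchNorm scale starts near identity

    def weights_init(m):
        classname = m.__class__.__name__
        if classname.find("Conv") != -1:  # matches Conv2d and ConvTranspose2d
            nn.init.normal_(m.weight.data, 0.0, std)
        elif classname.find("BatchNorm") != -1:
            # With affine=True, BatchNorm has a learnable scale (weight) and
            # shift (bias); initialize the scale around 1 and the shift at 0.
            nn.init.normal_(m.weight.data, bn_mean, std)
            nn.init.constant_(m.bias.data, 0.0)

    return weights_init

# usage:  netG.apply(make_weights_init(cfg));  netD.apply(make_weights_init(cfg))
```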