
This project aims to analyze interpretable controls for GAN models. Specifically, we study the GANSpace paper and explore improvements to its method while addressing the drawbacks of using PCA in GANSpace. Furthermore, we study why GAN performance improves when the Wasserstein metric is used instead of the JS divergence or maximum-likelihood estimation (MLE).


GANSpace - Discovering Interpretable GAN Controls

Using https://github.com/harskish/ganspace to find latent directions in a StyleGAN2 model trained on the pizza10 dataset.

Erik Härkönen¹,², Aaron Hertzmann², Jaakko Lehtinen¹,³, Sylvain Paris²
¹Aalto University, ²Adobe Research, ³NVIDIA
https://arxiv.org/abs/2004.02546

Abstract: This paper describes a simple technique to analyze Generative Adversarial Networks (GANs) and create interpretable controls for image synthesis, such as change of viewpoint, aging, lighting, and time of day. We identify important latent directions based on Principal Components Analysis (PCA) applied in activation space. Then, we show that interpretable edits can be defined based on layer-wise application of these edit directions. Moreover, we show that BigGAN can be controlled with layer-wise inputs in a StyleGAN-like manner. A user may identify a large number of interpretable controls with these mechanisms. We demonstrate results on GANs from various datasets.
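The PCA step the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: random placeholder vectors stand in for the intermediate latents that GANSpace would actually sample from StyleGAN's mapping network, and the edit strength `alpha` is an arbitrary example value.

```python
import numpy as np

# Minimal sketch of GANSpace-style direction discovery (placeholder data,
# not the authors' code). Random vectors stand in for sampled latents w.
rng = np.random.default_rng(0)
num_samples, latent_dim = 10_000, 512
W = rng.normal(size=(num_samples, latent_dim))  # placeholder for sampled latents

# PCA via SVD of the mean-centered samples: the rows of Vt are the principal
# directions, ordered by explained variance.
W_centered = W - W.mean(axis=0, keepdims=True)
_, S, Vt = np.linalg.svd(W_centered, full_matrices=False)
directions = Vt[:4]  # top-4 edit directions (as in Figure 1)

# An edit moves a latent along a component with some strength alpha.
w = rng.normal(size=latent_dim)
alpha = 2.0  # example value; real edits sweep this per-layer
w_edited = w + alpha * directions[0]
```

In the paper these directions are additionally applied layer-wise, i.e. only to a chosen subset of the generator's style inputs, which is what makes the edits disentangled.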

Video: https://youtu.be/jdTICDa_eAI

Figure 1: The top four principal components obtained from PCA on the latent vectors control corresponding features: a) first: size of the pizza; b) second: shape of the pizza, yielding thinner slices toward one end; c) third: amount of cheese; d) fourth: amount of tomato sauce in Margherita-style pizzas.

Usage

For setup and usage instructions, open the notebook in Google Colab.

SeFa - Closed-Form Factorization of Latent Semantics in GANs

Using https://github.com/rosinality/stylegan2-pytorch to discover meaningful latent semantic directions in an unsupervised manner using a StyleGAN2 model trained on the pizza10 dataset.

Yujun Shen, Bolei Zhou
The Chinese University of Hong Kong
https://arxiv.org/abs/2007.06600

Abstract: A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images. In order to identify such latent dimensions for image editing, previous methods typically annotate a collection of synthesized samples and train linear classifiers in the latent space. However, they require a clear definition of the target attribute as well as the corresponding manual annotations, limiting their applications in practice. In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner. In particular, we take a closer look into the generation mechanism of GANs and further propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights. With a lightning-fast implementation, our approach is capable of not only finding semantically meaningful dimensions comparably to the state-of-the-art supervised methods, but also resulting in far more versatile concepts across multiple GAN models trained on a wide range of datasets.

Figure 2: Samples generated from a latent code moved along the 8th eigenvector, controlling the amount of toppings on the pizza. In each vertical set of images, the middle image is the original output, while the top and bottom are generated by moving the latent code with degree +5 and −5, respectively.
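The closed-form factorization described above can be sketched as follows. This is a rough illustration under stated assumptions, not SeFa's actual code: a random matrix stands in for the real pre-trained StyleGAN2 projection weight, and the dimensions are example values.

```python
import numpy as np

# Rough sketch of SeFa's closed-form step (placeholder weight, not the
# authors' code). SeFa decomposes the pre-trained weight A that projects
# the latent code, taking the eigenvectors of A^T A as unsupervised
# semantic directions.
rng = np.random.default_rng(0)
out_dim, latent_dim = 1024, 512
A = rng.normal(size=(out_dim, latent_dim))  # placeholder pre-trained weight

# Eigenvectors of A^T A, sorted by descending eigenvalue; eigh returns
# eigenvalues in ascending order, so reverse the ordering.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]
directions = eigvecs[:, order].T  # row k = k-th semantic direction n_k

# The edit shown in Figure 2: move a latent code z along the 8th
# eigenvector with degree +5 / -5, then re-synthesize images from each.
z = rng.normal(size=latent_dim)
z_plus = z + 5.0 * directions[7]
z_minus = z - 5.0 * directions[7]
```

Because the factorization uses only the generator's weights, no sampling or training is needed, which is why the method is described as closed-form and lightning-fast.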

Usage

For setup and usage instructions, open the notebook in Google Colab.
