h-vae

Project based on https://arxiv.org/abs/1805.11328.

Used papers

Importance Weighted Autoencoder - https://arxiv.org/abs/1509.00519
Planar Normalizing Flows - https://arxiv.org/abs/1505.05770
Inverse Autoregressive Flows - https://arxiv.org/abs/1606.04934

Project proposal

Link to the project proposal - https://drive.google.com/file/d/1-q50kvccrze68GvE1DEoaRx2Kq_54LeG/view?usp=sharing.

Requirements

We ran the experiments on Linux. The following packages are used in the implementation (versions are given in brackets):

You can use pip or conda to install them.

Results

Gaussian Model

We compared all discussed methods for dimensions . The authors trained their models by optimizing over the whole dataset at once, but we found that HVAE gives better results and trains faster when the dataset is divided into batches. HVAE and normalizing flows were trained for iterations over the dataset divided into batches with samples. In all experiments the dataset has points, training was done using RMSProp with a learning rate of , and the random seed was fixed to . We average the results for predicted for different datasets generated according to the Gaussian model and present the mean results in the following figures:

We trained Variational Bayes for more iterations ( or ) in the high-dimensional cases, because iterations were not enough for the ELBO to converge. HVAE with tempering and IAF have the best learned for the large dimensionality . Moreover, HVAE is good at predicting for all dimensions, as is the Variational Bayes scheme. However, Variational Bayes suffers most on predicting as the dimension increases. Planar normalizing flows do worse at predicting than IAF.
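The batched RMSProp training described above can be illustrated on a toy problem. The sketch below is not the code used in our experiments: it fits only the mean of one-dimensional Gaussian data, and the batch size, number of epochs, learning rate, decay and seed are hypothetical values chosen for the illustration.

```python
import math
import random


def rmsprop_minibatch_mean(data, batch_size=100, epochs=50, lr=0.01,
                           decay=0.9, eps=1e-8, seed=0):
    """Fit the mean of 1-D data by mini-batch RMSProp on a squared-error loss.

    All default hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)      # fixed seed, so runs are reproducible
    data = list(data)              # copy, so the caller's list is not shuffled
    theta = 0.0                    # parameter being learned
    avg_sq_grad = 0.0              # running average of squared gradients
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of 0.5 * mean((x - theta)^2) with respect to theta.
            grad = theta - sum(batch) / len(batch)
            avg_sq_grad = decay * avg_sq_grad + (1.0 - decay) * grad * grad
            theta -= lr * grad / (math.sqrt(avg_sq_grad) + eps)
    return theta


data_rng = random.Random(0)
points = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]
estimate = rmsprop_minibatch_mean(points)
sample_mean = sum(points) / len(points)
print(estimate, sample_mean)
```

With a constant learning rate the estimate keeps oscillating slightly around the sample mean, which is why it is compared with a tolerance rather than for equality.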

Also we compare HVAE with tempering and without tempering, see figure:

We can see that the tempered methods perform better than their non-tempered counterparts; this shows that time-inhomogeneous dynamics are a key ingredient in the effectiveness of the method.
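To make the role of tempering concrete, here is a minimal sketch of tempered leapfrog steps. It assumes the quadratic tempering scheme as we read it from the HVAE paper, a one-dimensional target with potential U(z) = z^2 / 2, and illustrative values for `beta0`, the step size and the number of steps. The function names are hypothetical and do not refer to the code in this repository.

```python
import math
import random


def quadratic_tempering(beta0, num_steps):
    """Return [beta_0, ..., beta_K], rising from beta0 to 1 (quadratic scheme)."""
    inv_sqrt_beta0 = 1.0 / math.sqrt(beta0)
    betas = []
    for k in range(num_steps + 1):
        # 1 / sqrt(beta_k) interpolates quadratically from 1/sqrt(beta0) to 1.
        inv_sqrt_beta_k = (1.0 - inv_sqrt_beta0) * (k / num_steps) ** 2 + inv_sqrt_beta0
        betas.append(1.0 / inv_sqrt_beta_k ** 2)
    return betas


def tempered_leapfrog(z, grad_u, betas, step_size, rng):
    """Run len(betas) - 1 leapfrog steps, cooling the momentum after each one."""
    # The initial momentum is drawn with inflated variance 1 / beta_0.
    rho = rng.gauss(0.0, 1.0) / math.sqrt(betas[0])
    for k in range(1, len(betas)):
        rho -= 0.5 * step_size * grad_u(z)   # half step for the momentum
        z += step_size * rho                 # full step for the position
        rho -= 0.5 * step_size * grad_u(z)   # second half step for the momentum
        rho *= math.sqrt(betas[k - 1] / betas[k])  # tempering: shrink the momentum
    return z, rho


betas = quadratic_tempering(beta0=0.5, num_steps=5)
# grad_u(z) = z is the gradient of the assumed potential U(z) = z^2 / 2.
z_final, rho_final = tempered_leapfrog(0.0, lambda z: z, betas,
                                       step_size=0.1, rng=random.Random(0))
print(betas, z_final, rho_final)
```

Because the scaling factor changes from step to step, the dynamics are time-inhomogeneous. Setting `beta0 = 1` makes every factor equal to one and recovers the non-tempered flow.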

MNIST

We use the binarized MNIST handwritten digit dataset as an example of an image generation task. The training data has the following form: , where for . We then formalize the generative model:

for where is the component of is the latent variable associated with and is an encoder (convolutional neural network). The VAE approximate posterior is given by where and are separate outputs of the encoder parametrized by and is constrained to be diagonal.

We set the dimensionality of the latent space to . As we need both means and variances to parametrize the VAE posterior distribution, the output dimension of the linear layer is set to . We use the Adam optimizer with standard parameters and the learning rate set to .
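The following sketch shows why the linear layer needs twice as many outputs as the latent space has dimensions. It assumes that the first half of the output holds the means and the second half holds log-variances. Both the ordering and the log-variance parametrization are assumptions made for the illustration, and `split_and_sample` is a hypothetical helper, not a function from this repository.

```python
import math
import random


def split_and_sample(encoder_output, rng):
    """Split a flat encoder output into (mean, log_var) and sample z.

    Assumes the layout [mean_1..mean_d, log_var_1..log_var_d].
    """
    if len(encoder_output) % 2 != 0:
        raise ValueError("encoder output must have an even number of entries")
    latent_dim = len(encoder_output) // 2
    mean = encoder_output[:latent_dim]
    log_var = encoder_output[latent_dim:]
    # Reparametrization: z = mean + std * eps with eps ~ N(0, 1), per dimension,
    # which corresponds to a diagonal covariance matrix.
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
         for m, lv in zip(mean, log_var)]
    return mean, log_var, z


rng = random.Random(0)
mean, log_var, z = split_and_sample([0.5, -1.0, -60.0, -60.0], rng)
print(mean, log_var, z)
```

In this usage example the log-variances are very negative, so the sampled `z` stays practically on top of the mean.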

We compare the performance of HVAE with that of IWAE https://arxiv.org/abs/1509.00519. We set the number of Monte-Carlo steps in HVAE and the number of importance samples in IWAE so that a training epoch requires equal time to finish, which makes the comparison fairer. We fix the number of Monte-Carlo steps and the number of importance samples . Both models are then optimized for epochs. Corresponding plots are the following:

It can be clearly seen that the training loss values are similar for both models, while the validation loss of HVAE is higher due to overfitting.
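For reference, the IWAE objective for one data point is the logarithm of the average of K importance weights. A minimal sketch of a numerically stable evaluation is given below. It assumes that the log-weights log p(x, z_k) - log q(z_k | x) have already been computed, and `iwae_bound` is a hypothetical name, not a function from this repository.

```python
import math


def iwae_bound(log_weights):
    """Return log((1/K) * sum_k exp(log_w_k)), computed with log-sum-exp."""
    max_lw = max(log_weights)
    # Subtracting the maximum keeps exp() from underflowing to zero.
    total = sum(math.exp(lw - max_lw) for lw in log_weights)
    return max_lw + math.log(total) - math.log(len(log_weights))


# Hypothetical log-weights for K = 3 importance samples of one data point.
example_log_weights = [-1000.0, -1001.0, -1002.5]
bound = iwae_bound(example_log_weights)
print(bound)
```

With a single importance sample the bound reduces to the usual single-sample ELBO estimate. By Jensen's inequality it is never smaller than the plain average of the log-weights.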

It is also important to compare the models' outputs in terms of quality. The generated images are shown in the following figures (top - HVAE, bottom - IWAE):

Images generated via IWAE appear to be more blurred. HVAE tends to generate sharper images, but some of them cannot be recognized as digits.

To better understand the behavior of both models, we study the decoded latent vectors of HMC chains (top - HVAE, bottom - IWAE):

In these figures one can clearly see that HVAE encoded vectors often correspond to the class that is different from the ground-truth, even though they are sharper. At the same time, IWAE produces reconstructions that are close to the true images.

Details

More details about experiments and settings can be found in the report https://github.com/Daniil-Selikhanovych/h-vae/blob/master/report/hvae_report.pdf.

Pretrained Models

Models	Description
HVAE & IWAE on MNIST	models from experiments in demos/mnist.ipynb

LaTeX in Readme

We have used readme2tex to render LaTeX code in this Readme. Install the corresponding hook and change the command to fix the issue with broken paths:

python -m readme2tex --output README.md README.tex.md  --branch master --svgdir 'svgs' --nocdn

Our team

At the moment we are students of the Skoltech DS MSc program, 2019-2021.

Artemenkov Aleksandr
Karpikov Igor
Selikhanovych Daniil
