
KLD calculation #1

Open
dksifoua opened this issue Apr 13, 2020 · 3 comments
dksifoua commented Apr 13, 2020

Hi,

I think there's an error in your KLD calculation.

This is what you wrote:

# see Appendix B from VAE paper:
# Kingma and Welling. Auto-Encoding Variational Bayes. ICLR, 2014
# https://arxiv.org/abs/1312.6114
# 0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
KLD = -0.5 * torch.mean(torch.mean(1 + logvar - mu.pow(2) - logvar.exp(), 1))

Instead of what I think it should be:

KLD = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), 1))

Let me know if I'm right.
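
For reference, the two reductions differ only by a constant factor equal to the latent dimension; a quick sketch with made-up shapes (batch size 8, latent dimension 20 are assumptions, not values from the repo):

import torch

# hypothetical batch of latent statistics: batch_size=8, latent_dim=20
mu = torch.randn(8, 20)
logvar = torch.randn(8, 20)
inner = 1 + logvar - mu.pow(2) - logvar.exp()  # shape (8, 20)

# original: mean over latent dims, then mean over the batch
kld_mean = -0.5 * torch.mean(torch.mean(inner, 1))
# proposed: sum over latent dims (per-sample KL), then mean over the batch
kld_sum = -0.5 * torch.mean(torch.sum(inner, 1))

print(kld_sum / kld_mean)  # ~20.0, i.e. exactly the latent dimension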

Also, could you explain why you multiply the KLD by 0.1?
Is that the same as multiplying the BCE by a big number, say 1000 for example?


botkevin commented Dec 2, 2020

Pretty sure he multiplies the KLD by 0.1 because that is his KLD weight hyperparameter.
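
A weighted loss of that form might look like the following sketch; the names (vae_loss, kld_weight) are illustrative, not taken from the repo:

import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar, kld_weight=0.1):
    # reconstruction term (BCE), averaged over the batch
    bce = F.binary_cross_entropy(recon_x, x, reduction='mean')
    # KL divergence to N(0, I): sum over latent dims, mean over the batch
    kld = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), 1))
    # kld_weight < 1 down-weights the KL term relative to the reconstruction
    return bce + kld_weight * kld

Note that multiplying the BCE by 10 instead would just scale this whole loss by 10 with the same relative weighting of the two terms; multiplying it by 1000 would weight reconstruction much more heavily than 0.1 on the KLD does.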


botkevin commented Dec 2, 2020

Also, while working on a VAE that I wrote based on this: if I change the offending mean to a sum, my reconstruction loss is much higher (about 2x), while my kl_loss starts out similar but decreases faster with torch.mean, and all my reconstructed images are basically the same blurry image. I have no idea why this would change the reconstruction so much... I will have to do some investigation.


botkevin commented Dec 3, 2020

Mean is equivalent to sum up to a scalar factor. Normally, using Adam, if we don't have a composite loss, this scale doesn't matter, so changing the sum to a mean should backpropagate the same. I changed the mean to a sum and decreased the KLD weight, which fixed my problem. Basically, when I changed the mean to a sum, I put too much weight on the KL term and caused the latent distributions to be bound too strongly to the standard Gaussian.
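
A minimal sketch of that rebalancing (the latent dimension here is an assumed value, not from the repo):

# assuming a 20-dimensional latent space
latent_dim = 20

# the sum-reduced KLD is latent_dim times larger than the mean-reduced one,
# so a weight tuned for the mean needs to shrink by the same factor
kld_weight_mean = 0.1
kld_weight_sum = kld_weight_mean / latent_dim  # ~0.005, keeps the same balance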
