RenyiELBO/IWAE fail to converge on AIR example #3289

Open
horizon-blue opened this issue Oct 31, 2023 · 3 comments

@horizon-blue

Issue Description

Hello Pyro folks. I was trying to use IWAE in the AIR example to see if the tighter lower bound yields better performance than the standard ELBO. However, after swapping out the elbo method with RenyiELBO(alpha=0), the model fails to converge at all. As a diagnostic, I also tried setting num_particles=1 to see if it at least falls back to the standard ELBO behavior, but the accuracy of the AIR model still does not improve at all. After reading #2220, I also tried reducing batch_size to 1, yet the performance of the model did not change either.

I'm wondering if any of you might have some insights on what could cause the performance discrepancy between RenyiELBO vs TraceGraph_ELBO? Thank you very much :)
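For readers who want to check the reasoning behind the num_particles=1 diagnostic, here is a minimal sketch in plain Python. It assumes a made-up toy problem (one binary latent, one fixed observation; the probabilities in `p_joint` and `q` are hypothetical) and computes the K-particle importance-weighted bound exactly by enumeration. It does not use Pyro and says nothing about how RenyiELBO computes its gradients.

```python
import itertools
import math

# Hypothetical toy problem: one binary latent z and one fixed observation x.
p_joint = {0: 0.10, 1: 0.30}  # p(x, z) for the fixed x, so p(x) = 0.40
q = {0: 0.50, 1: 0.50}        # variational distribution q(z | x)


def iwae_bound(num_particles):
    """Exact E_q[log (1/K) * sum_k p(x, z_k) / q(z_k)], enumerating all K-tuples."""
    total = 0.0
    for zs in itertools.product(q, repeat=num_particles):
        prob = math.prod(q[z] for z in zs)
        mean_weight = sum(p_joint[z] / q[z] for z in zs) / num_particles
        total += prob * math.log(mean_weight)
    return total


# Standard ELBO and the quantity both bounds approximate from below.
elbo = sum(q[z] * math.log(p_joint[z] / q[z]) for z in q)
log_evidence = math.log(sum(p_joint.values()))

for k in (1, 2, 5):
    print(f"K={k}: bound={iwae_bound(k):.4f}")
print(f"ELBO={elbo:.4f}  log p(x)={log_evidence:.4f}")
```

With one particle the bound coincides with the ELBO, and it tightens toward log p(x) as particles are added. That only describes the objective value; the gradient estimator used to optimize it is a separate matter.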

[Figure: epoch vs accuracy]
[Figure: epoch vs -ELBO]

Environment

For any bugs, please provide the following:

  • Platform: MacOS 14.0, Python 3.11.5
  • Pyro 1.8.6
  • PyTorch 2.1.0

Code Snippet

The issue can be reproduced by running the AIR example in Pyro's codebase and replacing the elbo setting there with RenyiELBO().

@martinjankowiak
Collaborator

we would need to get something like #3123 merged.

what happens when you use TraceGraph_ELBO with multiple particles?

@horizon-blue
Author

Thanks for the reply @martinjankowiak .

I pulled the changes from #3123, but sadly, they don't seem to fix the issue (and the ELBO is even worse). I've also included the result of TraceGraph_ELBO with two particles:
[Figure: epoch vs accuracy]
[Figure: epoch vs -ELBO]

@martinjankowiak
Collaborator

oh sorry, i wasn't thinking clearly when i first read this. AIR has discrete latent variables. to deal with that you can either sum them out (not really viable here) or use a stochastic gradient estimator. TraceGraph_ELBO uses a fancier and thus much lower variance gradient estimator that makes use of the fine-grained conditional independence structure of the model. RenyiELBO cannot do this and so results in a much higher variance gradient estimator, in fact so much higher that it's evidently not usable. so this is expected.
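The variance argument can be seen in a small self-contained sketch. It assumes a hypothetical setup (eight independent Bernoulli latents and a cost that is a sum of per-latent terms) and compares a score-function gradient estimator that multiplies the score by the full cost against one that keeps only the cost term depending on the latent in question. This illustrates the general idea of exploiting independence structure; it is not how TraceGraph_ELBO is implemented.

```python
import itertools
import math

# Hypothetical setup: independent Bernoulli(THETA) latents z_0..z_7 and a
# cost f(z) = sum_i COSTS[i] * z_i that decomposes over the latents.
NUM_LATENTS = 8
THETA = 0.5
COSTS = [1.0] * NUM_LATENTS


def score(z, theta):
    """d/dtheta of log Bernoulli(z; theta)."""
    return z / theta - (1 - z) / (1 - theta)


def naive(zs):
    # Score of z_0 times the FULL cost, ignoring the independence structure.
    total_cost = sum(c * z for c, z in zip(COSTS, zs))
    return total_cost * score(zs[0], THETA)


def structured(zs):
    # Score of z_0 times only the cost term that depends on z_0.
    return COSTS[0] * zs[0] * score(zs[0], THETA)


def moments(estimator):
    """Exact mean and variance of an estimator, enumerating all configurations."""
    mean = 0.0
    second_moment = 0.0
    for zs in itertools.product((0, 1), repeat=NUM_LATENTS):
        prob = math.prod(THETA if z else 1 - THETA for z in zs)
        g = estimator(zs)
        mean += prob * g
        second_moment += prob * g * g
    return mean, second_moment - mean ** 2


naive_mean, naive_var = moments(naive)
structured_mean, structured_var = moments(structured)
print(f"naive:      mean={naive_mean:.3f}  variance={naive_var:.3f}")
print(f"structured: mean={structured_mean:.3f}  variance={structured_var:.3f}")
```

Both estimators target the same gradient of the expected cost with respect to the first latent's parameter, but the first one carries noise from every unrelated latent, and that noise grows with the number of latents.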
