Training VQVAE does not converge. #71

Open
henanjun opened this issue Nov 29, 2022 · 4 comments

Comments

@henanjun

This is the reconstruction result after 100 epochs of VQVAE training:
[image: recons_VQVAE_Epoch_99]
@blade-prayer

I found that in vq_vae.yaml, scheduler_gamma is set to 0.0. This parameter is the multiplicative factor passed to torch.optim.lr_scheduler.ExponentialLR, so the learning rate becomes 0 after epoch 0. Do you think this is the reason?
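
For reference, a minimal standalone sketch (not the repo's code; the choice of Adam and the dummy parameter are just for illustration) of how gamma=0.0 zeroes the learning rate after the first scheduler step:

```python
import torch

# With gamma=0.0, ExponentialLR multiplies the learning rate by 0 at its
# first step, so training effectively stops after epoch 0.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.Adam(params, lr=0.005)  # LR value from vq_vae.yaml
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.0)

print(optimizer.param_groups[0]["lr"])  # 0.005 during epoch 0
optimizer.step()                        # a (dummy) optimizer step in epoch 0
scheduler.step()                        # end-of-epoch step: lr *= gamma
print(optimizer.param_groups[0]["lr"])  # 0.0 for every later epoch
```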

@imskull

imskull commented Dec 9, 2022

Changing "LR"(learning rate" from 0.005 to 0.001 helps.

@xjtupanda

Changing "LR"(learning rate" from 0.005 to 0.001 helps.

Met the same problem. I found the loss was always unreasonably high (~1.0e+6), which might cause a gradient explosion. This one helps, thanks a lot.

@ohhh-yang

In my training run the loss was even more unreasonable (up to 1.0e+26!) and just fluctuated like a pendulum. This one really works, you are my god!!!

Changing "LR"(learning rate" from 0.005 to 0.001 helps.
