I trained the model with the default settings on the ADE20K dataset, but the performance is lower than that of the provided pretrained model (FID/mIoU/acc: 29/37/80 vs. 27/45/82). Since the random seed is fixed, I'm not sure why the trained model performs differently from the pretrained one. The following are the losses from my experiment and from the pretrained model:
Any idea why?
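For context, fixing the seed alone does not guarantee bit-identical runs across PyTorch versions or GPU counts. A minimal seeding sketch (the repo's actual setup may cover more RNGs and determinism flags):

```python
import random

def set_seed(seed: int = 0) -> None:
    # Seed the Python RNG; a real PyTorch script would also call
    # numpy.random.seed(seed) and torch.manual_seed(seed), and may set
    # torch.backends.cudnn.deterministic = True, since some CUDA kernels
    # are nondeterministic even with fixed seeds.
    random.seed(seed)

set_seed(0)
a = random.random()
set_seed(0)
b = random.random()
print(a == b)  # identical seeds give identical draws: True
```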
Did you change the code after downloading the repo, or are you using exactly the version we released?
Are you using an exact copy of our pip/conda environment, with the package versions our code was tested against?
What is the batch size you use and on how many GPUs do you train?
What was the command you used to launch this experiment?
Could you please post your opt.txt for this experiment?
Regarding ideas as to why:
In the losses you posted there is a flat region at the beginning (up to 30k iterations). We observed such behavior in some of our ablations, particularly the ones trained without the 3D noise. This happens due to the large Adam momentum we used (beta2 = 0.999). This value leads to better convergence in the end, but without some parameter tuning it may cause a slowdown at the beginning. For our final model on ADE20K, we tuned the parameters so that the optimizer setting does not cause this slowdown. Since the seed is fixed, this should not happen for a model with default settings.
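To illustrate the beta2 effect, here is a toy sketch in plain Python (not the repo's code): with beta2 = 0.999, Adam's second-moment estimate remembers large early gradients for a long time, which keeps the effective step size lr / sqrt(v) suppressed well after the gradients have shrunk.

```python
def second_moment(grads, beta2):
    """Bias-corrected EMA of squared gradients, as in Adam."""
    v, out = 0.0, []
    for t, g in enumerate(grads, start=1):
        v = beta2 * v + (1 - beta2) * g * g
        out.append(v / (1 - beta2 ** t))
    return out

# Gradient magnitude drops from 10 to 1 after 50 steps.
grads = [10.0] * 50 + [1.0] * 100
fast = second_moment(grads, 0.9)    # forgets the large phase quickly
slow = second_moment(grads, 0.999)  # still dominated by it at the end
# fast[-1] is close to 1.0, while slow[-1] remains far larger,
# so the beta2=0.999 run takes much smaller effective steps.
```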
That being said, could you please verify that you did not change the way the 3D noise is injected into the model, or any of the hyperparameters?
Sure. The version of PyTorch I used is 1.6.0, not 1.0.0, and I trained on 8 GPUs (but the batch size is still 32). I also changed the EMA operation to speed up training by avoiding the additional inference pass (I don't think this change affects the loss).
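For clarity, the kind of EMA update meant here is plain Polyak averaging of the weights, sketched below with scalar "weights" (names and decay value are illustrative, not the repo's actual code); it touches only the stored parameters, so it involves no extra forward pass and does not enter the loss computation.

```python
def update_ema(ema, current, decay=0.999):
    """Polyak averaging: ema <- decay * ema + (1 - decay) * current.

    Operates purely on parameter values, so no additional inference
    is needed and the training loss itself is unchanged.
    """
    return [decay * e + (1 - decay) * c for e, c in zip(ema, current)]

# Toy usage with scalar weights and an aggressive decay for visibility:
ema, w = [0.0, 0.0], [1.0, 2.0]
for _ in range(3):
    ema = update_ema(ema, w, decay=0.5)
print(ema)  # -> [0.875, 1.75], converging toward the current weights
```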
So it seems the environment may be the cause of this difference. I will try again. Thank you!