
Replicate the pretrained model #8

Open
tzt101 opened this issue May 6, 2021 · 2 comments

@tzt101 commented May 6, 2021

Hi, great work!

I tried to train the model with the default settings on the ADE20K dataset, but found that its performance is lower than that of the provided pretrained model (FID/mIoU/acc: 29/37/80 vs. 27/45/82). Since the random seed is fixed, I'm not sure why the trained model's performance differs from the pretrained one. The following are the losses of my experiment and of the pretrained model:
[Screenshots: training loss curves of my experiment and of the pretrained model]
Any idea why?

@SushkoVadim (Contributor)

Hi,

First of all, a couple of questions:

  1. Did you change the code after downloading the repo, or do you use exactly the version we released?
  2. Do you use an exact copy of our pip/conda environment, with the package versions for which our code was tested?
  3. What batch size do you use, and on how many GPUs do you train?
  4. What was the command you used to launch this experiment? Could you please post your opt.txt for this experiment?

Regarding ideas as to "why":
In the losses you posted there is a flat region at the beginning (up to 30k iterations). We observed such behavior in some of our ablations, particularly the ones trained without the 3D noise. This happens due to the large Adam momentum we used (beta2=0.999). This value leads to better convergence in the end, but without some parameter tuning it may cause a slowdown at the beginning. For our final model on ADE20K, we tuned the parameters so that the optimizer setting does not cause this slowdown. Since the seed is fixed, this should not happen for a model with the default settings.

That being said, could you please verify that you did not change the way the 3D noise is injected into the model, or any of the hyperparameters?

@tzt101 (Author) commented May 7, 2021

Sure. The version of PyTorch I used is 1.6.0, not 1.0.0, and I train on 8 GPUs (but the batch size is still 32). I also changed the EMA operation to speed up training by avoiding the additional inference pass (I think this change should not affect the loss).
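For context, a parameter-space EMA update of this kind needs no extra forward pass; a generic PyTorch sketch is shown below. The variable names, the stand-in network, and the decay value are illustrative assumptions, not the actual OASIS implementation.

```python
import copy
import torch
import torch.nn as nn

# Stand-in generator; the real OASIS generator is of course different.
netG = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 3, 3, padding=1))
netG_ema = copy.deepcopy(netG)  # exponential-moving-average copy used for evaluation

@torch.no_grad()
def update_ema(model, ema_model, decay=0.9999):
    # Blend the EMA parameters toward the current training parameters.
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)
    # Buffers (e.g. batch-norm running statistics) are usually copied directly.
    for b_ema, b in zip(ema_model.buffers(), model.buffers()):
        b_ema.copy_(b)

# Typically called once per training iteration, right after the generator's optimizer step:
update_ema(netG, netG_ema)
```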

So, it seems that the environment may cause this difference. I will try it again. Thank you!

This is the opt.txt of my experiment:
[Screenshot: opt.txt of the experiment]
