
Loss weights for optimal training #16

Open
parthnatekar opened this issue Mar 1, 2023 · 1 comment

Comments

@parthnatekar

parthnatekar commented Mar 1, 2023

Hi,

I had a question about the optimal weighting of the losses during training.

I notice that if you initially weight the fc loss, vq_loss, and reconstruction loss equally, the quantizer is not trained well enough to provide meaningful outputs, and the network never learns a good codebook because it is biased too heavily towards prediction. Reducing the weight on the prediction loss, on the other hand, makes the network learn a representation that is good for reconstruction/quantization but not prediction. I am unable to find a good balance.

Was the weighting of the losses modulated during training? What loss weights were optimal for training?

@li-li-github
Collaborator

As far as I remember, using equal weights for all losses did give meaningful outputs with the OpenCell data. A tradeoff was observed between the fc loss and the reconstruction loss: when the weight on the fc loss is high, the reconstruction quality suffers, and vice versa. Looking at my old code, I used fc_coeff=0.1 (while keeping the rest at 1) when I wanted better reconstruction. When fc_coeff < 0.1, the clustering starts to degrade.
But this could be data-dependent. For your own dataset, you might need to play around a bit to find a sweet spot.
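
For reference, here is a minimal PyTorch-style sketch of the kind of weighted loss combination discussed above. The names (total_loss, fc_coeff, vq_coeff, rec_coeff) are illustrative assumptions, not the project's actual API, and the vq_loss term is assumed to come from the quantizer:

```python
import torch
import torch.nn.functional as F

# Loss weights: fc_coeff=0.1 as mentioned above, the rest kept at 1.
# These names are assumptions for this sketch, not the repo's actual variables.
fc_coeff, vq_coeff, rec_coeff = 0.1, 1.0, 1.0

def total_loss(recon, target, logits, labels, vq_loss):
    """Weighted sum of reconstruction, fc (label prediction), and VQ losses."""
    rec_loss = F.mse_loss(recon, target)        # image reconstruction term
    fc_loss = F.cross_entropy(logits, labels)   # label prediction (fc) term
    return rec_coeff * rec_loss + fc_coeff * fc_loss + vq_coeff * vq_loss

# Example with dummy tensors:
recon, target = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
logits, labels = torch.randn(4, 10), torch.randint(0, 10, (4,))
vq = torch.tensor(0.05)  # placeholder for the quantizer's commitment loss
print(total_loss(recon, target, logits, labels, vq))
```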
