Hi, I've recently been using SWA to train my network for a Re-ID task, but I don't see any obvious improvement (the results are almost the same as without SWA) when training with Adam.
So, can we use Adam or another optimizer instead of SGD to train the network, if we want to improve it with SWA?
Hi, sorry for the delayed response. In my experience SWA works best with SGD. Adam sets the learning rates adaptively, which is not ideal for SWA. However, we did see some improvement with other optimizers as well. I recommend tuning the learning rate schedule (try increasing the learning rates during the SWA stage), or switching to SGD for the SWA stage.
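For concreteness, here is a minimal sketch of that recipe: train with Adam first, then switch to SGD with a constant, relatively high learning rate for the SWA stage. It uses the `torch.optim.swa_utils` helpers from recent PyTorch (>= 1.6) rather than the torchcontrib optimizer; the model, data, and hyperparameters are placeholders, not tuned values:

```python
import torch
import torch.nn as nn
from torch.optim.swa_utils import AveragedModel, SWALR

# Placeholder model and synthetic data, just to make the sketch runnable.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(512, 128),
                                   torch.randint(0, 10, (512,))),
    batch_size=64)
criterion = nn.CrossEntropyLoss()

# Phase 1: ordinary training with Adam.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
for epoch in range(20):
    for x, y in loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()

# Phase 2 (SWA stage): switch to SGD with a constant, higher learning
# rate and average the weights collected along the trajectory.
swa_optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
swa_model = AveragedModel(model)
swa_scheduler = SWALR(swa_optimizer, swa_lr=0.05)  # keeps the LR constant here
for epoch in range(10):
    for x, y in loader:
        swa_optimizer.zero_grad()
        criterion(model(x), y).backward()
        swa_optimizer.step()
    swa_scheduler.step()
    swa_model.update_parameters(model)
```

Whether an SWA learning rate like 0.05 is appropriate for a Re-ID model is something you would need to tune; the point of the sketch is only the structure of the two phases.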
As far as I can see, Adam in TensorFlow has trainable parameters of its own, so the question is: should we exclude these parameters from averaging? Same question for the BN trainable parameters.
Hey @mrgloom. The Adam parameters and BN statistics are not trainable parameters of the network: the former are tensors stored in the optimizer state, and the latter are buffers of the model (the running mean and variance). They should not be averaged. However, you do need to fix the batch-norm statistics for the SWA model at the end of training (https://pytorch.org/blog/stochastic-weight-averaging-in-pytorch/#batch-normalization).
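In recent PyTorch this BN fix is a one-liner via `torch.optim.swa_utils.update_bn`; the torchcontrib optimizer described in the linked blog post exposes the same idea as `opt.bn_update(train_loader, model)`. A minimal sketch, reusing the hypothetical `loader` and `swa_model` from the example above:

```python
from torch.optim.swa_utils import update_bn  # available in PyTorch >= 1.6

# One pass over the training data recomputes the BN running mean/variance
# buffers for the averaged weights; trainable parameters and optimizer
# state (e.g. Adam moments) are left untouched.
update_bn(loader, swa_model)
```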