Issues about optimizing other parameters besides learning rate #7

Open
nicozorza opened this issue May 16, 2018 · 1 comment

@nicozorza

I have emailed Luca Franceschi about some issues with this library, and he asked me to share them here.
I've been working on an MLP and wanted to optimize the following hyperparameters, but ran into some problems:

  • Keep probability of a dropout layer: Luca explained to me that this is not possible, since it has some non-differentiable points.
  • Regularization beta: we are using tf.nn.l2_loss, but can't optimize the beta coefficient.
  • AdamOptimizer: when we tried to use far.AdamOptimizer() as the inner optimizer, the code crashed. Apparently some variables are undefined: _beta1_power and _beta2_power. I think this is a bug in the library.

So far we have only been able to optimize the learning rate. It would be great if there were a list of the things you can and can't do with this library.

Best regards,
Nicolás Zorzano.

@lucfra
Owner

lucfra commented May 17, 2018

Ciao Nicolas,

I've pushed an update that fixes the problem with AdamOptimizer. In newer versions of TensorFlow, the protected variables _beta1_power and _beta2_power changed names, hence the error!
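For reference, after the update something like the following should be possible (just a minimal sketch: the learning_rate keyword is my assumption, mirroring tf.train.AdamOptimizer, and treating the learning rate as the hyperparameter follows what you already do):

import far_ho as far

# learning rate as a hyperparameter, to be tuned in the outer problem
lr = far.get_hyperparameter('lr', 1e-3)
# Adam as the inner optimizer (this is the part that previously crashed)
inner_optimizer = far.AdamOptimizer(learning_rate=lr)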

For the regularization parameter, could you please be more specific about the problem? In any case, something like this:

import tensorflow as tf
import far_ho as far

w = ...  # your variable
rho = far.get_hyperparameter('rho', -3.)
l2_loss = tf.exp(rho) * tf.nn.l2_loss(w)

should allow you to optimize rho (the exp is there to ensure positive values of the regularization coefficient).
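For concreteness, here is a minimal sketch of how rho could enter the two objectives; the model, placeholder shapes, and names below are illustrative, not taken from your code. The penalized loss is the inner (training) objective, while the outer (validation) objective stays unpenalized:

import tensorflow as tf
import far_ho as far

x = tf.placeholder(tf.float32, [None, 10])   # illustrative input shape
y = tf.placeholder(tf.float32, [None, 1])
w = tf.get_variable('w', shape=[10, 1])

rho = far.get_hyperparameter('rho', -3.)
mse = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

inner_loss = mse + tf.exp(rho) * tf.nn.l2_loss(w)   # training objective, with penalty
outer_loss = mse                                    # validation objective, no penalty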

For dropout, it is not a totally trivial problem and may be a topic of research. Anyway (I did not mention it in the email), under some assumptions dropout can be approximately replaced by multiplicative Gaussian noise: see http://proceedings.mlr.press/v28/wang13a.pdf. This suggests treating the variance of the noise as a hyperparameter, which could be optimized by gradient descent with this package.
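As a rough illustration of that idea (only a sketch under the assumptions of the paper above; the names and the log-variance parametrization are mine, not part of the library):

import tensorflow as tf
import far_ho as far

h = tf.placeholder(tf.float32, [None, 128])  # stand-in for hidden-layer activations

# log-variance of the multiplicative noise, treated as a hyperparameter
log_sigma2 = far.get_hyperparameter('log_sigma2', -2.)
sigma = tf.sqrt(tf.exp(log_sigma2))

# mean-1 multiplicative Gaussian noise, replacing dropout at training time
noise = 1. + sigma * tf.random_normal(tf.shape(h))
h_noisy = h * noise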

As soon as I have time, I will add an IPython notebook with a list of things that you can and cannot do, as you suggest!

Cheers,

Luca
