Enhancement
Pytorch optimizers include a maximize flag (Pytorch Issue)*. When set to True, the sign of gradients is flipped inside optimizer.step() before computing parameter updates. This enables gradient ascent steps natively.
NOTE: This sign flip does not affect the parameter's .grad attribute.
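For concreteness, here is a minimal toy sketch of the flag's behaviour (the variable names and values are arbitrary; SGD is used only because it already supports the flag on v1.12):

import torch

# Maximize f(x) = x ** 2, starting from x = 1.
x = torch.tensor(1.0, requires_grad=True)
opt = torch.optim.SGD([x], lr=0.1, maximize=True)

objective = x ** 2
opt.zero_grad()
objective.backward()

print(x.grad)  # tensor(2.) -- .grad keeps its usual sign; the flip happens inside step()
opt.step()
print(x)       # x is now 1.2 -- the update moved *up* the gradient, i.e. an ascent step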
Cooper currently populates the gradients of the dual variables with their negated values, so that the descent steps performed by the dual optimizer are in fact ascent steps with respect to the problem formulation.
cooper/cooper/constrained_optimizer.py, Lines 401 to 406 in 09df759:

# Flip gradients for multipliers to perform ascent.
# We only do the flipping *right before* applying the optimizer step to
# avoid accidental double sign flips.
for multiplier in self.formulation.state():
    if multiplier is not None:
        multiplier.grad.mul_(-1.0)
We should not do the sign flipping manually, but rather force maximize=True when instantiating the dual optimizer.
* This has been implemented on Pytorch's master branch for every optimizer but LBFGS. On v1.12, Adam, SGD and AdaGrad support the flag, but not RMSProp. An assert could be included to ensure that the requested dual optimizer supports the flag.
⚠ This change would break compatibility with versions of Pytorch prior to 1.12.
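As a rough sketch of what this could look like (the helper name below and the way Cooper constructs its dual optimizer are assumptions for illustration, not the actual API), the flag could be forced and its availability checked before the optimizer is instantiated:

import inspect
import torch

def build_dual_optimizer(optimizer_class, params, **optim_kwargs):
    # Hypothetical helper; Cooper's actual dual-optimizer construction may differ.
    # Reject optimizer classes that do not expose the maximize flag
    # (e.g. LBFGS, or RMSProp on v1.12).
    if "maximize" not in inspect.signature(optimizer_class.__init__).parameters:
        raise ValueError(
            f"{optimizer_class.__name__} does not support maximize=True and cannot "
            "be used as a dual optimizer without manual gradient flipping."
        )
    # Force ascent steps natively instead of flipping multiplier gradients by hand.
    optim_kwargs["maximize"] = True
    return optimizer_class(params, **optim_kwargs)

# Usage with a dummy multiplier tensor:
multiplier = torch.zeros(3, requires_grad=True)
dual_optimizer = build_dual_optimizer(torch.optim.SGD, [multiplier], lr=1e-2)

With something along these lines in place, the explicit multiplier.grad.mul_(-1.0) loop above would no longer be needed.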
Motivation
Manually flipping gradients immediately after computing them (thus ensuring that this happens before calls to dual_optimizer.step()) is error-prone.
Moreover, having to keep track of the fact that the stored gradients have their sign flipped is inconvenient.
By implementing this change, we would adopt the official Pytorch approach for performing ascent steps.
Alternatives
The current implementation is functional.
References
maximize flag to all optimizers: pytorch/pytorch#68052