Do maximize=True for dual_optimizers #49

Open
juan43ramirez opened this issue Aug 26, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@juan43ramirez
Collaborator

Enhancement

PyTorch optimizers include a maximize flag (PyTorch Issue)*. When set to True, the sign of the gradients is flipped inside optimizer.step() before the parameter updates are computed. This natively enables gradient ascent steps.

NOTE: This sign flip does not affect the parameters' .grad attributes.
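For illustration, a minimal sketch of the flag's behavior (assuming a PyTorch version where torch.optim.SGD accepts maximize, i.e. v1.12 as noted below; the variable names are made up):

import torch

param = torch.nn.Parameter(torch.tensor([1.0]))
optimizer = torch.optim.SGD([param], lr=0.1, maximize=True)

loss = -param.sum()   # d(loss)/d(param) = -1
loss.backward()
print(param.grad)     # tensor([-1.])  <- .grad is NOT flipped
optimizer.step()      # ascent step on the loss
print(param.data)     # tensor([0.9000]); loss increased from -1.0 to -0.9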

Cooper currently populates the gradients of the dual variables with their negative values, so that the descent steps performed by the dual optimizer are in fact ascent steps with respect to the problem formulation.

# Flip gradients for multipliers to perform ascent.
# We only do the flipping *right before* applying the optimizer step to
# avoid accidental double sign flips.
for multiplier in self.formulation.state():
    if multiplier is not None:
        multiplier.grad.mul_(-1.0)

Instead of flipping the sign manually, we should force maximize=True when instantiating the dual optimizer, as sketched below.
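A rough sketch of what this could look like; the functools.partial pattern and names are illustrative assumptions, not Cooper's actual API:

import functools
import torch

# Hypothetical: pre-configure the dual optimizer class with maximize=True so
# that dual_optimizer.step() ascends on the multipliers, with no manual sign
# flip of their .grad attributes.
dual_optimizer_class = functools.partial(torch.optim.SGD, lr=1e-2, maximize=True)

multipliers = [torch.nn.Parameter(torch.zeros(3))]  # stand-in dual variables
dual_optimizer = dual_optimizer_class(multipliers)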

* This has been implemented on PyTorch's master branch for every optimizer except LBFGS. On v1.12, Adam, SGD, and Adagrad support the flag, but RMSprop does not. An assert could be included to ensure that the requested dual optimizer supports the flag; one possible form is sketched below.
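The check below is only a sketch; inspecting the constructor signature is one option among several, and the helper name is made up:

import inspect
import torch

def assert_supports_maximize(optimizer_class):
    # Sketch: fail early if the requested dual optimizer's constructor does
    # not accept a `maximize` keyword argument.
    accepted = inspect.signature(optimizer_class.__init__).parameters
    assert "maximize" in accepted, (
        f"{optimizer_class.__name__} does not expose a `maximize` flag; "
        "dual ascent via maximize=True requires an optimizer that does."
    )

assert_supports_maximize(torch.optim.SGD)  # passes on torch >= 1.12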

⚠ This change would break compatibility with versions of PyTorch prior to 1.12.

Motivation

Manually flipping the gradients immediately after computing them (to ensure this happens before any call to dual_optimizer.step()) is error prone.
Moreover, having to keep track of the fact that the stored gradients carry a flipped sign is inconvenient.

By implementing this change we would adopt the official PyTorch approach for performing ascent steps.

Alternatives

The current implementation is functional.

References

juan43ramirez added the enhancement (New feature or request) label on Aug 26, 2022
@juan43ramirez
Collaborator Author

This would require changing the Extra optimizers to also support the maximize flag.
