Enhancement
When a dual restart is triggered, the dual variables are reset to their initial value of 0.
However, the state of the primal and dual optimizers remains unchanged. This state may include running averages used by momentum mechanisms.
These could be reset along with the dual variables when feasibility is achieved.
Motivation
This would represent a full reset of the optimization protocol when the constraint is satisfied. Currently, the reset is "half baked" in the sense that only the dual variables are reset.
Stale state is clearly problematic for the dual variables: momentum accumulated during periods of feasibility might prevent the multiplier from moving in the right direction if the constraint becomes violated later.
It is less clear whether this is as problematic for the primal optimizer. We could expose a flag to also reset the state of the primal optimizer upon dual restarts, without forcing it.
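To make the dual-side concern concrete, here is a minimal pure-Python sketch (not Cooper code; hypothetical numbers, gradient ascent with SGD-style momentum on a single multiplier) showing how momentum accumulated during feasibility can keep the multiplier pinned at 0 after a restart, even once the constraint is violated again:

```python
# Minimal sketch (hypothetical values): one dual variable updated by
# gradient ASCENT with SGD-style momentum. The gradient for the
# multiplier is the constraint value g (g < 0 means feasible).
mu, lr = 0.9, 0.1          # momentum coefficient and dual learning rate
lam, velocity = 1.0, 0.0   # multiplier and its momentum buffer

def dual_step(lam, velocity, g):
    velocity = mu * velocity + g  # accumulate momentum
    # Projected ascent step: multipliers stay non-negative.
    return max(0.0, lam + lr * velocity), velocity

# Several feasible steps (g = -1): the buffer turns strongly negative.
for _ in range(5):
    lam, velocity = dual_step(lam, velocity, g=-1.0)

# Dual restart: the multiplier is reset, but the buffer is NOT.
lam = 0.0
print(velocity)  # still a large negative value (about -4.10)

# The constraint becomes violated again (g = +0.5), yet the stale
# momentum keeps the update negative, so the multiplier stays at 0.
lam, velocity = dual_step(lam, velocity, g=0.5)
print(lam)  # 0.0 -- the multiplier fails to react to the violation
```

The same reasoning applies, with less certainty, to the running averages of the primal optimizer.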
juan43ramirez changed the title from "Restart dual_optimizer (and perhaps primal optimizer) state when performing dual restarts" to "Restart dual_optimizer state when performing dual restarts" on Aug 24, 2022
Perhaps we could maintain the primal optimizer's state.
What worries me is the momentum "towards satisfying the constraints" that the primal optimizer might have accumulated when reaching feasibility. Moreover, the running means may have accumulated possibly large values associated with $+ \lambda \nabla g$ (since $\lambda$ can be large at the moment the constraint is satisfied). This could bias the direction and aggressively shrink the magnitude of updates after a restart, when updates should mostly focus on the objective function.
That being said, (i) even if the momentum and running means are slightly misleading, they have been computed (and will keep being updated) from objective-heavy gradients, and (ii) I am not sure whether addressing these "issues" would have big practical implications.
Modifying the state of the dual optimizers based on the feasibility of the constraint is challenging in general. It is manageable for optimizers like SGD with momentum, but could become very difficult for generic optimizers, since the internal state might be "shared" across parameters. For example, an optimizer might keep track of correlations between the gradients of different parameters.
The practical implications of this misalignment between the optimizer state and the reset value of the multiplier are unclear to me (and I suspect they depend on the type of optimizer).
For now I would suggest (1) simply performing the value reset, (2) leaving the optimizer state untouched, and (3) documenting this pitfall explicitly in the Multiplier class.
References
Resetting the state of a PyTorch optimizer: https://discuss.pytorch.org/t/reset-adaptive-optimizer-state/14654/5
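Following the approach from the PyTorch thread above, a full reset could be sketched roughly as follows (assuming a single multiplier tensor and a plain SGD-with-momentum dual optimizer; this is not Cooper's actual API):

```python
from collections import defaultdict

import torch

# Hypothetical setup: one multiplier driven by SGD with momentum.
multiplier = torch.ones(1, requires_grad=True)
dual_optimizer = torch.optim.SGD([multiplier], lr=0.1, momentum=0.9)

# One dummy step to populate the momentum buffer in the optimizer state.
(multiplier * 2.0).sum().backward()
dual_optimizer.step()
assert len(dual_optimizer.state) > 0  # state now holds a momentum buffer

# Dual restart: reset the multiplier value to 0 ...
with torch.no_grad():
    multiplier.zero_()
multiplier.grad = None

# ... and, per the linked thread, wipe the optimizer state as well
# (momentum buffers, running averages), completing the "full" reset.
dual_optimizer.state = defaultdict(dict)
assert len(dual_optimizer.state) == 0
```

Re-instantiating the optimizer would achieve the same effect, but reassigning `state` keeps the existing `param_groups` (learning rate, momentum coefficient) intact.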