
Lagrange Multipliers for constraints with any minimizer #238

Open
wants to merge 1 commit into
base: fix_214

Conversation

tBuLi
Owner

@tBuLi tBuLi commented May 14, 2019

Goal of this PR is to add the option for Lagrange multipliers to Fit. The basic API is as follows:

x, y, l = parameters('x, y, l')
f, = variables('f')

model = Model({f: x + y})
constraints = {
    l: Eq(x ** 2 + y ** 2, 1),
}

fit = Fit(model, constraints=constraints)

When the constraints are provided as a dict, the keys are interpreted as Lagrange multipliers (parameters), and the values as the constraints in the usual way.

In the background, Fit will then determine the correct objective for f (or use the one provided by the user), and will build the gradient of the Lagrangian function L; see the wiki page. (For the example above the objective will be MinimizeModel, but in scenarios with data it will be LeastSquares instead.)

We then solve for the point where the gradient of the Lagrangian equals zero, instead of minimizing L directly. This is because in general L is not well behaved: it may have stationary values but then drop off to -inf, so direct minimization does not find the desired constrained extremum but runs off to -inf instead.
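The idea can be sketched outside symfit with plain sympy (an illustration of the approach, not the PR's implementation): for the circle-constrained example above, solving grad(L) = 0 recovers both stationary points on the constraint.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)

# Lagrangian for minimizing f = x + y subject to x**2 + y**2 = 1
L = x + y + lam * (x**2 + y**2 - 1)

# Stationary points: all partial derivatives of L vanish
grad_L = [sp.diff(L, v) for v in (x, y, lam)]
solutions = sp.solve(grad_L, (x, y, lam), dict=True)

# Two stationary points: the constrained minimum at x = y = -1/sqrt(2)
# (where f = -sqrt(2)) and the constrained maximum at x = y = 1/sqrt(2).
```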

This means we can introduce a new keyword to Fit: stationary_point. By default this is False, meaning we minimize the objective as usual. For any model, setting stationary_point=True will instead solve for a point where the gradient of the objective is zero. When the constraints are provided as a dict, stationary_point will be forced to True. This keyword seems to be beneficial in general, which is why I chose to expose it; see the Mexican hat test in this PR.
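To make the "drops off to -inf" point concrete, a small check (illustrative only, not code from the PR): for the circle example above, fixing lam = -1 and walking out along y = 0 sends L to minus infinity, so a minimizer applied to L directly would simply diverge.

```python
# L(x, y, lam) for the circle-constrained example above
def lagrangian(x, y, lam):
    return x + y + lam * (x**2 + y**2 - 1)

# Along y = 0 with lam = -1, L(t, 0, -1) = t - t**2 + 1 -> -inf as t grows,
# so minimizing L directly would just run off to -infinity.
vals = [lagrangian(t, 0.0, -1.0) for t in (1.0, 10.0, 100.0)]
# vals == [1.0, -89.0, -9899.0]
```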

To Do:

  • Add stationary_point keyword to Fit.
  • Build Lagrangian.
  • Add technical note to explain the content above for future reference
  • Write documentation
  • Include the second derivative test of the Lagrangian in FitResults when stationary_point=True, so we know the nature of the stationary point found. (This is done by checking the signs of the eigenvalues of the Hessian of the Lagrangian.)
  • Briefly investigate again the extension to inequality constraints, though that will probably have to be a future PR. The problem with this generalization is that many more tests have to be performed before we know the stationary point found is a valid one.
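One form the second-derivative test could take (a sketch of the idea in plain sympy; the PR's actual FitResults integration may differ): at a stationary point, check the eigenvalue signs of the Hessian of L taken with respect to the model parameters.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
L = x + y + lam * (x**2 + y**2 - 1)

# Hessian of the Lagrangian with respect to the model parameters (x, y)
H = sp.hessian(L, (x, y))

# At the stationary point x = y = -1/sqrt(2), lam = 1/sqrt(2), all
# eigenvalues are positive, so this point is a constrained minimum.
eigs = H.subs(lam, sp.sqrt(2) / 2).eigenvals()
```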

…general, but an absolute must in order to support Lagrange multipliers.
@tBuLi tBuLi changed the base branch from master to fix_214 May 14, 2019 08:51
@Jhsmit
Collaborator

Jhsmit commented May 14, 2019

Are you getting the stew or the classic vol-au-vent?

@tBuLi tBuLi changed the title Lagrange mult objective Lagrange Multipliers for constraints with any minimizer May 14, 2019
@tBuLi
Owner Author

tBuLi commented May 14, 2019

You guys had stew or vol-au-vent today? I had a hamburger with spinach, Popeye would approve.

@pckroon pckroon added the WIP Work in Progress label May 14, 2019
Collaborator

@pckroon pckroon left a comment


Great start.
I wonder whether it's guaranteed to find a stationary point, and how it's guaranteed that wherever nabla(L) = 0 the solution is feasible, but I guess that just means I should read up on my maths.

My main concern is that you construct the Lagrangian as a CallableNumericalModel, while I don't understand why it can't be a normal (analytical) Model.

:param bool absolute_sigma: True by default. If the sigma is only used
for relative weights in your problem, you could consider setting it
to False, but if your sigma are measurement errors, keep it at True.
Note that curve_fit has this set to False by default, which is
wrong in experimental science.
:param bool stationary_point: If ``True``, solve for a stationary value
of the objective, i.e. when the jacobian of the objective equals
zero, instead of minimizing the model itself. The default setting
Collaborator


model or objective

@@ -1274,6 +1282,13 @@ def __init__(self, model, *ordered_data, **named_data):
else:
self.model = Model(model)

if isinstance(constraints, dict):
# Split in multipliers and constraints, respecting order.
ordered_constraints = OrderedDict(**constraints)
Collaborator


I don't think you need to unpack a dict to turn it into an OrderedDict.

:param multipliers: list of :class:`symfit.core.argument.Parameter`
objects, to be interpreted as Lagrange multipliers :math:`\\lambda_j`.
:param constraint_objectives: list of objectives representing the
constraints :math:`g_{j}(\\vec{p})`.
Collaborator


Why not just keep these two (multipliers, constraint_objectives) as dict for the argument here?

model_dict = {grad_f: objective.eval_jacobian}
model_dict.update(
{grad_g_i: constraint_objectives[i].eval_jacobian for i, grad_g_i in
enumerate(grad_gs)})
Collaborator


closing bracket ) on the next line

{grad_g_i: constraint_objectives[i].eval_jacobian for i, grad_g_i in
enumerate(grad_gs)})
model_dict[grad_L] = grad_f + sum(
l_i * grad_g_i for l_i, grad_g_i in zip(multipliers, grad_gs))
Collaborator


Also here

connectivity_mapping = {grad_f: set(objective.model.free_params)}
connectivity_mapping.update(
{grad_g_i: set(objective.model.free_params) for grad_g_i in
grad_gs})
Collaborator


And here

grad_gs})
connectivity_mapping.update({grad_L: {grad_f, *grad_gs}})

grad_lagrangian = CallableNumericalModel(model_dict,
Collaborator


Why numerical? Don't you (also) have an analytical form of the lagrangian?

Owner Author


This is a key issue I'm having with this PR: this approach is indeed best defined for problems which have an analytical form. However, I'm encountering two problems with this. Apart from the obvious problem that not every model is analytical, the bigger issue is that something like our LeastSquares objective is not analytical, and the Lagrange multiplier terms should be added to that least-squares function, not to the model. That's a serious problem.

One solution that would keep this code very clean might be to say that this is only valid for cases where we need MinimizeModel instead, since there adding the multipliers is trivial. But that would mean that, at least for now, it will not work for least squares fitting with data, unless the user writes out the least squares function symbolically, which is completely fair game and would work. But I don't want to do that out of the box, because we cannot know how many dimensions we might have to sum over, etc.
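The "write out least squares symbolically" route could look like this with plain sympy (a hypothetical sketch for a small 1D data set; the constraint a + b = 3 is invented for illustration and is not from the PR):

```python
import sympy as sp

a, b, lam = sp.symbols('a b lam')
xdata = [0, 1, 2]
ydata = [1, 3, 5]   # generated by y = 2*x + 1

# Chi-squared written out symbolically over the (small) data set
chi2 = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(xdata, ydata))

# Lagrangian for a hypothetical equality constraint a + b = 3
L = chi2 + lam * (a + b - 3)
sols = sp.solve([sp.diff(L, v) for v in (a, b, lam)], (a, b, lam), dict=True)
# The data satisfy the constraint exactly, so lam = 0, a = 2, b = 1.
```

As the comment notes, this only stays tractable because the data set and its dimensionality are known up front, which is exactly what Fit cannot assume in general.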

Collaborator


I think for a first implementation it would be acceptable to limit it to MinimizeModel. We'll need to think about what to do with the other Objectives.

ydata = model(x=xdata, a=2, b=3).y

# Because this model falls off to -infinity, a normal minimizer won't work.
# This is especially true when bounds are provided.
Collaborator


Comment is not correct for this test?
