Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "xtol" and "ftol" stopping criteria for optimisers #1502

Open
MichaelClerx opened this issue Oct 31, 2023 · 11 comments · May be fixed by #1508
Open

Add "xtol" and "ftol" stopping criteria for optimisers #1502

MichaelClerx opened this issue Oct 31, 2023 · 11 comments · May be fixed by #1508
Assignees
Labels

Comments

@MichaelClerx
Copy link
Member

MichaelClerx commented Oct 31, 2023

Many implemented optimisers provide stopping criteria based on the difference between the last two parameter sets (abs(x[i] - x[i - 1]) or the last two function evaluations (abs(f[i] - f[i - 1])). When comparing methods from PINTS with other methods, it is useful to have something similar.

How do people define xtol and ftol?

Absolute versus relative

Occasionally, relative differences are used, e.g. abs(f[i] - f[i - 1]) / f[i].

Scalar xtol?

In nlopt, xtol can be a scalar, in matlab it looks like it needs to have size n_parameters.

AND vs OR

In scipy's Nelder-mead, xtol AND ftol must be met. Everywhere else they are independent stopping criteria https://stackoverflow.com/questions/43644797/what-is-xtol-for-in-minimizemethod-nelder-mead

Proposal

  1. Add xtol and ftol as separate, independent criteria (so OR, not AND)
  2. They will not be set by default, so users can enable them (or disable them) for methods and problems where they think this is appropriate
  3. Adding relative versions means doubling the workload, and dealing with the case where x or f has 0 at its optimum. The only benefit is that the user can write set_xtol(generic_tol) instead of set_xtol(abs_tol / sensible_scaling_value). I propose we do absolute only, leaving scaling to the user (who has the info needed for this decision)
  4. Similarly, allowing a scalar xtol is more implementation/documentation/tests/maintenance/room-for-bugs, all for the sake of a user writing set_xtol(tol) instead of set_xtol([tol] * n_parameters). So xtol always n_parameters, ftol always scalar.

Both matlab and scipy seem to be phasing out the "xtol"/"ftol" terminology (i.e. scipy is using xatol/ratol when relative, matlab is saying "StepTolerance"/"FunctionTolerance", so might follow that and

  1. Call the methods set_step_tolerance and set_function_tolerance. The current names are max_iterations, max_evaluations, max_unchanged_iterations, and threshold (stop if f < threshold). So could have set_step_tolerance and set_tolerance instead? Or set_min_step and set_min_change ?
@MichaelClerx MichaelClerx changed the title Add "xtol" and "ftol Add "xtol" and "ftol" stopping criteria for optimisers Oct 31, 2023
@MichaelClerx MichaelClerx self-assigned this Oct 31, 2023
@mirams
Copy link
Member

mirams commented Oct 31, 2023

I'm happy with those proposals, doing absolute ones makes more sense as relative would presumably be with respect to initial guess or something which would be a bit strange to vary run to run. Haven't ever seen a package specify a vector of xtol, but it makes sense, maybe I have just been automatically doing [tol]*n_params behind the scenes, but a nice friendly message saying "the xtol should be a vector of length n_parameters specifying separate tolerances for each parameter" would make it easy enough to see what needs to be done.

I think matlab's optimset (common optimisation options object used by a range of optimisers) is the thing to look at, that uses 'TolX' and 'TolFun' nowadays: optimset docs
image

Although, the matlab optimisation toolbox itself has a different optimoptions which gets the options for a particular optimiser, they seem to more commonly use ObjectiveLimit (for "got to top of Ben Nevis" stopping), OptimalityTolerance or FunctionTolerance for TolF and StepTolerance for XTol, presumably based on how far in parameter space it moved (might even be a Euclidean distance or something?):

                MaxIterations: 1000
               ObjectiveLimit: -1.0000e+20
          OptimalityTolerance: 1.0000e-06
                    OutputFcn: []
                      PlotFcn: []
                 ScaleProblem: 0
    SpecifyConstraintGradient: 0
     SpecifyObjectiveGradient: 0
                StepTolerance: 1.0000e-10

@MichaelClerx
Copy link
Member Author

OK, so with our threshold then we can match all 3. That's good!

@MichaelClerx
Copy link
Member Author

@mirams @martinjrobins @chonlei discussion point: The ftol is essentially the same as set_max_unchanged_iterations with iterations=1.

Options:

  1. Don't implement a seperate ftol criterion
  2. Implement a separate ftol criterion
  3. Implement a method set_function_tolerance that internally calls set_max_unchanged_iterations (and document that this is what it does)

Input from anyone else doing optimisation appreciated!

@MichaelClerx
Copy link
Member Author

There is a case for having both I think, if you want to use the ftol as a convergence test for something that is expected to be asymptotic at the end but keep that max-unchanged to detect being stuck

@chonlei
Copy link
Member

chonlei commented Nov 14, 2023

I am not too sure. For option 3, it is like having an alias to set_max_unchanged_iterations with argument iterations=1, right? If it is what it is, then might be the easiest workaround?

@mirams
Copy link
Member

mirams commented Nov 14, 2023

Yeah, an argument for just wrapping, so ftol(tol) = set_max_unchanged_iterations(iterations=1, tol). I guess the problem there is that it somewhat needs all the optimisers to agree what an 'iteration' means. I had imagined that you'd want to be usually wrapping the ftol() function provided by off-the-shelf optimisers, and I imagine different optimisers can and will define "over how many function evals/iterations/whatever" depending on how they work and are known to converge. Whereas I guess the max_changed_iterations is something we're implementing ourselves?

I think confusion could then arise if the user can call both, but perhaps not, might even be a good idea to be able to set a collection of these like

set_ftol(1e-7) // let the optimiser decide how to implement
set_max_unchanged_iterations(iterations=1, tol=1e-12)
set_max_unchanged_iterations(iterations=10, tol=1e-10)
set_max_unchanged_iterations(iterations=100, tol=1e-8)

and stop the first time one is met!! But it is a bit crazy.

@MichaelClerx
Copy link
Member Author

MichaelClerx commented Nov 14, 2023

I guess the problem there is that it somewhat needs all the optimisers to agree what an 'iteration' means. I had imagined that you'd want to be usually wrapping the ftol() function provided by off-the-shelf optimisers, and I imagine different optimisers can and will define "over how many function evals/iterations/whatever" depending on how they work and are known to converge

No we ignore what the optimisers say so that we can use the same controller to run optimisations with them! They use different number of evaluations per iteration, that's fine :D


To clarify the "who implements what" thing a bit further:

  1. All optimisers in PINTS use an ask-and-tell interface, [one call to ask() then tell()] = one iteration. For PINTS we don't care if that's unfair for comparisons as that's not the goal, the goal is just to make a few useful ones available.
  2. The OptimisationController implements functionality shared by optimisers, so that the Optimiser classes themselves can stay clean. This includes optimisation of stopping criteria such as max unchanged iterations, max iterations, max evaluations, function threshold, and after this ticket also xtol (but perhaps not ftol).
  3. Users who want complicated stopping criteria can forgo the controller and use ask-and-tell on the Optimiser classes directly. This is our main mechanism for limiting the scope of PINTS and stop it from becoming totally unmanageable for the sake of obscure/advanced use cases 😄
  4. The ask-and-tell requirement plus the goal of staying pure python means we can't wrap much. When we do wrap (only CMAES at the moment), we disable as many of the optimiser's own criteria, so that the user is presented with a single, unified interface for setting stopping criteria.
  5. However, the Optimisation class has a method stop() https://pints.readthedocs.io/en/stable/optimisers/base_classes.html#pints.Optimiser.stop that it can use to tell the Controller when it's stuck (e.g. an ill-conditioned matrix or some unrecoverable error). There is nothing stopping people from adding new options to an Optimiser class that would re-enable its native stopping criteria (you would then set this using controller.optimiser().set_something_or_other(...))

@MichaelClerx
Copy link
Member Author

might even be a good idea to be able to set a collection of these like

That's an interesting idea! Would require some laborious unsetting syntax though, e.g. unset_unchanged_iterations with the exact same arguments, or having set return some object you can then pass in to unset.

Will leave that for now as we don't have a direct use case :D

@MichaelClerx
Copy link
Member Author

But both leaning towards option 2 then?

@MichaelClerx MichaelClerx changed the title Add "xtol" and "ftol" stopping criteria for optimisers Add "xtol" ~and "ftol"~ stopping criteria for optimisers Nov 21, 2023
@MichaelClerx MichaelClerx changed the title Add "xtol" ~and "ftol"~ stopping criteria for optimisers Add "xtol" and "ftol" stopping criteria for optimisers Nov 21, 2023
@MichaelClerx
Copy link
Member Author

Alright, will add a max_unmoved_iterations then so it has the same syntax as max_unchanged_iterations.

@MichaelClerx
Copy link
Member Author

MichaelClerx commented Dec 5, 2023

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants