
Draft: I1083 l bfgs optimiser #1149

alisterde wants to merge 17 commits into main
Conversation

@alisterde (Member) commented May 24, 2020

Hi @ben18785 and @MichaelClerx,

I've spent the last two and a bit weeks looking at the BFGS and L-BFGS algorithms for issue #1083. The L-BFGS part is implemented; if you use a very large value of 'm' (the number of correction vectors stored for the inverse Hessian) it reduces to plain BFGS. I would appreciate it if you could check that I've got the Hessian update correct. There are two implementations: one using scipy's line search algorithms (BFGS_scipy) and another using the Hager-Zhang algorithm.
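For reference, here's a minimal sketch of the standard L-BFGS two-loop recursion (Nocedal & Wright, Algorithm 7.4) that the 'm' correction vectors feed into; the function name and argument layout are illustrative, not the API in this PR:

```python
import numpy as np

def lbfgs_direction(gradient, s_list, y_list):
    # s_k = x_{k+1} - x_k and y_k = grad_{k+1} - grad_k, newest pairs
    # last; at most m pairs are kept, so memory stays O(m * n).
    q = gradient.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest
    for s, y, rho in zip(s_list[::-1], y_list[::-1], rhos[::-1]):
        alpha = rho * np.dot(s, q)
        alphas.append(alpha)
        q = q - alpha * y
    # Initial inverse Hessian H0 = gamma * I (standard scaling choice)
    gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    r = gamma * q
    # Second loop: oldest pair to newest
    for s, y, rho, alpha in zip(s_list, y_list, rhos, alphas[::-1]):
        beta = rho * np.dot(y, r)
        r = r + (alpha - beta) * s
    return -r  # the quasi-Newton search direction
```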

The optimiser isn't working yet, and I'm fairly sure the problem is with the line search. I'm having a lot of trouble with that component: the scipy line search fails to find an acceptable step size from the very first step on the toy logistic model, and the Hager-Zhang search doesn't find an acceptable step either. Since both fail in the same way, though, the problem may instead be the initialisation of the first inverse Hessian and Newton direction.

I've tried writing up the Hager-Zhang algorithm, slightly adjusted for ask/tell, and I can't quite see where I'm going wrong. I've checked it against an implementation in TensorFlow Probability (https://github.com/tensorflow/probability/blob/v0.10.0/tensorflow_probability/python/optimizer/linesearch/hager_zhang.py) and the two seem to perform the same steps, at the same time, in the same order. Stan and scipy don't use this algorithm; they use an older method called the Moré-Thuente line search. The Hager-Zhang algorithm has been shown to perform better.
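For reviewers: both line searches are ultimately trying to return a step size satisfying the Wolfe conditions (Hager-Zhang also accepts an "approximate Wolfe" variant near a minimiser). A minimal sketch of the standard check, with illustrative names:

```python
import numpy as np

def wolfe_ok(f0, g0, f_new, g_new, alpha, d, c1=1e-4, c2=0.9):
    # phi(alpha) = f(x + alpha * d); slope0 is phi'(0) and must be < 0
    slope0 = np.dot(g0, d)
    armijo = f_new <= f0 + c1 * alpha * slope0    # sufficient decrease
    curvature = np.dot(g_new, d) >= c2 * slope0   # enough flattening
    return armijo and curvature
```

If this never passes on the very first iteration, the initial search direction d (and hence the initial inverse Hessian) is a likely suspect, which would fit the symptoms above.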

I've also tried using the line search algorithms in scipy, but I'm having issues there too. Note that scipy doesn't use them for its own L-BFGS; it wraps FORTRAN code. In fact, very few people seem to have implemented the BFGS algorithms directly; most wrap FORTRAN libraries.
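As a standalone sanity check, scipy's pure-Python Wolfe search (scipy.optimize.line_search) can be run in isolation on a well-scaled quadratic, separate from the PINTS wrapper in this PR:

```python
import numpy as np
from scipy.optimize import line_search

f = lambda x: 0.5 * np.dot(x, x)   # simple bowl, minimum at the origin
grad = lambda x: x

x0 = np.array([3.0, -2.0])
d = -grad(x0)                      # steepest-descent direction
alpha, fc, gc, f_new, f_old, slope_new = line_search(f, grad, x0, d)
print(alpha)                       # ~1.0: the exact step for this quadratic
```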

Anyway, I'd appreciate it if you could check the BFGS part of what I've written (it's in the tell() function of either implementation, though BFGS_scipy is clearer, and it updates the Hessian). Also, please let me know if you have any ideas about the line search.

When this is done, I was thinking the majority of the code could go in the abstract class LineSearchBasedOptimiser, as the only difference between BFGS, DFP, and conjugate gradient optimisation is the approximation of the inverse Hessian.
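Roughly, I picture the split looking something like this (only a sketch; apart from pints.Optimiser, every name here is a placeholder):

```python
import pints

class LineSearchBasedOptimiser(pints.Optimiser):
    """Shared machinery: line search and curvature-pair bookkeeping."""

    def _update_inverse_hessian(self, s, y):
        # Each subclass supplies its own inverse Hessian approximation
        raise NotImplementedError

class BFGS(LineSearchBasedOptimiser):
    def _update_inverse_hessian(self, s, y):
        ...  # full dense BFGS update

class LBFGS(LineSearchBasedOptimiser):
    def _update_inverse_hessian(self, s, y):
        ...  # store at most m (s, y) pairs for the two-loop recursion
```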

I'm not going to be working on this for the next few weeks, as I need to write up my project report (I've been working too much on PINTS and have neglected it).

Best,
Alister

@codecov (bot) commented May 24, 2020

Codecov Report

Merging #1149 (4369b39) into master (a06ab88) will decrease coverage by 3.13%.
The diff coverage is 11.28%.


@@             Coverage Diff             @@
##            master    #1149      +/-   ##
===========================================
- Coverage   100.00%   96.86%   -3.14%     
===========================================
  Files           71       84      +13     
  Lines         7543     9050    +1507     
===========================================
+ Hits          7543     8766    +1223     
- Misses           0      284     +284     
Impacted Files Coverage Δ
pints/_optimisers/__init__.py 60.12% <8.79%> (-39.88%) ⬇️
pints/_optimisers/_lbfgs.py 24.44% <24.44%> (ø)
pints/__init__.py 100.00% <100.00%> (ø)
pints/_util.py 100.00% <0.00%> (ø)
pints/noise.py 100.00% <0.00%> (ø)
pints/_logger.py 100.00% <0.00%> (ø)
pints/_log_pdfs.py 100.00% <0.00%> (ø)
pints/_log_priors.py 100.00% <0.00%> (ø)
pints/_diagnostics.py 100.00% <0.00%> (ø)
... and 31 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a06ab88...4369b39.

@alisterde (Member, Author)

I've made a rookie mistake and used Python 3's function annotations to specify variable types; Python 2 doesn't support them, so I'll fix that before I next commit. The docs are failing because I haven't included the added files and functions in the rst files yet (I'm not sure which implementation we'll eventually keep).
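For anyone hitting the same failure, the culprit is annotation syntax like this (the method name is hypothetical):

```python
# Python 3 only -- a SyntaxError on Python 2:
def set_max_correction_vectors(self, m: int) -> None:
    self._m = m

# Python 2 and 3 compatible -- document the type instead:
def set_max_correction_vectors(self, m):
    """Sets the number of correction vectors ``m`` (an integer)."""
    self._m = int(m)
```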

@MichaelClerx (Member)

Thanks @alisterde! It looks like a huge amount of work has already gone into this.
I'll try to carve out some time to look through it in detail within the next few weeks, so from my point of view it's quite ideal that you'll be busy with something else 😅

One thing that struck me when you described the hard-to-find bug: could this be an issue of scale or tuning? Many methods don't seem to work at all unless you tune them right, e.g. by picking the correct step sizes. If you were testing on the logistic problem, you may also have noticed that it has massive differences in parameter magnitudes, which can throw some methods off; so it's best to start on e.g. an almost symmetrical parabola in 2d parameter space.
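Something along these lines, i.e. a near-symmetric bowl with both parameters on the same scale (a sketch; the Parabola class is hand-rolled here rather than a pints.toy problem, and XNES is just a stand-in method):

```python
import numpy as np
import pints

class Parabola(pints.ErrorMeasure):
    """f(x) = x0^2 + 1.1 * x1^2: almost symmetric, minimum at (0, 0)."""
    def n_parameters(self):
        return 2
    def __call__(self, x):
        return x[0] ** 2 + 1.1 * x[1] ** 2

# Run any optimiser on it from a sensible starting point
opt = pints.OptimisationController(
    Parabola(), np.array([5.0, -5.0]), method=pints.XNES)
opt.set_max_iterations(200)
x_best, f_best = opt.run()
```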

@MichaelClerx (Member)

Wondering if we should split the line search parts off into a separate PR?

@@ -279,6 +279,29 @@ def set_hyper_parameters(self, x):
self.set_population_size(x[0])


class LineSearchBasedOptimiser(Optimiser):

I'm not 100% sure this is the best way to do it yet, but let's leave all that till after we have a working prototype!

@MichaelClerx (Member)

This looks like it could be useful: http://www.caam.rice.edu/~yzhang/caam554/pdf/cgsurvey.pdf

@alisterde (Member, Author)

Thanks @MichaelClerx and @ben18785,
It could definitely be an issue of scale or tuning; I hadn't thought of that, so I'll start by trying it!

@MichaelClerx (Member)

Hi @alisterde! I'd like to pick this up again at some point, as I think it's a really good basis for further optimisation work :-)
Is this PR up to date with the latest stuff you did, or are there still some uncommitted things you've worked on?

@alisterde (Member, Author) commented Nov 3, 2020 via email

@alisterde (Member, Author)

Hi @MichaelClerx,

I've taken another look at this and done the following:

  • Split the line search and L-BFGS optimiser apart, so that the line search can be used with other gradient-based optimisers.
  • Addressed the test failures caused by using some Python 3 only features.
  • Added a basic example notebook.
  • Added a stopping criterion so that any line-search-based algorithm terminates when the gradients at the best-fit point are equal to zero (< 1e-6; see the sketch after this list).
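A minimal sketch of that stopping test (the function name is illustrative):

```python
import numpy as np

def gradient_converged(gradient, tol=1e-6):
    # Treat the gradient as 'equal to zero' once its largest
    # component is below the absolute tolerance
    return np.linalg.norm(gradient, ord=np.inf) < tol
```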

The line search algorithm needs improvement to fit better into the ask/tell framework.
I've left a print statement in that tells the user when the Hessian has been updated; otherwise it looks very much like the algorithm is stuck when it isn't. I'd like to know your thoughts on this.
There's a large commented-out function at the bottom of the line search class called __secant2. This function was moved to be within ask() and passed to tell() at key points, but I've left it in as I think the algorithm is easier to understand written as a single function; I don't mind getting rid of it, though.
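For context, the core of __secant2 is the Hager-Zhang secant step, which finds the zero of a linear interpolant of the directional derivative phi'(alpha) between the bracket ends (this is the textbook formula, not the exact code in the PR):

```python
def secant(a, b, dphi_a, dphi_b):
    # Zero of the linear interpolant of phi' between alpha = a and
    # alpha = b; secant2 applies this twice, then shrinks the bracket.
    return (a * dphi_b - b * dphi_a) / (dphi_b - dphi_a)
```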

I have not added tests yet.
The implementation seems quite slow.
It will probably benefit from the transformation features in the most recent PINTS.

Tests are all passing in my pre-commit checks, but they appear to fail on committing due to an import problem with the 'curve_fit' functions in Python 2.7; I haven't been able to work out how to fix this.

Let me know how you want to proceed,
Alister
