Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corrected, improved and cleaned BackTracking #172

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

AlexandreMagueresse
Copy link

@AlexandreMagueresse AlexandreMagueresse commented Nov 13, 2023

There is a small mistake in BackTracking.jl when computing α_tmp with the cubic interpolation: when the leading coefficient is close to zero, the expression of α_tmp misses a negative sign, and in the other case when the leading coefficient is not zero, the expression of α_tmp is prone to numerical cancelling.

Indeed, the interpolating polynomial reads $P(X) \doteq a X^3 + b X^2 + \phi'(0) X + \phi(0)$ for some coefficients $a$ and $b$ that are the solution to a linear system obtained from the conditions $P(\alpha_1) = \phi(\alpha_1)$ and $P(\alpha_2) = \phi(\alpha_2)$. The extrema of $P$ are reached when its derivative is zero, that is $P'(X) = 3 a X^2 + 2 b X + \phi'(0) = 0$.

  • If $a$ is equal or close to zero, the root of $P'$ is $$\alpha = -\dfrac{\phi'(0)}{2b}.$$ The negative sign is missing in the current code.
  • Otherwise, following Nocedal & Wright, the next candidate is given by $$\alpha = \dfrac{-b + \sqrt{b^2 - 3 \phi'(0) a}}{3 a}.$$ However, this formula is prone to numerical cancelling, so it should be safer to multiply by the conjugate quantity and rewrite $\alpha$ as $$\alpha = -\dfrac{\phi'(0)}{b + \sqrt{b^2 - 3 \phi'(0) a}}.$$ This expression can also be used when $a = 0$.

I also made some edits for clarity and performance:

  • The constructor now checks all the parameters and sets appropriate default values.
  • The assertion order in (2, 3) at l. 44 is now made in the constructor of BackTracking rather than in the line search function.
  • The initial step size is α_0, consistently with α_1 and α_2, and with the other implementations of (ls::BackTracking)(...). The variables ϕx_0 and ϕx_1 are now ϕ_1 and ϕ_2 respectively in the whole file. This is more consistent with the notation ϕ_0.
  • The computation of powers of α_1 is slightly improved, and ϕx_0 - ϕ_0 - dϕ_0 * α_1 is only evaluated once. I made similar edits for α_2. These are very minor improvements.

@pkofod
Copy link
Member

pkofod commented Jan 16, 2024

Thank you for a thorough explanation and sorry for missing this in a very busy fall for me. I'll review it. The diff is a bit hard to follow because everything was done in one commit and a lot of lines change place (from line search time to constructor time), but I'll do my best :)

@pkofod
Copy link
Member

pkofod commented Jan 16, 2024

I'm sure this is not always easy, but maybe you can "work your way backwards" on an example to generate a test? I'm not going to put this as a hard requirement, but .. if you had an idea for a good test set. I know testing iterative methods can be difficult for corner cases, but maybe you could come up with a case where it hit it in the first iteration?

@AlexandreMagueresse
Copy link
Author

Hi @pkofod and thank you for taking a look at my pull request. I have been trying to come up with an example in which the current implementation and my revision of BackTracking differ. Here is one example I found:

using LineSearches

ϕ(x) = -sin(x) / x
dϕ(x) = -(cos(x) * x - sin(x)) / x^2
ϕdϕ(x) = ϕ(x), dϕ(x)

α0 = 1.0f5
x0 = -3.0
ϕ0, dϕ0 = ϕ(x0), dϕ(x0)

ls = BackTracking()
res = ls(ϕ, dϕ, ϕdϕ, α0, ϕ0, dϕ0)
println(ls, ": ", res)

The current implementation finds a step of 8.431871202806112 and the corresponding value is -0.09933943877214975 for $\phi$, while my revision finds a step of 2.1840602099471726 and the corresponding value is -0.3744282685822022.

@pkofod
Copy link
Member

pkofod commented Feb 2, 2024

Hi @AlexandreMagueresse ! I have not forgotten you. I will review this and @timholy 's other PR (you're also both welcome to give comments on each others' PR :) ) I just had a busy week and will have a busy weekend, but I promise to get back to you soon.

ρ_lo::TF
iterations::TI
order::TI
maxstep::TF
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the rest of the package uses 4-space indentation, this should too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing this out. The last commit solves this issue.

Comment on lines 35 to 40
msg = """
The upper bound for the backtracking factor has to lie between
0 and 1.
Setting ρ_hi = $(ρ_hi).
"""
warn(msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
msg = """
The upper bound for the backtracking factor has to lie between
0 and 1.
Setting ρ_hi = $(ρ_hi).
"""
warn(msg)
@warn """
The upper bound for the backtracking factor has to lie between
0 and 1.
Setting ρ_hi = $(ρ_hi).
"""

warn isn't a function anymore? Did you test this? With @warn you can use @test_logs.

Same issue below. These really need tests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have indeed not tested these checks. We could write some more tests to cover them. I replaced warn with @warn in my last commit.


# Evaluate f(x) at proposed position
ϕx_0, ϕx_1 = ϕx_1, ϕ(α_2)
# Ensure termination
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think this will cause the whole optimize call to fail with this error in Optim.jl, right?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. This is already what happens in the current implementation. I just moved current lines 77-81 at the end of the loop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants