Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hager-Zhang converges incorrectly within Optim.ConjugateGradient due to flatness check #175

Open
kbarros opened this issue May 13, 2024 · 2 comments

Comments

@kbarros
Copy link

kbarros commented May 13, 2024

I found a case where the ConjugateGradient method of Optim.jl terminates early, at a non-stationary point, without raising an error. After some digging, the problem seems associated with LineSearches. I can't tell whether the actual bug is in in LineSearches, or in the way that Optim calls the Hager-Zhang line search. Your help in diagnosing this would be greatly appreciated.

Below is a simplified example illustrating how Hager-Zhang fails to converge correctly, given the (effective) parameters that are provided by Optim.jl:

using LineSearches

ϕ(c) = 3.042968312396456 - 832.4270136930788*c - 132807.15591801773*c^2 + 7.915421661743959e6*c^3 - 1.570284840040962e8*c^4 + 1.4221708747294645e9*c^5 - 5.970065591205604e9*c^6 + 9.405512899903618e9*c^7
(c) = -832.4270136930788 - 265614.31183603546*c + 2.3746264985231876e7*c^2 - 6.281139360163848e8*c^3 + 7.110854373647323e9*c^4 - 3.582039354723362e10*c^5 + 6.5838590299325325e10*c^6

function ϕdϕ(c)
    println("ϕ($c)=$(ϕ(c)), dϕ($c)=$((c))")
    (ϕ(c), (c))
end

c0 = 0.2
ϕ0, dϕ0 = ϕdϕ(0)
println("  ^ Is it valid to evaluate these away from c0=0.2?")

ls = HagerZhang()
res = ls(ϕ, dϕ, ϕdϕ, c0, ϕ0, dϕ0)

The printed output of this code is:

ϕ(0)=3.042968312396456, dϕ(0)=-832.4270136930788
  ^ Is it valid to evaluate these away from c0=0.2?
ϕ(0.2)=3.117411287254072, dϕ(0.2)=-505.33622492291033
ϕ(0.1)=-3.503584823341612, dϕ(0.1)=674.947830358913
ϕ(0.055223623837016025)=0.5244246783084083, dϕ(0.055223623837016025)=738.3388472434362

Note that the final value c=0.055... is not a zero point of the derivative .

The HZ line search terminates early because this "flatness" condition is hit: https://github.com/JuliaNLSolvers/LineSearches.jl/blob/master/src/hagerzhang.jl#L282-L285

However, it's not clear to me whether the underlying problem is the HZ flatness condition, or the way that the call to method.linesearch! is made in Optim: https://github.com/JuliaNLSolvers/Optim.jl/blob/master/src/utilities/perform_linesearch.jl#L41-L60).

Is it OK that the values of phi_0, dphi_0 were calculated for alpha == 0 and not state.alpha == 0.2?

To fix the bug, therefore, it seems that there are two possibilities:

  1. The HZ flatness condition needs to be modified.
  2. In Optim, prior to calling method.linesearch!, the values of phi_0, dphi_0 should be recalculated for the newly guessed state.alpha.

Thanks in advance.

@pkofod
Copy link
Member

pkofod commented May 14, 2024

See also #174

cc @mateuszbaran seems like it's another problem related to the flatness detection

@kbarros
Copy link
Author

kbarros commented May 14, 2024

Thanks, this looks very similar, I will move the discussion there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants