
Easier auto-differentiation support? #128

Open · jlperla opened this issue Feb 28, 2019 · 12 comments

jlperla commented Feb 28, 2019

Right now you need to manually set up auto-differentiation for functions, which is more cumbersome than in other packages (e.g. Optim, or JuMP directly). Would you be willing to entertain a PR to make this a little easier to work with?

You can see https://github.com/ubcecon/computing_and_datascience/blob/master/julia_tutorials/nlopt/nlopt-tutorial.ipynb for an example of what it might look like. Basically, you could just call an adapter on functions and pass that object directly into your existing functions.

The downside is that this would require adding https://github.com/JuliaNLSolvers/NLSolversBase.jl as a dependency with the following sub-dependencies: https://github.com/JuliaNLSolvers/NLSolversBase.jl/blob/master/REQUIRE

mlubin (Member) commented Mar 1, 2019

Another idea to throw out there. We have an abstraction for NLP solvers in MathOptInterface with multiple backends, e.g., Ipopt and Knitro. (NLopt has a wrapper for the old MathProgBase API but not yet MathOptInterface.) If you target your adapters to MathOptInterface instead of NLopt directly then you get multiple backends without much extra cost.

jlperla (Author) commented Mar 1, 2019

@mlubin I really hope that NLopt catches up with MOI soon so that I can swap out Ipopt when using JuMP.

Maybe I am misunderstanding, but if you are suggesting that, given MOI, we would instead use it directly for calling the optimizer, I am not sure I see it as it currently stands. The complexity of MOI is necessary to drive an EDSL like JuMP, but it is very heavy for those of us who just want to call a nonlinear solver and pass in arbitrary functions of vectors. Perhaps JuMP will become a little more friendly for those who want nonlinear solvers without a DSL, or perhaps there is another interface built on MOI for people like me?

Just so you know, the total code required to add Optim-style AD support to NLopt is just

using NLSolversBase

struct NLoptAdapter{T<:AbstractObjective} <: Function
    nlsolver_base::T
end

# implement fg!; note that the argument order is reversed relative to NLSolversBase
(adapter::NLoptAdapter)(x, df) = adapter.nlsolver_base.fdf(df, x)
(adapter::NLoptAdapter)(result, x, jacobian_transpose) = adapter.nlsolver_base.fdf(result, jacobian_transpose', x)

# constructors
NLoptAdapter(f, x, autodiff = :forward) = NLoptAdapter(OnceDifferentiable(f, x, autodiff = autodiff))
NLoptAdapter(f!, x::Vector, F::Vector, autodiff = :forward) = NLoptAdapter(OnceDifferentiable(f!, x, F, autodiff = autodiff))
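For concreteness, a minimal sketch of how such an adapter would plug into the existing NLopt API (the objective and algorithm are arbitrary illustrative choices, and the adapter definition is repeated from above so the snippet stands alone; only the objective path is shown):

```julia
using NLopt, NLSolversBase

# the adapter as proposed above (objective case only)
struct NLoptAdapter{T<:AbstractObjective} <: Function
    nlsolver_base::T
end
(adapter::NLoptAdapter)(x, df) = adapter.nlsolver_base.fdf(df, x)
NLoptAdapter(f, x, autodiff = :forward) = NLoptAdapter(OnceDifferentiable(f, x, autodiff = autodiff))

# usage: an AD-backed objective passed straight to min_objective!
f(x) = sum(abs2, x .- 1)          # minimum at [1, 1]
x0 = zeros(2)
opt = Opt(:LD_LBFGS, 2)
min_objective!(opt, NLoptAdapter(f, x0))
minf, minx, ret = optimize(opt, x0)
```

Since NLoptAdapter subtypes Function and has the (x, grad) calling convention, it can be passed anywhere NLopt expects an objective callback.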

stevengj (Member) commented Mar 1, 2019

Something along these lines seems reasonable to me.

jlperla (Author) commented Mar 2, 2019

Great. @arnavs will prepare a full PR for this with the necessary tests, documentation, etc.

At this point, we think that this will require no direct changes to the NLopt code itself (though you may have ideas of how it could be reorganized). By using NLSolversBase as a dependency, we may be able to hook in other AD implementations if they are added to that library (@arnavs may also try to add Flux.jl or Zygote.jl support to NLSolversBase in a separate issue).

(1) We don't add an exported wrapper type and instead add arguments to the functions themselves, i.e., add methods shadowing the existing ones so it would look like:

# A setup from our existing tests
M = 5000
x0 = fill(1.0, M)
m = range(2, 5, length = M) 
f(x) = sum((x .- m).^2) + sum(log.(x))

# define the optimization problem
opt = Opt(:LD_LBFGS, M)
min_objective!(opt, f, autodiff=:forward)

The list of additional methods would be something like

min_objective!(opt::Opt, f::Function, autodiff)
max_objective!(opt::Opt, f::Function, autodiff)
inequality_constraint!(opt::Opt, fc, autodiff, tol = 0)
equality_constraint!(opt::Opt, h, autodiff, tol = 0)
inequality_constraint!(opt::Opt, c, tol::AbstractVector, autodiff)
equality_constraint!(opt::Opt, c, tol::AbstractVector, autodiff)

where autodiff would support the NLSolversBase symbols in https://github.com/JuliaNLSolvers/NLSolversBase.jl/blob/master/src/NLSolversBase.jl#L55
(i.e. :forward, :finite or :finiteforward, I think?).

(2) Alternatively, the less invasive option is to have a publicly exposed wrapper:

# A setup from our existing tests
M = 5000
x0 = fill(1.0, M)
m = range(2, 5, length = M) 
f(x) = sum((x .- m).^2) + sum(log.(x))

# define the optimization problem
opt = Opt(:LD_LBFGS, M)
f_opt = NLoptDifferentiable(f, x0) # The only new code!
min_objective!(opt, f_opt)

For the name of the adapter:

  • Optim.jl has some algorithms which use Hessians, and hence distinguishing between OnceDifferentiable and TwiceDifferentiable is necessary there. It didn't appear that the Julia interface supports any algorithms with Hessians in the objective. Did I miss anything?
  • If not, then NLoptDifferentiable or even Differentiable would work (we don't want to collide with NLSolversBase.OnceDifferentiable).

@stevengj If you have no strong opinions, I think we should first try to add it directly to those functions in the interface rather than via wrappers? That would be the easiest for end-users.

jlperla (Author) commented Mar 2, 2019

@arnavs: A few notes:

  • If we try to make new methods for min_objective!, etc., we will need to deal with the fact that we cannot construct a NLSolversBase.OnceDifferentiable without having a vector of the right type to pass in. OnceDifferentiable uses it to create a cache, so it is the type that matters rather than the values in the vector.
  • For this, I think you can query the opt::Opt for the dimension of the objective that was proposed and create a zeros(Float64, ...) to pass in. The tol::AbstractVector for the vector inequality and equality constraints should be useful for determining the Jacobian dimensions, and the univariate inequality and equality constraints should not be too difficult to figure out.
  • The code for creating min_objective! and max_objective! is here: https://github.com/JuliaOpt/NLopt.jl/blob/master/src/NLopt.jl#L426, which is basically a proxy for the ccall into the underlying library. You could create min_objective! etc. methods that construct and call these outside of that fancy @eval, and then just call the existing methods.
  • For the unit tests, it is essential to try out all of the variations on finite differences and forward differences for the univariate objective, multivariate objective, univariate equality constraint, univariate inequality constraint, multivariate equality constraint, and multivariate inequality constraint. You can construct artificially simple optimization problems for those tests.
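The dimension query in the second bullet can be sketched as follows (a hedged sketch: it assumes ndims(opt) returns the dimension registered with the Opt, which NLopt.jl supported at the time, and uses an arbitrary objective):

```julia
using NLopt, NLSolversBase

opt = Opt(:LD_LBFGS, 3)
n = ndims(opt)                    # dimension registered with this Opt
x_seed = zeros(Float64, n)        # only the element type/shape matter: it seeds the cache
od = OnceDifferentiable(x -> sum(abs2, x), x_seed; autodiff = :forward)
```

The seed vector's values are never used as a starting point; OnceDifferentiable only needs it to allocate correctly-typed work buffers.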

mlubin (Member) commented Mar 2, 2019

This thread is moving in the direction of adding direct support for AD to NLopt, which is fine with me. But anyway, to address the point on MOI:

Maybe I am misunderstanding, but if you are suggesting that given MOI we would instead use it directly for calling the optimizer, I am not sure I see it as it currently stands. The complexity of MOI is necessary to drive a EDSL like JuMP, but it is very heavy for those of us that just want to call a nonlinear solver and pass in arbitrary functions of vectors.

Yes, I was suggesting to call the optimizers via MOI. Yes, there are a few more lines of code required, but if you're already in the business of designing APIs for nonlinear optimization (as discussed in this thread), then I don't expect it to be a significant challenge. There are a number of people familiar with the MOI API who can answer questions at https://gitter.im/JuliaOpt/JuMP-dev. I also have an old demo notebook of connecting Newton's method to MathProgBase: https://nbviewer.jupyter.org/github/JuliaOpt/juliaopt-notebooks/blob/master/notebooks/MathProgBase%20Newton%20solver.ipynb. The API for interacting with derivatives is mostly unchanged from MPB to MOI.

The main extra things that MOI supports over the Optim/NLopt-type interfaces are:

  • Sparse Hessian matrices
  • Separate handling of linear, quadratic, and other classes of constraints.

I understand that a lot of use cases for nonlinear optimization don't need to think about sparse Hessian matrices or linear constraints. Ultimately it's up to you to decide if the extra complexity of the abstraction is worth it.

Perhaps JuMP will become a little more friendly for those who want nonlinear solvers without a DSL, or perhaps there is another interface built on MOI for people like me?

The JuliaSmoothOptimizers organization has built a good amount of infrastructure that could be useful. As far as I know it hasn't been updated for MOI yet, though. In any case, the best way for a simpler NLP interface to be built on top of MOI is for someone invested in the result to write the code. It's a question of a couple of days, not weeks or months.

jlperla (Author) commented Mar 3, 2019

but if you're already in the business of designing APIs for nonlinear optimization (as discussed in this thread)

That is close to the last thing that I want to do. My only goal here is to hack AD support into NLopt to get by until there are interfaces that people like me want to use. To me, the gold standard is https://tomopt.com/tomlab/ : pass in functions for the constraints and objectives, matrices (or matrix-free operators at some point) for the linear constraints, and settings... essentially DSL-free.

We can discuss further on Slack to see if an NLP interface based on MOI is possible and whether there are people who could help design it. I am not sure I have access to people with sufficient expertise. (BTW, you had previously mentioned JuliaSmoothOptimizers to me, but I was never able to navigate it to find the interface they were proposing.)

stevengj (Member) commented Mar 6, 2019

Rather than min_objective!(opt, f, autodiff=:forward), I would suggest something like min_objective!(opt, autodiff(f)), where autodiff(f) takes a function f(x) and returns an (x, grad) -> ... function of the form expected by nlopt.

Then you are just adding a new autodiff function and don't need to modify the existing functions at all. In principle, the autodiff function (or whatever it is called) could even be in a separate package, e.g. in NLSolversBase.

longemen3000 commented Sep 2, 2019

Something along these lines? (I use this to add forward-differentiation support in my code.)

using ForwardDiff, DiffResults

function nlopt_form(f, x, g, diffresult_cache)
    if length(g) > 0
        ForwardDiff.gradient!(diffresult_cache, f, x)
        g .= DiffResults.gradient(diffresult_cache)
        return DiffResults.value(diffresult_cache)
    else
        return f(x)
    end
end

Using that as a base:

function autodiff(f)
    res = (x, grad = []) -> begin
        if length(grad) > 0
            diffresult_cache = DiffResults.GradientResult(x)
            ForwardDiff.gradient!(diffresult_cache, f, x)
            grad .= DiffResults.gradient(diffresult_cache)
            return DiffResults.value(diffresult_cache)
        else
            return f(x)
        end
    end
    return res
end

The problem with adding this to NLSolversBase is that the check for the existence of the gradient is different: NLopt checks the length, while Optim checks for nothing.

longemen3000 commented
An improved version, which works with the future NLsolvers and NLopt:

function autodiff(fn)
    function f(x)
        return fn(x)
    end

    function f(∇f, x)
        if ∇f !== nothing && length(∇f) != 0
            ForwardDiff.gradient!(∇f, fn, x)
        end

        fx = f(x)
        return ∇f === nothing ? fx : (fx, ∇f)
    end

    function f(∇²f, ∇f, x)
        if ∇²f !== nothing
            ForwardDiff.hessian!(∇²f, fn, x)
        end

        if ∇²f === nothing
            return f(∇f, x)
        else
            fx, ∇f = f(∇f, x)
            return fx, ∇f, ∇²f
        end
    end

    return f
end

tbeason commented Feb 15, 2022

Just as an update here: I found the above not to be working (perhaps it did when posted). Here is what I found to work:

function autodiff(fn)
    # adapted from https://github.com/JuliaOpt/NLopt.jl/issues/128
    function f(x)
        return fn(x)
    end

    function f(x, ∇f)
        if ∇f !== nothing && length(∇f) != 0
            ForwardDiff.gradient!(∇f, fn, x)
        end

        fx = fn(x)
        return fx
    end

    return f
end
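A minimal end-to-end check of this version (the definition is repeated so the sketch is self-contained; the problem and algorithm are arbitrary illustrative choices):

```julia
using NLopt, ForwardDiff

function autodiff(fn)
    f(x) = fn(x)
    function f(x, ∇f)
        # NLopt passes an empty grad vector when no gradient is needed
        if ∇f !== nothing && length(∇f) != 0
            ForwardDiff.gradient!(∇f, fn, x)
        end
        return fn(x)
    end
    return f
end

opt = Opt(:LD_LBFGS, 2)
opt.min_objective = autodiff(x -> sum(abs2, x .- 2))   # minimum at [2, 2]
minf, minx, ret = optimize(opt, zeros(2))
```

The two-argument method matches NLopt's (x, grad) callback convention, so the returned closure can be assigned directly to the objective.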

tpapp commented Oct 10, 2023

I use this in practice:

import AbstractDifferentiation as AD

function gradient_on_demand(ad_backend, f)
    function(x, ∇fx)
        if isempty(∇fx)
            f(x)
        else
            v, (g,) = AD.value_and_gradient(ad_backend, f, x)
            copy!(∇fx, g)
            v
        end
    end
end

Example:

using NLopt, ForwardDiff
ab = AD.ForwardDiffBackend()
opt = Opt(:LD_SLSQP, 3)
equality_constraint!(opt, gradient_on_demand(ab, x -> sum(x) - 3), 1e-4)
opt.min_objective = gradient_on_demand(ab, x -> sum(abs2, x))
(minf, minx, ret) = optimize(opt, [-1.0, 1.5, 9.9])

Would be happy to make a PR; AbstractDifferentiation is not a heavy dependency (per se). And Enzyme support is WIP.

6 participants