Is this code a supported use? (single-pass value and derivative) #610

Open
gerlero opened this issue Nov 24, 2022 · 6 comments · May be fixed by #678

Comments

@gerlero

gerlero commented Nov 24, 2022

Considering these two methods that compute the value and first derivative of a scalar function in a single pass:

import ForwardDiff
import DiffResults

@inline function value_and_derivative(f, Y::Type, x::Real)
    diffresult = ForwardDiff.derivative!(DiffResults.DiffResult(zero(Y), zero(Y)), f, x)
    return DiffResults.value(diffresult), DiffResults.derivative(diffresult)
end

@inline function value_and_derivative(f, x::Real)
    T = typeof(ForwardDiff.Tag(f, typeof(x)))
    ydual = f(ForwardDiff.Dual{T}(x, one(x)))
    return ForwardDiff.value(T, ydual), ForwardDiff.extract_derivative(T, ydual)
end
  • The first method seems to be the officially recommended way to do this. However, (1) it introduces the DiffResults dependency just for this, and (2) it needlessly requires the caller to specify f's return type ahead of the call. Neither is a dealbreaker, but IMO they add friction to something that looks like it shouldn't have any (intuition says that the value comes for free when computing a derivative with ForwardDiff).

  • The second method uses the Tag and Dual types as well as the extract_derivative function, none of which are listed in the "Differentiation API" section of the docs, so I'm not sure if they're considered part of the stable public API.

Both methods run equally fast (and significantly faster than a naive two-pass implementation), so my question is: does the second method constitute a supported use of ForwardDiff's public API?
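
For reference, the naive two-pass baseline mentioned above could look like the following minimal sketch. The name value_and_derivative_twopass is made up for illustration and is not part of ForwardDiff; the only library call used is the documented ForwardDiff.derivative.

```julia
import ForwardDiff

# Hypothetical baseline (name chosen for illustration only).
# f is evaluated twice: once on the plain input, once on dual numbers.
function value_and_derivative_twopass(f, x::Real)
    y = f(x)                            # first pass: value only
    dy = ForwardDiff.derivative(f, x)   # second pass: derivative via the public API
    return y, dy
end

y, dy = value_and_derivative_twopass(sin, 1.0)
```

The second pass already computes the value internally and then discards it, which is the redundancy the two single-pass methods avoid.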

If such a use isn't supported (and maybe even if it is, since both implementations appear too convoluted for a pretty common use case IMO), I'd like to suggest adding a value_and_derivative function, implemented like the second method, to either this package or one of the other packages in JuliaDiff (I'm willing to write a PR).

Related: #401, #391

EDITS: y -> ydual, ydual.value -> ForwardDiff.value(T, ydual), added another related issue

@devmotion
Member

I can't comment on official API and design questions here, at least not in an official way. However, a quick note on:

> However, (1) it introduces the DiffResults dependency just for this

DiffResults is a dependency of ForwardDiff, so it's a dependency anyway, regardless of whether you load it or not. You might even be able to load ForwardDiff.DiffResults or ForwardDiff.DiffResult directly.

@gerlero
Author

gerlero commented Nov 24, 2022

> DiffResults is a dependency of ForwardDiff, so it's a dependency anyway, regardless of whether you load it or not. You might even be able to load ForwardDiff.DiffResults or ForwardDiff.DiffResult directly.

Thanks for the comment. That means that no extra packages are installed just for this simple use case (which is a good thing!). Unfortunately, it doesn't mean that one can avoid explicitly adding DiffResults as a dependency (unless accessing it through ForwardDiff is part of the public API). Honestly, I would care a lot less about this if I found a better solution to my other point (put another way, if DiffResults were less constraining for this simple case, or if I didn't have to use it at all).

Regarding that other point (i.e., the fact that, when using DiffResults, the return type must be known before the call), I'm thinking that an alternative to my initial suggestion could be to add a ForwardDiff.derivative[!] method that also returns a DiffResult but does not require a DiffResult as an input, for use in this scenario.

EDIT: By that I mean adding a new method like this:

import ForwardDiff
import DiffResults

@inline function ForwardDiff.derivative!(::Nothing, f, x::Real) # Or just value_and_derivative(f, x::Real)
    T = typeof(ForwardDiff.Tag(f, typeof(x)))
    ydual = f(ForwardDiff.Dual{T}(x, one(x)))
    return DiffResults.DiffResult(ForwardDiff.value(T, ydual), ForwardDiff.extract_derivative(T, ydual))
end

The ::Nothing parameter used for dispatch could be replaced with some other type (even some kind of "empty" DiffResults object) for the same purpose.

Note: a quick (and definitely non-scientific) benchmark I did showed that wrapping the value and derivative in a DiffResult object carries an overhead, so I'd still prefer to use a function that just returns a plain tuple. (After running the same benchmark again many times, I no longer see a difference.)

@gerlero
Author

gerlero commented Nov 28, 2022

I've found a couple of implementations of the second method in SciML packages, so there's definitely demand for such a method, as well as some precedent of treating Dual (and related types/functions) as part of ForwardDiff's API.

@devmotion
Member

A small caveat is that SimpleNonlinearSolve was extracted, and to a large extent copied, from NonlinearSolve just a few days ago, so the links are basically a single example of the second method.

@thomvet
Contributor

thomvet commented Dec 18, 2022

I am doing something similar in a private package. I don't think it's official API, so I just have a few tests that will tell me when things break.

I guess whether you are fine with something like this depends on your risk tolerance and application area.

@gerlero
Author

gerlero commented Mar 6, 2023

@thomvet Well, I ended up doing the same. I still hope this type of usage can be included in the public API (or clarified as such if it's already meant to be API) so as to be able to avoid future breakage.
