
Directly call ForwardDiff to compute the hessian #1270

Open · wants to merge 3 commits into base: master
Conversation

@YichengDWu commented Jul 23, 2022

Closes #1264

Zygote.forward_jacobian is not well maintained and the motivation for using it is not clear.
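
For reference, the change is essentially a one-line swap in hessian_dual (the full diff appears in the review below):

# before: Zygote's own forward-over-reverse machinery
hessian_dual(f, x::AbstractArray) = forward_jacobian(x -> gradient(f, x)[1], x)[2]

# after: delegate the outer (forward) pass to ForwardDiff
hessian_dual(f, x::AbstractArray) = ForwardDiff.jacobian(x -> gradient(f, x)[1], x)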

@YichengDWu (Author)

Reverse mode over the Hessian

julia> function f1(x, ps)  # [edit: renamed not to clash]
           hess = Zygote.hessian(x -> sum(x.^3), x)
           return hess * x .+ ps.bias
       end
f1 (generic function with 1 method)

julia> x = rand(3);

julia> ps = (;bias = rand(3));

julia> Zygote.gradient(p -> sum(f1(x,p)), ps)
((bias = Fill(1.0, 3),),)

Forward mode over the Hessian

julia> using ComponentArrays, ForwardDiff

julia> ForwardDiff.gradient(p -> sum(f1(x,p)), ComponentArray(ps))
ComponentVector{Float64}(bias = [1.0, 1.0, 1.0])
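
(Note: ForwardDiff.gradient requires a flat AbstractArray of numbers, which is why ps is wrapped in a ComponentArray here; Zygote.gradient can differentiate with respect to the NamedTuple directly.)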

@YichengDWu changed the title from "Update grad.jl" to "Directly call ForwardDiff to compute the hessian" on Jul 23, 2022
@ToucheSir (Member)

Looks like it wasn't a complicated fix after all! Could you add one of the examples from #1264 or elsewhere as a regression test?
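
Something along these lines, say (just a sketch using the example above; the testset name and exact placement are illustrative):

using Test, Zygote

@testset "hessian inside gradient (#1264)" begin
    f1(x, ps) = Zygote.hessian(x -> sum(x.^3), x) * x .+ ps.bias
    x, ps = rand(3), (; bias = rand(3))
    g = Zygote.gradient(p -> sum(f1(x, p)), ps)[1]
    @test g.bias ≈ ones(3)  # errors on master per #1264; passes with this PR
end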

@YichengDWu (Author) commented Jul 23, 2022

using ForwardDiff, Zygote, BenchmarkTools
f(x,W) = sum((W.^3)*x)

x = randn(30);
W = randn(128,30);

@benchmark Zygote.hessian(W->f($x,W), $W)
BenchmarkTools.Trial: 13 samples with 1 evaluation.
 Range (min … max):  210.955 ms … 555.882 ms  ┊ GC (min … max):  6.64% … 41.61%
 Time  (median):     312.789 ms               ┊ GC (median):    13.30%
 Time  (mean ± σ):   386.835 ms ± 135.648 ms  ┊ GC (mean ± σ):  30.41% ± 16.92%

  ▁       ▁█  ▁▁    ▁                             ▁    ▁█     █  
  █▁▁▁▁▁▁▁██▁▁██▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁██▁▁▁▁▁█ ▁
  211 ms           Histogram: frequency by time          556 ms <

 Memory estimate: 851.09 MiB, allocs estimate: 8324.

hessian_dual(f, x::AbstractArray) = ForwardDiff.jacobian(x -> gradient(f, x)[1], x)

@benchmark hessian_dual(W->f($x,W),$W)
BenchmarkTools.Trial: 34 samples with 1 evaluation.
 Range (min … max):  107.892 ms … 426.180 ms  ┊ GC (min … max): 11.03% … 51.18%
 Time  (median):     133.778 ms               ┊ GC (median):    12.17%
 Time  (mean ± σ):   149.663 ms ±  62.519 ms  ┊ GC (mean ± σ):  19.00% ± 10.50%

     ▆██▄                                                        
  ▄▄▆████▄▁▄▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
  108 ms           Histogram: frequency by time          426 ms <

 Memory estimate: 607.80 MiB, allocs estimate: 9604.

@YichengDWu (Author)

A substantial speedup!

@DhairyaLGandhi (Member) left a comment


LGTM with the regression tests. The best fix would be to identify where the regression came from and address that.

@ToucheSir (Member)

Sorry, I meant to add a test which fails on the current implementation but works on this PR. IIRC you had a couple, @MilkshakeForReal.

@DhairyaLGandhi (Member) commented Jul 24, 2022 via email

@mcabbott added the "second order" label (zygote over zygote, or otherwise) on Jul 25, 2022
@YichengDWu (Author) commented Jul 26, 2022

Related to #1070

edit: don't close it, since hessian_inverse is also an issue there.

julia> using CUDA

julia> CUDA.allowscalar(false)

julia> hessian(x -> sum(tanh.(x)), cu([1,2,3.4]))
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:33
  [2] assertscalar(op::String)
    @ GPUArraysCore C:\Users\Luffy\.julia\packages\GPUArraysCore\rSIl2\src\GPUArraysCore.jl:78
  [3] getindex
    @ C:\Users\Luffy\.julia\packages\GPUArrays\gok9K\src\host\indexing.jl:9 [inlined]
  [4] extract(xs::CuArray{ForwardDiff.Dual{Nothing, Float32, 3}, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:23
  [5] forward_jacobian(f::Zygote.var"#100#101"{var"#27#28"}, x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, #unused#::Val{3})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:29
  [6] forward_jacobian(f::Function, x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}; chunk_threshold::Int64)
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:44
  [7] forward_jacobian
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:43 [inlined]
  [8] hessian_dual
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\grad.jl:76 [inlined]
  [9] hessian(f::Function, x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\grad.jl:74
 [10] top-level scope
    @ REPL[135]:1
 [11] top-level scope
    @ C:\Users\Luffy\.julia\packages\CUDA\DfvRa\src\initialization.jl:52

julia> hessian_dual(x -> sum(tanh.(x)), cu([1,2,3.4]))
3×3 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
 -0.6397   0.0        0.0
  0.0     -0.136219   0.0
  0.0      0.0       -0.00887072

@YichengDWu (Author)

I've been digging through the issues and haven't found a single case where forward_jacobian works but ForwardDiff.jacobian doesn't; it's always the other way around.

@YichengDWu (Author)

Issue #305 is closed, but I'm still getting an error:

using Zygote
f(x) = [cos(x[1]) * sin(x[2]), sin(x[1]) * cos(x[2])]
jac(x) = last(Zygote.forward_jacobian(f, x))
g(x) = sum(jac(x))
x = [π/4, π/3]
Zygote.gradient(g, x)

julia> Zygote.gradient(g, x)
ERROR: Mutating arrays is not supported -- called setindex!(Matrix{Float64}, ...)
This error occurs when you ask Zygote to differentiate operations that change
the elements of arrays in place (e.g. setting values with x .= ...)

Possible fixes:
- avoid mutating operations (preferred)
- or read the documentation and solutions for this error
  https://fluxml.ai/Zygote.jl/dev/limitations.html#Array-mutation-1

Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:33
  [2] _throw_mutation_error(f::Function, args::Matrix{Float64})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\array.jl:70
  [3] (::Zygote.var"#444#445"{Matrix{Float64}})(#unused#::Nothing)
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\array.jl:82
  [4] (::Zygote.var"#2496#back#446"{Zygote.var"#444#445"{Matrix{Float64}}})(Δ::Nothing)
    @ Zygote C:\Users\Luffy\.julia\packages\ZygoteRules\AIbCs\src\adjoint.jl:67
  [5] Pullback
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:31 [inlined]
  [6] (::typeof((forward_jacobian)))(Δ::Tuple{Nothing, FillArrays.Fill{Float64, 2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface2.jl:0
  [7] Pullback
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:44 [inlined]
  [8] Pullback
    @ C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:43 [inlined]
  [9] (::typeof((forward_jacobian)))(Δ::Tuple{Nothing, FillArrays.Fill{Float64, 2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface2.jl:0
 [10] Pullback
    @ .\REPL[190]:1 [inlined]
 [11] (::typeof((jac)))(Δ::FillArrays.Fill{Float64, 2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface2.jl:0
 [12] Pullback
    @ .\REPL[191]:1 [inlined]
 [13] (::Zygote.var"#60#61"{typeof((g))})(Δ::Float64)
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface.jl:41
 [14] gradient(f::Function, args::Vector{Float64})
    @ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\compiler\interface.jl:76
 [15] top-level scope
    @ REPL[193]:1
 [16] top-level scope
    @ C:\Users\Luffy\.julia\packages\CUDA\DfvRa\src\initialization.jl:52

Now try ForwardDiff.jacobian:

julia> jac(x) = last(ForwardDiff.jacobian(f, x))
jac (generic function with 1 method)

julia> Zygote.gradient(g, x)
([-0.6123724356957946, -0.3535533905932738],)
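
Presumably this is because forward_jacobian builds its output with setindex! (frame [5] at forward.jl:31 in the trace above), which Zygote cannot differentiate through, whereas ForwardDiff.jacobian works here thanks to its dedicated gradient rule in Zygote.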

@@ -73,8 +73,7 @@ julia> hessian(sin, pi/2)
 """
 hessian(f, x) = hessian_dual(f, x)

-hessian_dual(f, x::AbstractArray) = forward_jacobian(x -> gradient(f, x)[1], x)[2]
+hessian_dual(f, x::AbstractArray) = ForwardDiff.jacobian(x -> gradient(f, x)[1], x)
(Member)

The example you test relies on the gradient rule for ForwardDiff.jacobian, which assumes its f is pure. This is true here.

It would be worth looking at some examples where this isn't true, i.e. where f closes over some parameters (either arrays or numbers). In that case their gradients are not traced. I believe you should get a warning from ForwardDiff.jacobian's rule.

Are there any further surprises? Anything which could be tested?
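
For instance, an illustrative sketch (not something tested in this PR) of what I mean by a closure:

julia> a = [1.0, 2.0];

julia> Zygote.gradient(a -> sum(ForwardDiff.jacobian(x -> a .* x, ones(2))), a)

The Jacobian here is Diagonal(a), so the true gradient is [1.0, 1.0]; but since the rule treats the closure as constant, I would expect the gradient for a to come back as nothing, ideally with a warning.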

(Author)

Did you mean NN layers from Flux? I'll dig in a bit tomorrow.

The scalar indexing example in the comment above can be added to the tests as well.
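
Something like this, for example (a sketch only; the exact check is illustrative, using the values from the GPU example above):

using Test, CUDA, LinearAlgebra, Zygote
CUDA.allowscalar(false)

x = Float32[1, 2, 3.4]
H = Zygote.hessian(v -> sum(tanh.(v)), cu(x))  # scalar-indexes on master
@test H isa CuArray
# d²/dx² tanh(x) = -2*tanh(x)*sech(x)^2, so the Hessian is diagonal
@test Array(H) ≈ Diagonal(@. -2 * tanh(x) * sech(x)^2)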

@vpuri3 commented Oct 25, 2023

Is this still active?

@vpuri3 commented Oct 25, 2023

@YichengDWu did this solve the GPU issue?

@YichengDWu (Author)

@vpuri3 I'm no longer active in the Julia community. However, the "fix" in this PR is simple; you can easily test it out yourself.

@vpuri3 commented Oct 26, 2023

Thanks for letting me know @YichengDWu. The change works with Lux.jl, but I'm still getting scalar indexing errors on the GPU.

@vpuri3 commented Oct 26, 2023

With the change in this PR, this code works:

using Random
using Lux, CUDA, LuxCUDA, ComponentArrays
using Zygote, ForwardDiff

CUDA.allowscalar(false)

#==========================#
function testhessian(
    NN::Lux.AbstractExplicitLayer,
    data::Tuple;
    device = cpu_device(),
)
    p, st = Lux.setup(Random.default_rng(), NN)

    st = Lux.testmode(st)
    p = ComponentArray(p)

    xdata, ydata = data |> device
    p, st = (p, st)     |> device

    function loss(optx)
        ypred, _ = NN(xdata, optx, st)

        sum(abs2, ydata - ypred)
    end

    g(p) = Zygote.gradient(loss, p)[1]
    H(p) = ForwardDiff.jacobian(g, p)

    Zygote.hessian(loss, p)
end
#==========================#
NN = Chain(Dense(1, 3), Dense(3, 1))

data = ntuple(_ -> rand(1, 10), 2)
device = Lux.gpu_device()

H = testhessian(NN, data; device)
julia> include("hess.jl")
10×10 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
  0.236781  -0.075257    -1.20583    0.31846   -0.101217    -1.62179   -0.713834    0.503548  -1.14138     1.98508
 -0.075257   0.0239192    0.383253  -0.101217   0.0321702    0.515458   0.0296168  -0.780695   0.362769   -0.630924
 -1.20583    0.383253     6.1408    -1.62179    0.515458     8.2591     0.474545   -2.56436    5.19194   -10.1092
  0.318461  -0.101217    -1.62179    0.514738  -0.163601    -2.62135   -2.09317     0.677249  -1.53511     3.20854
 -0.101217   0.0321702    0.515458  -0.163601   0.0519977    0.833151   0.0398333  -2.18309    0.487909   -1.01978
 -1.62179    0.515458     8.2591    -2.62135    0.833151    13.3494     0.638242   -3.44895    5.84984   -16.3398
 -0.713834   0.0296168    0.474545  -2.09317    0.0398333    0.638242   0.0366717  -0.198167   0.449183   -0.781213
  0.503548  -0.780695    -2.56436    0.677249  -2.18309     -3.44895   -0.198167    1.07086   -2.4273      4.22154
 -1.14138    0.362769     5.19194   -1.53511    0.487909     5.84984    0.449183   -2.4273     5.50193    -9.56889
  1.98508   -0.630924   -10.1092     3.20854   -1.01978    -16.3398    -0.781213    4.22154   -9.56889    20.0
(hess) pkg> st
Status `~/.julia/dev/GeometryLearning.jl/hess/Project.toml`
  [052768ef] CUDA v5.0.0
  [b0b7db55] ComponentArrays v0.15.4
  [f6369f11] ForwardDiff v0.10.36
  [b2108857] Lux v0.5.8
  [d0bbae9a] LuxCUDA v0.3.1
  [e88e6eb3] Zygote v0.6.67 `~/.julia/dev/Zygote`

@vpuri3 commented Oct 26, 2023

But Lux.Embedding is failing. This looks like an NNlib issue with ForwardDiff.Dual:

using Random
using Lux, CUDA, LuxCUDA, ComponentArrays
using Zygote, ForwardDiff

CUDA.allowscalar(false)

#==========================#
function testhessian(
    NN::Lux.AbstractExplicitLayer,
    data::Tuple;
    device = cpu_device(),
)
    p, st = Lux.setup(Random.default_rng(), NN)

    st = Lux.testmode(st)
    p = ComponentArray(p)

    xdata, ydata = data |> device
    p, st = (p, st)     |> device

    function loss(optx)
        ypred, _ = NN(xdata, optx, st)

        sum(abs2, ydata - ypred)
    end

    g(p) = Zygote.gradient(loss, p)[1]
    H(p) = ForwardDiff.jacobian(g, p)

    Zygote.hessian(loss, p)
end
#==========================#

E, K = 1, 10
NN = Chain(Embedding(E => 3), Dense(3, 1))
data = (ones(Int32, K), rand(1, K))

H = testhessian(NN, data; device)
julia> include("hess.jl")                                                                                                                                               
ERROR: LoadError: InvalidIRError: compiling MethodInstance for NNlibCUDAExt.scatter_kernel!(::typeof(+), ::CuDeviceMatrix{…}, ::CuDeviceMatrix{…}, ::CuDeviceVector{…}, ::Int64, ::Int64, ::Tuple{…}) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to julia.new_gc_frame)                                                                                            
Reason: unsupported call to an unknown function (call to julia.push_gc_frame)                                                                                           
Reason: unsupported call to an unknown function (call to julia.pop_gc_frame)                                                                                            
Reason: unsupported call to an unknown function (call to julia.get_gc_frame_slot)                                                                                       
Reason: unsupported dynamic function invocation (call to atomic_cas!)                                                                                                   
Stacktrace:                                                                                                                                                             
 [1] atomic_op!                                                                                                                                                         
   @ ~/.julia/packages/CUDA/nbRJk/src/device/intrinsics/atomics.jl:228                                                                                                  
 [2] atomic_arrayset                                                                                                                                                    
   @ ~/.julia/packages/CUDA/nbRJk/src/device/intrinsics/atomics.jl:468                                                                                                  
 [3] atomic_arrayset                                                                                                                                                    
   @ ~/.julia/packages/CUDA/nbRJk/src/device/intrinsics/atomics.jl:440                                                                                                  
 [4] macro expansion                                                                                                                                                    
   @ ~/.julia/packages/CUDA/nbRJk/src/device/intrinsics/atomics.jl:435                                                                                                  
 [5] scatter_kernel!                                                                                                                                                    
   @ ~/.julia/packages/NNlib/5iRSB/ext/NNlibCUDAExt/scatter.jl:28                                                                                                       
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl                                      
Stacktrace:                                                                         
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/validation.jl:147                                                                                             
  [2] macro expansion                                                               
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/driver.jl:440 [inlined]                                                                                       
  [3] macro expansion                                                                                                                                                   
    @ GPUCompiler ~/.julia/packages/TimerOutputs/RsWnF/src/TimerOutput.jl:253 [inlined]                                                                                 
  [4] macro expansion                                                               
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/driver.jl:439 [inlined]
  [5] emit_llvm(job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/utils.jl:92                                                                                                   
  [6] emit_llvm                                                                                                                                                         
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/utils.jl:86 [inlined]                                                                                         
  [7]
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/driver.jl:129
  [8] codegen
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/driver.jl:110 [inlined]
  [9] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/driver.jl:106
 [10] compile
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/driver.jl:98 [inlined]
 [11] #1042
    @ GPUCompiler ~/.julia/packages/CUDA/nbRJk/src/compiler/compilation.jl:166 [inlined]
 [12] JuliaContext(f::CUDA.var"#1042#1045"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/driver.jl:47
 [13] compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/compilation.jl:165
 [14] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/execution.jl:125
 [15] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2mJjc/src/execution.jl:103
 [16] macro expansion
    @ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:323 [inlined]
 [17] macro expansion
    @ CUDA ./lock.jl:267 [inlined]
 [18] cufunction(f::typeof(NNlibCUDAExt.scatter_kernel!), tt::Type{Tuple{…}}; kwargs::@Kwargs{})
    @ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:318
 [19] cufunction
    @ NNlibCUDAExt ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:315 [inlined]
 [20] macro expansion
    @ NNlibCUDAExt ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:104 [inlined]
 [21] scatter!(op::typeof(+), dst::CuArray{…}, src::CuArray{…}, idx::CuArray{…})
    @ NNlibCUDAExt ~/.julia/packages/NNlib/5iRSB/ext/NNlibCUDAExt/scatter.jl:58
 [22] ∇gather_src
    @ Zygote ~/.julia/packages/NNlib/5iRSB/src/gather.jl:131 [inlined]
 [23] gather!_pullback
    @ Zygote ~/.julia/packages/NNlib/5iRSB/src/gather.jl:136 [inlined]
 [24] ZBack
    @ Zygote ~/.julia/dev/Zygote/src/compiler/chainrules.jl:211 [inlined]
 [25] gather
    @ Zygote ~/.julia/packages/NNlib/5iRSB/src/gather.jl:46 [inlined]
 [26] Embedding
    @ Zygote ~/.julia/packages/Lux/Al3Ab/src/layers/basic.jl:490 [inlined]
 [27] apply
    @ Zygote ~/.julia/packages/LuxCore/aumFq/src/LuxCore.jl:115 [inlined]
 [28] macro expansion
    @ Zygote ~/.julia/packages/Lux/Al3Ab/src/layers/containers.jl:0 [inlined]
 [29] applychain
    @ Zygote ~/.julia/packages/Lux/Al3Ab/src/layers/containers.jl:480 [inlined]
 [30] (::Zygote.Pullback{Tuple{…}, Tuple{…}})(Δ::Tuple{CuArray{…}, Nothing})
    @ Zygote ~/.julia/dev/Zygote/src/compiler/interface2.jl:0
 [31] Chain
    @ Zygote ~/.julia/packages/Lux/Al3Ab/src/layers/containers.jl:478 [inlined]
 [32] (::Zygote.Pullback{Tuple{…}, Tuple{…}})(Δ::Tuple{CuArray{…}, Nothing})
    @ Zygote ~/.julia/dev/Zygote/src/compiler/interface2.jl:0
 [33] loss
    @ Zygote ~/.julia/dev/GeometryLearning.jl/hess/hess.jl:23 [inlined]
 [34] (::Zygote.Pullback{Tuple{var"#loss#34"{…}, ComponentVector{…}}, Any})(Δ::ForwardDiff.Dual{ForwardDiff.Tag{Zygote.var"#125#126"{…}, Float32}, Float32, 7})
    @ Zygote ~/.julia/dev/Zygote/src/compiler/interface2.jl:0

@vpuri3 commented Oct 26, 2023

I narrowed the error in the above case to the MWE in FluxML/NNlib.jl#547. That doesn't seem related to this PR.

Labels: second order (zygote over zygote, or otherwise)
Successfully merging this pull request may close: nested gradient with hessian
6 participants