For a function with Float32 return type, the gradient type returned is Vector{Float64} for the broadcasted function, but Float32 for the non-broadcast version.
```julia
using Zygote

c = Int32[1, 2, 3]

function f3(a)
    b = Float32[1, 2, 3]
    sum(a .* b)
end

f3(c)  # Float32 return type
f3'(c) # Vector{Float64} gradient -> desirable would be Float32
```
Non-broadcast version, which works as expected:

```julia
using Zygote

c = 3f0

function f3(a)
    b = Int64(1)
    sum(a .* b)
end

f3(c)  # Float32 return type
f3'(c) # Float32 gradient
```
jeremiahpslewis changed the title from "Float32 function return type, but Vector{Float64} gradient type (breaks some GPU code due to type promotion to unsupported type)" to "Broadcast + Zygote = Weird Type Instability?" on Aug 8, 2023.
This is not a type instability issue (the code is 100% type stable), but a weird edge case of how promotion rules work in Julia that I wasn't aware of. In brief, the scalar version relies on promote and by extension promote_rule, which stipulate that <floating point type> plus <int type> -> <floating point type>. In contrast, the array version uses the ProjectTo machinery in ChainRulesCore to ensure correct types are maintained for AD. That path calls float(Int32), which returns Float64 for all core (U)Int types!
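The difference between the two paths can be checked directly at the REPL (a minimal sketch of the promotion behaviour described above, using only Base functions):

```julia
# Scalar path: promote_type keeps Float32 when mixed with an integer type.
promote_type(Float32, Int32)  # Float32

# Array/ProjectTo path: float(T) maps every core integer type to Float64.
float(Int32)                  # Float64
float(UInt8)                  # Float64

# Hence a Float32 scalar times an Int stays Float32 in the scalar version,
# while the projection of an Int32 array lands on Float64.
typeof(3f0 * Int32(2))        # Float32
```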
Here are a few ideas for addressing this, ranked in order of difficulty:

1. Integer types technically aren't considered differentiable by many people, so converting your Ints to floats and differentiating wrt those would be less likely to hit any unwritten behaviour there.
2. You could ask over on the ChainRulesCore side whether the promotion rules could be tweaked. I notice there are currently no tests for ProjectTo(::Int32)(::Float32), so it may just be a missed edge case.
3. Zygote could add custom paths to its internal projection machinery (which currently mostly wraps ChainRulesCore's) to handle cases like these. I rank this most difficult because we'd likely be reinventing the wheel in a less well-maintained and non-reusable way.
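The first option can be sketched as follows (a workaround, not a fix in Zygote itself; assumes Zygote is installed):

```julia
using Zygote

function f3(a)
    b = Float32[1, 2, 3]
    sum(a .* b)
end

c = Int32[1, 2, 3]

# Differentiate wrt a Float32 array instead of the Int32 one.
# ProjectTo now projects onto Float32, so the gradient is Vector{Float32}.
grad = f3'(Float32.(c))
eltype(grad)  # Float32
```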
First identified here: JuliaGPU/GPUArrays.jl#484