Allocations in broadcasting ^ #189

Open
johnomotani opened this issue Nov 26, 2021 · 1 comment
johnomotani commented Nov 26, 2021

Broadcasting ^ over a NamedDimsArray allocates, whereas the same operation on a plain Array allocates nothing. Using current master of NamedDims.jl and julia-1.6.4:

julia> using NamedDims, BenchmarkTools

julia> const x = NamedDimsArray{(:z,)}([2.0, 3.0]); const xout = copy(x);

julia> const y = [2.0, 3.0]; const yout = copy(y);

julia> function testpow2(out, a)
           @. out = a ^ 2
           return nothing
       end
testpow2 (generic function with 1 method)

julia> @btime testpow2($xout, $x)
  26.738 ns (2 allocations: 16 bytes)

julia> @btime testpow2($yout, $y)
  10.931 ns (0 allocations: 0 bytes)

The allocations go away if I replace a ^ 2 with a ^ 5.7, so I guess it's related to some special optimized branch of the ^ operator.
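One way to probe that guess is to compare a literal exponent with the same exponent passed at run time. This is only a minimal sketch: the function names `pow_literal!` and `pow_runtime!` are made up for illustration, and it uses plain `Vector`s so that it runs without NamedDims. Swapping in the `NamedDimsArray` from above is the actual experiment, and its outcome is not something this thread establishes.

```julia
# Literal exponent: `a ^ 2` is lowered to Base.literal_pow(^, a, Val(2)).
function pow_literal!(out, a)
    @. out = a ^ 2
    return nothing
end

# Runtime exponent: `p` is an ordinary argument, so plain `^` is called.
function pow_runtime!(out, a, p)
    @. out = a ^ p
    return nothing
end

y = [2.0, 3.0]
out_literal = similar(y)
out_runtime = similar(y)

# Call once first so that compilation is not counted as allocation.
pow_literal!(out_literal, y)
pow_runtime!(out_runtime, y, 2)

println("literal: ", @allocated(pow_literal!(out_literal, y)), " bytes")
println("runtime: ", @allocated(pow_runtime!(out_runtime, y, 2)), " bytes")
```

`@allocated` at global scope can be noisy; `@btime` with `$` interpolation, as used above, is the more reliable measurement.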

mcabbott commented

That's pretty odd. Note that it depends on both arguments:

julia> @btime testpow2($xout, $x)  # NDA -> NDA
  min 17.285 ns, mean 19.045 ns (2 allocations, 16 bytes. GC mean 3.63%)

julia> @btime testpow2($yout, $x)
  min 9.301 ns, mean 11.145 ns (2 allocations, 16 bytes. GC mean 6.03%)

julia> @btime testpow2($xout, $y)
  min 10.511 ns, mean 10.734 ns (0 allocations)
  
julia> @btime testpow2($yout, $y)  # A -> A
  min 3.042 ns, mean 3.141 ns (0 allocations)

And that there's no obvious type instability in the broadcasting:

julia> testpow2(a) = a .^ 2
testpow2 (generic function with 2 methods)

julia> @code_warntype testpow2(x)
MethodInstance for testpow2(::NamedDimsArray{(:z,), Float64, 1, Vector{Float64}})
  from testpow2(a) in Main at REPL[50]:1
Arguments
  #self#::Core.Const(testpow2)
  a::NamedDimsArray{(:z,), Float64, 1, Vector{Float64}}
Body::NamedDimsArray{(:z,), Float64, 1, Vector{Float64}}
1 ─ %1 = Core.apply_type(Base.Val, 2)::Core.Const(Val{2})
│   %2 = (%1)()::Core.Const(Val{2}())
│   %3 = Base.broadcasted(Base.literal_pow, Main.:^, a, %2)::Base.Broadcast.Broadcasted{NamedDims.NamedDimsStyle{Base.Broadcast.DefaultArrayStyle{1}}, Nothing, typeof(Base.literal_pow), Tuple{Base.RefValue{typeof(^)}, NamedDimsArray{(:z,), Float64, 1, Vector{Float64}}, Base.RefValue{Val{2}}}}
│   %4 = Base.materialize(%3)::NamedDimsArray{(:z,), Float64, 1, Vector{Float64}}
└──      return %4

Here Base.literal_pow is what ^ with a literal integer exponent lowers to; for small integer powers it should just call multiplication. Multiplication is also slower on the NamedDimsArray, although it does not allocate:

julia> @btime $xout .= $x .* $x;
  min 21.815 ns, mean 22.020 ns (0 allocations)

julia> @btime $yout .= $y .* $y;
  min 4.416 ns, mean 4.512 ns (0 allocations)
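The lowering to Base.literal_pow described above can be inspected directly. The snippet below is a small sketch for the scalar case only; the variable names `lowered` and `v` are illustrative. It says nothing about where the NamedDims allocations come from.

```julia
# What a literal exponent lowers to (shown for the scalar case).
lowered = Meta.lower(Main, :(x ^ 2))
println(lowered)

# For Float64 and small literal powers, literal_pow is plain multiplication.
v = 3.0
@assert Base.literal_pow(^, v, Val(2)) == v * v
@assert Base.literal_pow(^, v, Val(3)) == v * v * v
```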

@oxinabox oxinabox added the performance (compute) Gotta go fast label Nov 30, 2021
@mcabbott mcabbott changed the title Allocations in ^ Allocations in broadcasting ^ Jan 11, 2022