Micro optimization for offset array #222

Moelf · 2023-02-26T16:03:13Z

Often, when we go from 0-based to 1-based indexing and use ArraysOfArrays, we end up with patterns like this:

UnROOT.jl/src/RNTuple/fieldcolumn_reading.jl

Lines 107 to 115 in dfce0e4

    
    function read_field(io, field::VectorField{O, T}, page_list) where {O, T}
        offset = read_field(io, field.offset_col, page_list)
        content = read_field(io, field.content_col, page_list)
        o = one(eltype(offset))
        jloffset = pushfirst!(offset .+ o, o) #change to 1-indexed, and add a 1 at the beginning
        res = VectorOfVectors(content, jloffset, ArraysOfArrays.no_consistency_checks)
        return res::_field_output_type(field)
    end
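To make the conversion concrete, here is a minimal sketch using only Base Julia. The data is made up, and it assumes the offset column holds cumulative 0-based end offsets, which is how I read the snippet above; it does not call into UnROOT or ArraysOfArrays.

```julia
# Hypothetical data: cumulative 0-based end offsets for three sub-vectors
offset  = Int32[2, 5, 6]
content = [10, 20, 30, 40, 50, 60]

o = one(eltype(offset))
# Shift to 1-based and prepend the start of the first sub-vector
jloffset = pushfirst!(offset .+ o, o)

# Sub-vector i spans content[jloffset[i] : jloffset[i+1] - 1]
parts = [content[jloffset[i]:(jloffset[i + 1] - 1)] for i in 1:(length(jloffset) - 1)]

println(jloffset)  # Int32[1, 3, 6, 7]
println(parts)     # [[10, 20], [30, 40, 50], [60]]
```

The extra leading element is what turns N end offsets into the N+1 boundaries that a pointer-style layout needs.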

I can think of three ways of doing this, assuming we make a copy (another option is to pass offset::Base.ReinterpretArray directly into VectorOfVectors, but I don't know if that's a good idea):

julia> function f(ary, o)
           return pushfirst!(ary .+ o, o)
       end

julia> function g(ary, o)
           return append!([o], ary .+= o)
       end

julia> function h(ary, o)
           res = append!([o], ary)
           @views res[2:end] .+= o
           return res
       end

julia> @benchmark f(ary,o) setup = begin ary = reinterpret(Int32, rand(UInt8, 10^5*4)); o = Int32(1) end
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  37.401 μs … 672.151 μs  ┊ GC (min … max): 0.00% … 88.17%
 Time  (median):     49.064 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   53.250 μs ±  43.186 μs  ┊ GC (mean ± σ):  6.11% ±  6.97%

    ▃▅▅▆█▇▅▃▃▂▂▂▁                                              ▂
  ▃▄██████████████▇▇▆▅▅▄▃▃▃▄▃▁▁▄▁▁▁▃▁▃▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▃▅▄▃▃▄▆▆ █
  37.4 μs       Histogram: log(frequency) by time       141 μs <

 Memory estimate: 695.55 KiB, allocs estimate: 3.

julia> @benchmark g(ary,o) setup = begin ary = reinterpret(Int32, rand(UInt8, 10^5*4)); o = Int32(1) end
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  55.837 μs … 692.319 μs  ┊ GC (min … max): 0.00% … 75.43%
 Time  (median):     62.729 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   65.075 μs ±  28.550 μs  ┊ GC (mean ± σ):  2.06% ±  4.30%

          ▁▃▄▅▆▇███▇▆▄▂▂▂▂▂▁▂▁▁▁ ▁▁▁▂▁                         ▃
  ▃▁▁▃▁▃▅▇███████████████████████████████▇▇▇▆▇▆▆▅▇▆▇▆▇▆▅▅▆▅▄▅▆ █
  55.8 μs       Histogram: log(frequency) by time      81.9 μs <

 Memory estimate: 390.89 KiB, allocs estimate: 6.

julia> @benchmark h(ary,o) setup = begin ary = reinterpret(Int32, rand(UInt8, 10^5*4)); o = Int32(1) end
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  27.733 μs … 140.297 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     34.085 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   35.711 μs ±   6.306 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

             ▃▇█▅▁
  ▂▂▁▂▂▂▃▃▄▅██████▆▅▅▄▄▄▄▄▄▃▃▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  27.7 μs         Histogram: frequency by time         54.6 μs <

 Memory estimate: 390.81 KiB, allocs estimate: 2.
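As a sanity check alongside the timings, the sketch below confirms that the three variants return the same result on a small, made-up input, and shows that g differs in one respect: its `.+=` modifies the input array in place. The variable names `src`, `a`, `b`, `c` are just for illustration.

```julia
f(ary, o) = pushfirst!(ary .+ o, o)      # allocate shifted copy, then prepend
g(ary, o) = append!([o], ary .+= o)      # shift the input in place, then append it
function h(ary, o)
    res = append!([o], ary)              # copy first
    @views res[2:end] .+= o              # shift the copy in place
    return res
end

o   = Int32(1)
src = Int32[2, 5, 6]                     # hypothetical offsets

a = copy(src); rf = f(a, o)
b = copy(src); rg = g(b, o)
c = copy(src); rh = h(c, o)

println(rf == rg == rh)   # all three agree on the output
println(a == src)         # f leaves its input untouched
println(b == src .+ o)    # g has shifted its input
println(c == src)         # h leaves its input untouched
```

In the benchmarks above the `setup` block rebuilds `ary` for every sample and each sample runs a single evaluation, so g's mutation does not accumulate there, but it would matter if the offsets were reused after the call.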

Moelf · 2023-02-26T16:08:20Z

To be honest, I'm not sure why h is faster than g.

Edit: probably because of this:

in-place broadcast (e.g. .+=) significantly slower for reinterpreted array JuliaLang/julia#48801
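A rough way to isolate that effect, as a sketch only: compare broadcasting in place on the `reinterpret` wrapper (what g does) with copying into a plain `Vector` first and broadcasting on that (roughly what h does). The helper names `inplace!` and `copy_then_add` are made up for this example, and the `@elapsed` numbers will vary by machine and Julia version, so no timings are claimed here.

```julia
raw = rand(UInt8, 4 * 10^5)
ary = reinterpret(Int32, raw)   # a lazy wrapper over raw, not a Vector{Int32}
o   = Int32(1)

println(ary isa Base.ReinterpretArray)

# Broadcast in place directly on the reinterpreted wrapper
inplace!(a, o) = (a .+= o; a)

# Materialize a plain Vector first, then broadcast on the copy
copy_then_add(a, o) = (v = Vector{eltype(a)}(a); v .+= o; v)

expected = copy_then_add(ary, o)     # does not touch ary
inplace!(ary, o)                     # shifts ary (and therefore raw) in place
println(collect(ary) == expected)    # both paths compute the same values

# Crude timing after the warm-up calls above; use BenchmarkTools for real measurements
t_wrapper = @elapsed inplace!(ary, o)
t_copy    = @elapsed copy_then_add(ary, o)
println((t_wrapper, t_copy))
```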
