
Show mean time in @btime #258

Open · mcabbott wants to merge 7 commits into master

Conversation

@mcabbott commented Sep 18, 2021

There is room to show a bit more information with @btime, and I think the next number to show after the minimum is the mean. So this PR does this:

julia> @btime sum(log, $(rand(1000)));
  min 3.693 μs, mean 3.721 μs (0 allocations)

julia> @btime sum(log, rand(1000));
  min 4.429 μs, mean 5.242 μs (1 allocation, 7.94 KiB. GC mean 282 ns, 5.38%)  # initial version
  min 4.506 μs, mean 5.478 μs (1 allocation, 7.94 KiB)                         # minimal version

Before:

julia> @btime sum(log, $(rand(1000)));
  5.285 μs (0 allocations: 0 bytes)

Notice that it also doesn't bother writing "0 bytes" if there are no allocations, in the name of reducing clutter. I think that's safe, but if anyone has a counter-example, speak up.

I'm less sure this is a good idea, but it also prints the mean GC time, which I hope is the most useful one. (Often the fastest run will not trigger GC at all.) I think it's less confusing to show the time than just the percentage -- I hope it is clear that this is the mean of the GC times, shown as a percentage of the mean time, not the mean of the per-run GC percentages.
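
For anyone curious how the pieces fit together, here is a rough sketch of the proposed formatting logic. This is not the PR's actual code: summarize is a hypothetical helper, built on the existing BenchmarkTools internals prettytime and prettymemory.

using BenchmarkTools, Statistics

# Hypothetical helper mirroring the proposed @btime line; not the PR's code.
function summarize(t::BenchmarkTools.Trial)
    tmin, tmean = minimum(t), mean(t)      # TrialEstimates for min and mean
    str = string("min ", BenchmarkTools.prettytime(time(tmin)),
                 ", mean ", BenchmarkTools.prettytime(time(tmean)))
    a = allocs(tmin)
    if a == 0
        str *= " (0 allocations)"          # no "0 bytes", to reduce clutter
    else
        str *= string(" (", a, a == 1 ? " allocation, " : " allocations, ",
                      BenchmarkTools.prettymemory(memory(tmin)))
        if gctime(tmean) > 0               # GC as mean time plus share of the mean time
            str *= string(". GC mean ", BenchmarkTools.prettytime(gctime(tmean)), ", ",
                          round(100 * gctime(tmean) / time(tmean); digits=2), "%")
        end
        str *= ")"
    end
    println("  ", str)
end

t = @benchmark sum(log, x) setup=(x = rand(1000))
summarize(t)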

Xref #217 for ongoing discussion about @benchmark's histograms.

Edit: for comparison, here's Base.@time's minimal and maximal display:

julia> let x = rand(1000)
         @time sum(log, x)
       end;
  0.000012 seconds
 
julia> f(n) = prod(Any[sum(randn(1000,1000)) for _ in 1:n]);

julia> @time f(10)
  0.089296 seconds (990 allocations: 76.352 MiB, 70.85% gc time, 1.80% compilation time)
1.357876781727917e26

This also omits allocations entirely when they are zero, and prints GC time + compilation time when these are nonzero. It's longer than the longest @btime above. I suppose this is also an argument for showing GC time as a percentage, which could save some space.

Edit': perhaps this wants a screenshot. A variant with bold times & only GC percentage:

[Screenshot 2021-10-02: @btime output variant with bold times and only the GC percentage]

Notice that the minimum times are in the wrong order, compared to the mean. And also that the GC percentages greatly understate the average cost of allocations.

@MasonProtter (Contributor) commented Sep 18, 2021

One thing I really like about this format is that most users don’t realize when they use @btime that it’s reporting a minimum. Most assume it’s an average or median time.

@mcabbott (Author) commented Sep 18, 2021

It could also say min 1 allocation, 7.94 KiB since that's what it is, but maybe it's usually stable enough for this not to matter.

With everything now, the line is usually about 82 characters. Do we care about getting it under 80? It's not super-confusing if line-wrapped. (The zero-allocation case is more like 45 characters, perfect for Slack...)

src/execution.jl (Outdated)
Comment on lines 574 to 587
$BenchmarkTools.prettymemory($BenchmarkTools.memory($trialmin)), ")")
$print(" min ", $BenchmarkTools.prettytime($BenchmarkTools.time($trialmin)),
@mcabbott (Author) commented:

Is there a reason to interpolate $BenchmarkTools.prettymemory not just $prettymemory? I see I left some and later dropped them.

Comment on lines -561 to +581
- trialmin, trialallocs = gensym(), gensym()
+ trialmin, trialmean, trialallocs = gensym(), gensym(), gensym()
@mcabbott (Author) commented:

Is there a reason not to write @gensym trialmin trialallocs etc. here?
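
For reference, the two spellings are equivalent up to the generated symbol names (a minimal illustration):

# Base.@gensym expands to one gensym call per name:
trialmin, trialmean, trialallocs = gensym(), gensym(), gensym()

@gensym trialmin trialmean trialallocs   # i.e. trialmin = gensym("trialmin"), etc.
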

@KristofferC (Contributor) commented:

First they came for @benchmark...

I think there is great value in having @btime show only the single number which is most often useful, namely the minimum. For more data, there is @benchmark.

@mcabbott (Author) commented:

This is a fair point. And there's something to be said for matching @time (although that now adds compilation time, too). I'd also like to cut down @benchmark a bit, to make the step up smaller than it is now.

Can I persuade anyone to try out this branch in real life for a bit, to see if they like or hate it in actual use?

@KristofferC (Contributor) commented:

The issue is that, for the cases where you would use @btime over @benchmark, the mean is a worse statistic than the minimum. If you want to show extra stuff, perhaps the mode could be shown, but the more you measure, the more the mode will approach the minimum anyway.

@mcabbott (Author) commented Oct 3, 2021

Isn't the mean what's relevant for throughput? I guess this PR takes the position that it's the most useful second number. For example, with the allocating sum(x .* y) above, the GC time seems to be an underestimate of the penalty: 16%, while almost 60% of the mean time is absent from the minimum. This seems relevant for micro-benchmarking a function you hope to use in a loop.

The second number @time prints is GC percentage, from just one run of course. I would argue that showing the percentage of the mean without showing the mean is weird (and usually GC does not run at all on the min run). And that the mean time is more useful anyway -- it's easier to have all numbers referring to the total time. If we want to keep the length shorter, deleting this would help.

The mode seems tricky, on what are essentially real numbers. Should they be binned first, or can we trust the clock to do that well enough?

julia> b2 = @benchmark sum(log.(rand(10^4)));
    
julia> length(b2.times)
10000

julia> diff(sort(unique(b2.times)))'
1×766 adjoint(::Vector{Float64}) with eltype Float64:
 41.0  84.0  41.0  1.0  41.0  42.0  41.0  1.0  41.0  1.0  41.0  41.0  1.0  …  174917.0  63750.0  17416.0  3417.0  709.0  499.0  42834.0  300417.0

vs another machine

julia> diff(sort(unique(b2.times)))'
1×4452 adjoint(::Vector{Float64}) with eltype Float64:
 268.0  108.0  15.0  14.0  4.0  59.0  17.0  7.0  4.0  3.0  28.0  17.0  …  3656.0  68996.0  33322.0  4102.0  6553.0  18376.0  2.22808e6  562661.0
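
If binning is what it takes, something like this sketch could serve (binned_mode is a hypothetical helper, and the 10 ns bin width is an arbitrary choice):

# Hypothetical binned mode of trial times (in nanoseconds).
function binned_mode(times; width = 10.0)
    counts = Dict{Float64,Int}()
    for t in times
        edge = width * floor(t / width)        # left edge of the bin containing t
        counts[edge] = get(counts, edge, 0) + 1
    end
    best_edge, best_count = 0.0, 0
    for (edge, c) in counts                    # most populated bin wins
        if c > best_count
            best_edge, best_count = edge, c
        end
    end
    return best_edge + width / 2               # midpoint of that bin
end

binned_mode(b2.times)   # b2 from the @benchmark call above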

This and the median might be contenders for the best single number. But, as you say, these will often approach the minimum... which @btime already prints, and which is the easiest to explain, so this PR doesn't propose to change it.

@stevengj (Contributor) commented Oct 20, 2021

Isn't the mean what's relevant for throughput?

The problem is that it's noisy and unreliable; if something else happens to run on your system that temporarily interrupts the julia process, you can get a huge spike in time that will throw off the mean completely.

Basically, what source of timing variance are you interested in averaging? In many cases the variance is due to things outside of Julia, which makes it of questionable utility in benchmarking the performance of Julia code per se. I agree that there are some cases where timing variance arises from Julia itself— for example, due to garbage collection, or perhaps because your algorithm itself is stochastic. In those cases, however, you need to pay much closer attention to statistics, and @benchmark is probably more useful.

@mcabbott (Author) commented:

Nice, somehow I never looked at the paper.

I guess garbage collection is the major thing I had in mind, since it's very common to have different implementations with different allocations. The reported GC percentage is quite a large underestimate of how much time this is going to cost. This is a rare event, so something like mean(5-to-95th) will miss it.

Agree this measure will be noisier than min, but the claim here is only that it's a sufficiently useful second number to be worth printing. If it's huge, that's a warning to investigate more closely.

Is there another number which might better serve this purpose?
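
To make the 5th-to-95th point concrete, here is one way such a trimmed mean could be computed (a sketch; b is assumed to be the Trial from an allocating @benchmark call):

using Statistics

ts = b.times                                    # per-sample times in ns from @benchmark
lo, hi = quantile(ts, 0.05), quantile(ts, 0.95)
trimmed = mean(t for t in ts if lo <= t <= hi)  # drops the rare GC-heavy samples
minimum(ts), trimmed, mean(ts)                  # trimmed tends to sit near the minimum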

@KristofferC (Contributor) commented:

Ref #37 (comment). @btime is perfect for most use cases where you would want to use BenchmarkTools and I am kindly asking not to change it. Feel free to add a @bmean or modify @benchmark or whatever, just leave @btime alone, please.

@stevengj (Contributor) commented Oct 20, 2021

I guess garbage collection is the major thing I had in mind

A problem with benchmarking this is that it is strongly context dependent — how often gc occurs depends not just on the function you are benchmarking, but whether it is called in a context that also does allocation. (gc times aren't additive!) This is especially problematic here because @btime and @benchmark are usually employed for micro-benchmarking small pieces of a larger computation.
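
One way to see that context dependence (a sketch; f and g are hypothetical stand-ins, and the absolute numbers will vary with machine and heap state):

using BenchmarkTools

f() = sum(abs2, rand(1000))             # small allocating kernel
g() = (x = rand(10^6); f() + sum(x))    # the same kernel inside an allocation-heavy context

@benchmark f()   # GC pauses come only from f's own garbage
@benchmark g()   # f now shares GC pauses provoked by the surrounding allocations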

That's not to say that benchmarking gc is unimportant! But, in combination with the difficulty of eliminating external noise, my sense is that the mean will mislead more than it informs in the @btime output. If you need more detailed statistics, @benchmark is what you want.

@mcabbott (Author) commented Oct 20, 2021

more detailed statistics, @benchmark is what you want

But which bit -- what would you take from that as a better (if still imperfect) measure of the cost of allocations? It shows min/median/mean/std/max. And the histogram stops at the 99th percentile, which IIRC typically omits GC.

Expanding the above example:
[Screenshot 2021-10-20: detailed statistics for the dot(x, y) vs sum(x .* y) example]
And if I include these functions as part of some larger long-running loop, then it seems the mean doesn't do badly as a prediction of how long the overall calculation will take, unlike the minimum:

julia> for N in [10,100,10^4,10^5]
         @show N
         x=randn(N); y=randn(N);
         @btime dot($x, $y)
         @btime sum($x .* $y)
       end;  # shown above

julia> pairs1(A,B) = [dot(x,y) for x in eachcol(A), y in eachcol(B)];

julia> pairs2(A,B) = [sum(x .* y) for x in eachcol(A), y in eachcol(B)];

julia> @btime pairs1(A,B) setup=(N=10^4; A=rand(N,1000); B=rand(N,1000));
  min 4.842 s, mean 4.842 s (2 allocations, 7.63 MiB)

julia> @btime pairs2(A,B) setup=(N=10^4; A=rand(N,1000); B=rand(N,1000));
  min 11.087 s, mean 11.087 s (2000002 allocations, 74.56 GiB. GC mean 23.56%)

@KristofferC (Contributor) commented:

And if I include these functions as part of some larger long-running loop, then it seems the mean doesn't do badly as a prediction of how long the overall calculation will take, unlike the minimum:

It is because you are not running a representative benchmark. You are doing the equivalent of

 @btime sin.(rand(10^4:10^5))

and then saying that the minimum is a bad statistic for giving the mean value when running that code multiple times.

Mean is bad. Every bit of noise on your computer, every latency from the OS scheduler, and so on contributes to the mean. Don't take it from me, take it from this dude: https://youtu.be/vrfYLlR8X8k?t=1487 (the whole talk is worth watching).

Again, make a @bmean or something, leave @btime alone.

@chriselrod commented:

I want this.

I'm perfectly fine with naming it @bmean or something else to preserve @btime.

@KristofferC (Contributor) commented:

Yes, call it something else.

@chriselrod commented:

Yes, call it something else.

Are you happy to rename it? Seems easy enough to make everyone happy this way.

@gdalle (Collaborator) commented Sep 18, 2023

@mcabbott doing some cleanup before a 2.0 release, would you wanna rename this @bmean and try to get it merged?
