
codegen: use compiler trampoline closing over method-instance #26565

Merged · 1 commit merged into master · Apr 4, 2018

Conversation

vtjnash
Sponsor Member

@vtjnash vtjnash commented Mar 21, 2018

Uses a trampoline to reduce the number of jump checks that happen during dynamic dispatch. It used to be done this way initially, but that got lost for a while after the function-types PR. This brings back the ability to call a method directly without going through the dispatch helper function.

The (over)estimated performance gain is pretty good for certain cases: 17.31 ns => 13.82 ns.

julia> using BenchmarkTools

julia> @noinline f(a) = a

julia> g = []

julia> @noinline function unstable_no_alloc(n::Int)
           for s = 1:n
               f(g)
           end
       end

julia> @btime unstable_no_alloc(100_000)

@ararslan added the labels "performance (Must go faster)" and "compiler:codegen (Generation of LLVM IR and native code)" on Mar 21, 2018
    uint8_t compile_traced; // if set will notify callback if this linfo is compiled
    jl_fptr_t fptr; // jlcall entry point with api specified by jlcall_api
    jl_fptr_t unspecialized_ducttape; // if template can't be compiled due to intrinsics, an un-inferred fptr may get stored here, jlcall_api = JL_API_GENERIC
Sponsor Member

Yay :)

Sponsor Member Author

Well, technically I just made generated functions more broken, but cared less about it. As long as we make sure none are included as part of the base compiler, though, I think it should not have any impact.

Member

So you're saying I have to abandon my plans for the new optimizer to have a dependency on Cxx.jl? Drat!

Member

I was just about to port codegen.cpp to LLVM.jl... /s

Contributor

Don't let that stop you, @maleadt 😃

@vtjnash force-pushed the jn/trampoline branch 3 times, most recently from a3fc225 to 0ad4b05 on March 23, 2018 15:16
@vtjnash
Sponsor Member Author

vtjnash commented Mar 26, 2018

@nanosoldier runbenchmarks(ALL, vs = ":master")

@vtjnash
Sponsor Member Author

vtjnash commented Mar 30, 2018

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@KristofferC
Sponsor Member

KristofferC commented Mar 30, 2018

This should have been a pure merge commit benchmark but the memory regression indicates that #26435 (comment) is active in one of the benchmarks but not the other. I have seen similar things happen multiple times before. I think something is off with nanosoldier... @ararslan

@vtjnash
Sponsor Member Author

vtjnash commented Mar 30, 2018

> pure merge commit benchmark

That's not exactly possible – there's nothing for this commit to merge into; it's already on top of master.

@KristofferC
Sponsor Member

KristofferC commented Mar 30, 2018

Was that a side remark, or did the point of my post not come across (that the baseline benchmark seems to be wrong)?

@ararslan
Member

Nanosoldier lists the specific commits it's using in the report. Looking through Nanosoldier.jl, I don't see any reason to believe that the commits are being misreported, but maybe there's something I'm missing.

@KristofferC
Sponsor Member

KristofferC commented Mar 31, 2018

My thinking is that after #26435 (comment) (which the baseline should for sure include), the benchmark ["broadcast", "typeargs", "(\"tuple\", 10)"] started to allocate. In this PR, there is an Inf regression of memory allocation in that very same benchmark. An Inf memory regression is only possible if the benchmark doesn't allocate in the baseline, but we know that it does. Therefore the baseline cannot be what we think it is. Am I missing something?

@vtjnash
Sponsor Member Author

vtjnash commented Apr 2, 2018

Maybe it's actually an issue with that PR? A more recent Nanosoldier run on a different PR reports an infinite memory improvement on the same benchmark: #26628

@KristofferC
Sponsor Member

I don't see how two PRs against master could result in one showing an infinite memory improvement (guaranteeing that the benchmark used to allocate before the PR) and the other an infinite memory regression (guaranteeing that it used to not allocate before the PR).

    Replaces jlcall_api with inspection of the invoke closure
    to see if it is a known entity
@vtjnash
Sponsor Member Author

vtjnash commented Apr 3, 2018

Let's just try that again. I've rebased so: @nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@vtjnash
Sponsor Member Author

vtjnash commented Apr 4, 2018

I don't see any commonality between those two reports, so will merge soon.

@vtjnash vtjnash merged commit bce5bbe into master Apr 4, 2018
@vtjnash vtjnash deleted the jn/trampoline branch April 4, 2018 21:54
maleadt added a commit to JuliaGPU/CUDAnative.jl that referenced this pull request Apr 6, 2018
@StefanKarpinski
Sponsor Member

StefanKarpinski commented Apr 13, 2018

Unfortunately, this lovely feature is implicated as the worst offender in the #26767 compilation slow-down.

9 participants