Improve register allocation for calls with safepoint instruction #101596

Open
kotlarmilos opened this issue Apr 26, 2024 · 1 comment
Labels
area-Codegen-JIT-mono enhancement Product code improvement that does NOT require public API changes/additions

kotlarmilos commented Apr 26, 2024

Description

Mono uses a linear scan register allocation algorithm to assign registers to arguments and IL locals. Normally this algorithm is a good tradeoff between speed and efficiency. When we were using preemptive GC, it did a good job of placing values into registers so that, for example, if there was an unconditional call and a local needed to live across that call, the local would go into a callee-saved register.
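
To make the rule above concrete, here is a minimal sketch of the decision in C. It is not Mono's allocator: `LiveInterval`, `interval_crosses_call` and `pick_reg_class` are hypothetical names, and live ranges are assumed to be simple instruction-index ranges.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; these are not Mono's data structures. */
typedef enum { REG_CALLER_SAVED, REG_CALLEE_SAVED } RegClass;

typedef struct {
    int start; /* first instruction index where the value is live */
    int end;   /* last instruction index where the value is live */
} LiveInterval;

/* Returns 1 if any call position lies strictly inside the interval. */
static int interval_crosses_call(const LiveInterval *iv,
                                 const int *call_pos, size_t n_calls)
{
    assert(iv != NULL);
    for (size_t i = 0; i < n_calls; i++) {
        if (iv->start < call_pos[i] && call_pos[i] < iv->end)
            return 1;
    }
    return 0;
}

/* A value that is live across a call is steered to a callee-saved register,
 * because the call would otherwise clobber it. */
static RegClass pick_reg_class(const LiveInterval *iv,
                               const int *call_pos, size_t n_calls)
{
    return interval_crosses_call(iv, call_pos, n_calls)
        ? REG_CALLEE_SAVED
        : REG_CALLER_SAVED;
}
```

With this rule, a single call-like instruction right after the prolog is enough to push every argument that is used later into the callee-saved class.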

The problem is that with cooperative and hybrid GC, every (recursive) function now has what looks like an unconditional call right after the prolog.

As a result a simple recursive factorial function ends up looking like this:

# prolog
mov x26, x0
# safepoint code
<gc_safe_point> # IR opcode for a safepoint; clobbers all caller-saved registers
# other code
...
# recursive function call
sub w0, w26, #0x1
bl gram_Fac__int_    # recursive call
... 
# rest of the function

The safepoint instruction early on entry causes us to shuffle all the arguments into callee-saved registers.

Normally we treat gc_safe_point as an opaque call-like IR instruction. In the LLVM backend this is what we want: LLVM has its own safepoint lowering pass that is aware that this call is extremely unlikely and allocates registers accordingly.

In the non-LLVM backends, however, this opcode persists all the way into the arch-specific backends where it gets replaced by, essentially:

<if global_gc_flag is unset, jump to continue_label:>
call runtime_gc_safepoint_icall
continue_label: nop

As a result, linear scan sees an unconditional call, but in reality it's very unlikely that we actually do a call here.
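
For readers who prefer C to pseudocode, the expansion amounts to roughly the following poll. This is a sketch only: `global_gc_flag` and `runtime_gc_safepoint_icall` mirror the names in the pseudocode above but are stand-ins defined here, `slow_path_calls` and `gc_safe_point_poll` are made up for the example, and `__builtin_expect` assumes GCC or Clang.

```c
#include <assert.h>

/* Stand-ins for the runtime state and icall named in the pseudocode above;
 * the definitions here are hypothetical. */
static volatile int global_gc_flag = 0;
static int slow_path_calls = 0;

static void runtime_gc_safepoint_icall(void)
{
    slow_path_calls++; /* the real icall would cooperate with the GC */
}

/* The poll: a cheap flag test, with the call on the unlikely path. */
static inline void gc_safe_point_poll(void)
{
    if (__builtin_expect(global_gc_flag != 0, 0))
        runtime_gc_safepoint_icall();
}

/* Recursive factorial with a safepoint poll right after entry. */
static long fac(int n)
{
    assert(n >= 0);
    gc_safe_point_poll();
    return n <= 1 ? 1 : n * fac(n - 1);
}
```

While the flag is unset, `fac` never reaches the slow path, which is the common case the register allocator would ideally optimize for.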

The idea is to add a lowering pass that, when targeting a non-LLVM backend, replaces gc_safe_point with a conditional branch (marked unlikely) and a call earlier in the pipeline, before register allocation. This would allow linear scan to weight the call accordingly and hopefully keep function arguments in caller-saved registers.
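
One way to picture "weighting the call" is a toy cost model like the one below. Everything in it is an assumption made for illustration: the names (`WeightedCall`, `ValueRange`, `prefer_caller_saved`), the cost of 2 units for a save/restore pair, and the frequency estimates. It does not describe how Mono's linear scan actually scores intervals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; frequencies are estimated executions per function entry. */
typedef struct {
    int pos;     /* instruction index of the call */
    double freq; /* 1.0 for an unconditional call, small for an unlikely one */
} WeightedCall;

typedef struct {
    int start; /* first instruction index where the value is live */
    int end;   /* last instruction index where the value is live */
} ValueRange;

/* Expected save/restore work if the value stays in a caller-saved register:
 * one save and one restore around each call it is live across, scaled by
 * how often that call is expected to run. */
static double caller_saved_cost(const ValueRange *r,
                                const WeightedCall *calls, size_t n)
{
    double cost = 0.0;
    assert(r != NULL);
    for (size_t i = 0; i < n; i++) {
        if (r->start < calls[i].pos && calls[i].pos < r->end)
            cost += 2.0 * calls[i].freq;
    }
    return cost;
}

/* A callee-saved register costs one save in the prolog and one restore in
 * the epilog on every entry, regardless of which path is taken. */
static double callee_saved_cost(void)
{
    return 2.0;
}

/* Returns 1 when the caller-saved choice is strictly cheaper. */
static int prefer_caller_saved(const ValueRange *r,
                               const WeightedCall *calls, size_t n)
{
    return caller_saved_cost(r, calls, n) < callee_saved_cost();
}
```

Under this model an opaque safepoint with frequency 1.0 ties with the callee-saved cost and loses, while the same call behind an unlikely branch (say 0.01) makes the caller-saved register clearly cheaper.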

We might not want to do this for every GC safepoint. For example, the ones on back branches might be OK to keep as a single opcode. (So we might, for example, add a new decomposable_gc_safepoint opcode and only replace that one with the jump, then only place it in the prolog, not on back branches.)
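
A rough sketch of what such a selective lowering pass could look like, over a toy instruction array rather than Mono's IR. The opcode names other than the two safepoint opcodes, the `ToyOpcode` enum and `lower_decomposable_safepoints` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy opcode set; only the two safepoint names come from the discussion. */
typedef enum {
    OP_OTHER,
    OP_GC_SAFE_POINT,             /* stays opaque, e.g. on back branches */
    OP_DECOMPOSABLE_GC_SAFEPOINT, /* placed in the prolog, expanded early */
    OP_BR_GC_FLAG_UNSET,          /* branch to the label; call is unlikely */
    OP_SAFEPOINT_ICALL,
    OP_CONTINUE_LABEL
} ToyOpcode;

/* Copies `in` to `out`, expanding each decomposable safepoint into
 * branch + call + label and leaving every other opcode untouched.
 * Returns the number of opcodes written, or 0 if `out` is too small
 * (which is indistinguishable from empty input in this sketch). */
static size_t lower_decomposable_safepoints(const ToyOpcode *in, size_t n_in,
                                            ToyOpcode *out, size_t out_cap)
{
    size_t n_out = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n_in; i++) {
        if (in[i] == OP_DECOMPOSABLE_GC_SAFEPOINT) {
            if (out_cap - n_out < 3)
                return 0;
            out[n_out++] = OP_BR_GC_FLAG_UNSET;
            out[n_out++] = OP_SAFEPOINT_ICALL;
            out[n_out++] = OP_CONTINUE_LABEL;
        } else {
            if (n_out == out_cap)
                return 0;
            out[n_out++] = in[i];
        }
    }
    return n_out;
}
```

Running this before register allocation would leave the prolog safepoint visible as a branch plus a rarely executed call, while safepoints on back branches remain a single opcode for the backend to expand as it does today.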

Tagging subscribers to this area: @lambdageek, @steveisok
See info in area-owners.md if you want to be subscribed.

@lambdageek lambdageek added the enhancement Product code improvement that does NOT require public API changes/additions label Apr 26, 2024