ability to generate stack maps for spill values to aid debuggers & gcs #162

MichaelRFairhurst · 2017-03-13T00:09:24Z

Based on discussion in Gitter, summarized and with additional research: I found a good reference implementation in the Dart VM.

https://github.com/dart-lang/sdk/blob/a1f784e643b98c6b7fbcd5d409f8faa7e33a5c9a/runtime/vm/stack_frame.cc#L110

You can see here that during GC (and probably debugging) the return address is used to find a StackMap for points inside a function. Each spill slot on the stack (as well as some other stack entries specific to the Dart VM) is represented by a bit in the map; a 1 means that spill slot holds an object.

This allows tagged pointers, untagged values, etc. to be spilled in unboxed form, and the GC (or debugger) can still distinguish primitives from pointers. Without this, for GC at least, only a conservative approach can be used. That requires a special allocator that indexes its allocations, and means objects cannot be moved during GC, because a conservative GC can mistake primitives for pointers.
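To make the bit-per-slot idea concrete, here is a minimal sketch of how a GC could use such a map to pick out the pointer slots of one frame. The `StackMap` struct and `findPointerSlots` function are hypothetical names of mine, not Dart's or asmjit's, and the sketch assumes at most 64 spill slots per frame.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stack map: bit i set means spill slot i holds an object pointer.
struct StackMap {
    std::uint64_t pointerBits;  // one bit per spill slot (at most 64 in this sketch)
    std::size_t slotCount;      // number of spill slots described by the map
};

// Returns the addresses of the slots that the map marks as object pointers.
// `frameSlots` stands in for the spill area of a single stack frame.
inline std::vector<std::uintptr_t*> findPointerSlots(std::uintptr_t* frameSlots,
                                                     const StackMap& map) {
    std::vector<std::uintptr_t*> roots;
    for (std::size_t i = 0; i < map.slotCount; ++i) {
        if ((map.pointerBits >> i) & 1u) {
            roots.push_back(&frameSlots[i]);  // precise root: safe to trace or relocate
        }
    }
    return roots;
}
```

Because the function returns slot addresses rather than slot values, a moving collector could rewrite the slots in place after relocating the objects they point to.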

Then when code is compiled, it must track metadata about the spilled registers and save it, and that data must be indexed against the location of the code in memory (there are many ways to do this: you could imagine a simple binary search, or placing a pointer to the stack map in the stack itself).
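The binary-search option could look roughly like the sketch below. `StackMapEntry` and `findStackMap` are illustrative names, and the sketch assumes the table is kept sorted by return address and that maps are only recorded at call sites.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical table entry: one stack map per recorded return address.
struct StackMapEntry {
    std::uintptr_t returnAddress;  // address of the instruction following a call
    std::uint64_t pointerBits;     // bit i set => spill slot i holds a pointer
};

// Looks up the map for an exact return address.
// `table` must be sorted by returnAddress; returns nullptr when nothing matches.
inline const StackMapEntry* findStackMap(const std::vector<StackMapEntry>& table,
                                         std::uintptr_t returnAddress) {
    auto it = std::lower_bound(
        table.begin(), table.end(), returnAddress,
        [](const StackMapEntry& entry, std::uintptr_t addr) {
            return entry.returnAddress < addr;
        });
    if (it == table.end() || it->returnAddress != returnAddress) {
        return nullptr;
    }
    return &*it;
}
```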

https://github.com/dart-lang/sdk/blob/a1f784e643b98c6b7fbcd5d409f8faa7e33a5c9a/runtime/vm/flow_graph_compiler.cc#L742

Dart does this only at safe points, since it is threaded.

For the sake of GC, a simple binary flag (pointer or not) will do, but you could imagine all types of data being trackable in this way (for instance, with only that flag a debugger still wouldn't be able to tell doubles from ints).

From an API standpoint, I could imagine attaching custom data to virtual registers. A function should be available on X86Compiler to track spill slots at that point in the code, maybe with an option for tracking all changes in spill usages. Then upon finalizing a function, a vector of instruction offsets with metadata per spill slot could be returned. When the function is moved to executable memory, those offsets could be adjusted by the returned function address, indexed via the user's own algorithm, and used in things like GC. For people who want to store the stack map in the stack itself, hopefully they could use the secondary pass code to do that after the map is generated.
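A rough sketch of the "adjust offsets by the returned function address" step, on the user side. None of this is an existing asmjit API: `SpillMapRecord` is a made-up shape for whatever the compiler would return on finalization, and `registerFunction` is just one way a user might build their own index.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical record as it might be returned when a function is finalized.
struct SpillMapRecord {
    std::uint32_t codeOffset;   // offset from the start of the function
    std::uint64_t pointerBits;  // per-spill-slot metadata (here: one bit per slot)
};

// The same metadata keyed by absolute address, after the code has been placed.
struct AbsoluteSpillMap {
    std::uintptr_t address;
    std::uint64_t pointerBits;
};

// Rebases the records of one function and merges them into a sorted index.
inline void registerFunction(std::vector<AbsoluteSpillMap>& index,
                             std::uintptr_t functionAddress,
                             const std::vector<SpillMapRecord>& records) {
    for (const SpillMapRecord& r : records) {
        index.push_back({functionAddress + r.codeOffset, r.pointerBits});
    }
    // Keep the index sorted so it can be searched by address later.
    std::sort(index.begin(), index.end(),
              [](const AbsoluteSpillMap& a, const AbsoluteSpillMap& b) {
                  return a.address < b.address;
              });
}
```

Re-sorting on every registration is the simplest thing that works; anyone with many functions would probably want a different indexing scheme, which is why leaving that part to the user seems reasonable.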

Some alternate options that may be simpler:

- An API for how to spill and load values, so that all spilled values can be boxed.
- An API for choosing which virtual registers get spilled where. That may allow users to, say, store tagged values in the lower spill slots and untagged values in the upper ones, and record the counts of each.
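For the second option, the per-frame metadata shrinks to two counts. A minimal sketch, assuming the tagged values really do occupy the lowest slots (the `PartitionedFrameLayout` type and `taggedSlotValues` function are hypothetical):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: tagged values first, then untagged (raw) values.
struct PartitionedFrameLayout {
    std::size_t taggedCount;    // slots [0, taggedCount) hold tagged values
    std::size_t untaggedCount;  // the next untaggedCount slots hold raw values
};

// Copies out only the slots a GC would need to inspect.
inline std::vector<std::uintptr_t> taggedSlotValues(const std::uintptr_t* slots,
                                                    const PartitionedFrameLayout& layout) {
    return std::vector<std::uintptr_t>(slots, slots + layout.taggedCount);
}
```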

Originally I thought this was a sort of far-off need for me, because I thought I only needed it to relocate data during GC, which probably isn't super important for me. But I realized that conservative GC requires indexing allocations (often with a special allocator), which is a very hard thing to make performant. At the moment I'm going to do just about the worst thing possible and store all GCed addresses in an unordered_set so I can use regular malloc. If this feature is really, really far out I can find a better approach, but ideally I can do what I'm doing for the next few months and then switch to stack maps before I'm production ready.
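For reference, the unordered_set stopgap is roughly the following. The `TrackedHeap` class is an illustrative name, and the sketch only recognizes words that equal the exact start address of an allocation, so interior pointers would be missed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unordered_set>

// Sketch of a malloc-backed heap that remembers every live allocation so a
// conservative scan can ask "is this stack word one of my objects?".
class TrackedHeap {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p != nullptr) {
            live_.insert(reinterpret_cast<std::uintptr_t>(p));
        }
        return p;
    }

    // True only if `word` is the start address of a live allocation.
    bool isTrackedObject(std::uintptr_t word) const {
        return live_.count(word) != 0;
    }

    void release(void* p) {
        live_.erase(reinterpret_cast<std::uintptr_t>(p));
        std::free(p);
    }

private:
    std::unordered_set<std::uintptr_t> live_;
};
```

Every allocation and free pays for a hash-set update, and an integer on the stack that happens to equal an allocation address keeps that object alive, which is the cost stack maps would remove.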

kobalicek added the enhancement label Mar 13, 2017
