Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken stacks traces on Go code #100

Open
cgaebel opened this issue Mar 30, 2022 · 3 comments
Open

Fix broken stacks traces on Go code #100

cgaebel opened this issue Mar 30, 2022 · 3 comments
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@cgaebel
Copy link
Contributor

cgaebel commented Mar 30, 2022

In this simple Go example, it's clear that Go's stack switch causes stacktraces to wander off the right hand side of the screen. I think this is easy to fix: when trace_writer.ml, sees the symbol runtime.newstack, it should mark all currently-open stack frames as closed.

I'm not sure if there is any more custom control flow in Go code. e.g. do Go stacks shrink? Please file more bugs if you notice any.

@cgaebel cgaebel added bug Something isn't working good first issue Good for newcomers labels Mar 30, 2022
@prattmic
Copy link

Hi, I just discovered this project and it looks like an amazingly useful tool, and I'd love to get it working better with Go. I haven't tried running myself yet, but some initial notes below:

I think this is easy to fix: when trace_writer.ml, sees the symbol runtime.newstack, it should mark all currently-open stack frames as closed.

runtime.newstack is calling into the scheduler [1] (I think this is the truncated runt... bit under gopreempt_m on #98), which selects a new goroutine to run. The precise moment we switch to the new goroutine stack and start executing is in gogo, so I think this is the best point to mark stack frames as closed. Note the symbol is gogo, not runtime.gogo (runtime.gogo jumps to gogo).

I'm not sure if there is any more custom control flow in Go code. e.g. do Go stacks shrink?

  • Many places within the runtime switch from the "goroutine stack" to the "system stack", which is a fixed sized thread stack where do things that aren't safe on the goroutine stack. We switch back to the goroutine stack before returning to user code. morestack actually switches to the system stack before calling newstack, so it seems this is already working fine.
  • Signal handlers run on yet another "signal stack". This is set up with sigaltstack to apply automatically on signal delivery, so my guess is that it is already handled correctly.
  • Go stacks can shrink. This can either occur in newstack, or concurrently during the GC. In the GC case, we temporarily freeze the goroutine using our standard preemption mechanisms, copy the stack to a small stack, and then allow it to resume with the new SP. It is unclear to me if moving stacks (grow/shrink) will be problematic in and of themselves. Do you track actual SP values to determine when stack frames return?

[1] A bit more background: the newstack call in your example isn't actually growing the stack. Most functions contain a prologue that checks SP against the stack bounds to see if we need to grow the stack by calling newstack. For cooperative preemption, we poison the stack bounds so the prologue thinks it needs to grow the stack, but newstack does the more complex work do decide that no, we really just need to preempt.

@cgaebel
Copy link
Contributor Author

cgaebel commented Apr 22, 2022

Thanks for taking a look at this. It's nice to have someone who actually knows how Go works (as opposed to us) working on this.

To answer your question: magic-trace doesn't track or know anything about stack pointers. The stack frames are entirely inferred by the history of application's control flow (e.g. call, ret, jmp). From your description, it sounds like stack shrinking is uninteresting to magic-trace.

cgaebel added a commit to cgaebel/magic-trace that referenced this issue Apr 26, 2022
See the comment in trace_writer.ml for more details about what this
entails. The gist of what I've done here is review every call to
`gogo` within demo.go and make sure the stack returns to the right
spot.

Fixes janestreet#100
cgaebel added a commit to cgaebel/magic-trace that referenced this issue Apr 26, 2022
See the comment in trace_writer.ml for more details about what this
entails. The gist of what I've done here is review every call to
`gogo` within demo.go and make sure the stack returns to the right
spot.

Fixes janestreet#100
@cgaebel
Copy link
Contributor Author

cgaebel commented Apr 26, 2022

I've put up a draft of a PR that works on a very simple demo program. Thanks for the gogo hint, it was a huge help.

Could you please take a look at that PR and try that version of magic-trace on some interesting go programs you may have lying around?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants