Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add inline function probing #3095

Merged
merged 1 commit into from May 14, 2024
Merged

Add inline function probing #3095

merged 1 commit into from May 14, 2024

Conversation

ttreyer
Copy link
Contributor

@ttreyer ttreyer commented Mar 28, 2024

Inserting a probe before the stack frame has been established can lead to issues when collecting the stack.
There might be missing frames. See the runtime test call.ustack_elf_symtable where test2 is missing.

When inserting the probe after the prologue of the function, the stack walker has a better context and won't miss an entry.

Checklist
  • Language changes are updated in man/adoc/bpftrace.adoc
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

@ttreyer ttreyer force-pushed the skip-prologue branch 2 times, most recently from fcef4fa to 974b393 Compare March 30, 2024 14:36
@ttreyer
Copy link
Contributor Author

ttreyer commented Mar 30, 2024

I had to change the ARG0_STACK from 8 to 16 for x86_64, since the probe stops after the push rbp instruction.
See System V ABI, page 21.

I still need to check the offset for other archs:

  • x86_64
  • arm
  • loongarch64
  • mips64
  • ppc64 (our numbers are correct according to the System V ABI, but I get different values in Godbolt)
  • riscv64 (RISCV ABI)
  • s390

@danobi
Copy link
Member

danobi commented Apr 1, 2024

Sweet! This is an excellent fix. Haven't had a chance to look at the code yet; hope to look at the base PR today/tomorrow

@ttreyer ttreyer changed the title Skip function prologue for uprobes Add inline function probing Apr 2, 2024
tests/runtime/dwarf Outdated Show resolved Hide resolved
src/dwarf_parser.h Outdated Show resolved Hide resolved
src/bpftrace.cpp Outdated Show resolved Hide resolved
@ttreyer ttreyer force-pushed the skip-prologue branch 6 times, most recently from 9ece794 to c2ee339 Compare April 4, 2024 17:08
@ttreyer
Copy link
Contributor Author

ttreyer commented Apr 5, 2024

Clang and GCC don't emit code with the same ABI: Godbolt. Looking at the dwarfdump, Clang's stack args are at the stack pointer + 16, but GCC stack args are at the stack pointer + 0. So I can't make the stack args test consistent between the two compilers. Is there an option or a function attribute that might force GCC to set the offset to 16?

We also discussed about sargX with @ajor and concluded it typically won't return the correct argument for other languages like Go. We also noticed that kprobe/kfunc have a maximum of 6 parameters, which fits in the AMD64 registers and the arguments never need to go on the stack. So we were wondering if sarg should remain in its current form.

In a later PR, we could ensure correct access to stack arguments by using the location expression emitted for the debugger.

@viktormalik
Copy link
Contributor

We also discussed about sargX with @ajor and concluded it typically won't return the correct argument for other languages like Go.

Yes, Go uses their own calling conventions and ignores System V ABI (on x86). I'm not sure what we can do about it other than detecting that a uprobe is inside a Go program and issuing a warning or disabling thesargX builtin completely for such a case.

We also noticed that kprobe/kfunc have a maximum of 6 parameters, which fits in the AMD64 registers and the arguments never need to go on the stack. So we were wondering if sarg should remain in its current form.

FYI, kernel fprobes (i.e. our kfunc) now support up to 12 arguments so sargX could make sense for those (though they can by accessed by name).

@ttreyer ttreyer force-pushed the skip-prologue branch 6 times, most recently from 5288112 to b8e90ce Compare April 15, 2024 14:34
@ttreyer ttreyer marked this pull request as draft April 15, 2024 15:00
@viktormalik
Copy link
Contributor

But, if the user doesn't have DWARF symbols for his binary, knows the ABI, and wants to see values of stack-passed args, why not let him do it, especially since we already have the feature implementated?

Because I think it's going to get broken by this PR and we have no way of testing it across architectures 😅

Ah, I somehow missed that. In that case, I'm all in for dropping the feature :-) Sorry for the confusion and thanks for clearing it up!

And even if it did work correctly, I don't think it's a particularly useful feature because of how much knowledge of the application's ABI it requires in order to use (it was originally intended for Golang were sarg0 corresponded to the first argument). Extending arg to read stack arguments after the register arguments might be more useful than sarg.

That sounds like a reasonable approach. Maybe we could open an issue to track that.

@ttreyer ttreyer force-pushed the skip-prologue branch 4 times, most recently from a80023e to 259688b Compare April 29, 2024 18:08
@ttreyer
Copy link
Contributor Author

ttreyer commented Apr 29, 2024

Ah, I somehow missed that. In that case, I'm all in for dropping the feature :-) Sorry for the confusion and thanks for clearing it up!

Following that, I marked sarg as deprecated and removed its tests, since this PR breaks them by skipping the function prologue.

Copy link
Member

@danobi danobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it'd be good to leave at least one or two of these in the new form to document/check that stack accesses don't regress?

CHANGELOG.md Outdated Show resolved Hide resolved
tests/runtime/dwarf Show resolved Hide resolved
Copy link
Member

@ajor ajor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thought of one more changelog entry we should include: "Deprecated sarg".

Other than that, looks great and I think it's ready to merge!

CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Contributor

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still prefer if we could avoid waiting the entire timeout for the few runtime tests but it's not a hard blocker for me.

LGTM otherwise, nice work!

tests/runtime/dwarf Outdated Show resolved Hide resolved
@ttreyer ttreyer force-pushed the skip-prologue branch 2 times, most recently from c202659 to a3caf64 Compare May 10, 2024 16:23
@viktormalik
Copy link
Contributor

@danobi @jordalgo do you want to have one more look? Otherwise we can merge.

PROG config = { probe_inline = 0 }
uprobe:./testprogs/inline_function:square { @count++ }
uretprobe:./testprogs/inline_function:main { exit() }
EXPECT @count: 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess bpftrace initializes this map value to 1 even though it never gets called?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it should be initialized to 0. The test checks that only one (non-inlined) instance of square is hit (see the inline_function.c source).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it's because the last call to square is not inlined. Is that worth a comment in this file as well to clear up possible confusion?

Use the DebugInfo (DWARF) to locate every inlined instances of a
uprobe function and attach to all of them.
This behaviour is controled by the `probe_inline` config key.
Note that using this config disables access to the arguments, because we
cannot guarantee inlined function instances follow the System V ABI.

Also attach to uprobe functions after their prologue, once the stack
frame has been set, which makes collecting stack traces more reliable.
This behaviour is the new default for all binaries with DWARF.
Copy link
Contributor

@jordalgo jordalgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ajor ajor linked an issue May 13, 2024 that may be closed by this pull request
@ajor
Copy link
Member

ajor commented May 14, 2024

Let's go! 🚀

@ajor ajor merged commit 742f0cf into bpftrace:master May 14, 2024
18 checks passed
ajor added a commit to ajor/bpftrace2 that referenced this pull request May 15, 2024
Explicitly put numbers into an array on the stack instead of relying on
architecture-specific calling conventions for stack-passed parameters.

This removes the dependency on the stack pointer's value (which changed
as of bpftrace#3095) and makes the tests architecture independent.
ajor added a commit to ajor/bpftrace2 that referenced this pull request May 16, 2024
Explicitly put numbers into an array on the stack instead of relying on
architecture-specific calling conventions for stack-passed parameters.

This removes the dependency on the stack pointer's value (which changed
as of bpftrace#3095) and makes the tests architecture independent.
ajor added a commit that referenced this pull request May 16, 2024
Explicitly put numbers into an array on the stack instead of relying on
architecture-specific calling conventions for stack-passed parameters.

This removes the dependency on the stack pointer's value (which changed
as of #3095) and makes the tests architecture independent.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ustack missing stack frame second from top
5 participants