Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aot: Enable aot built bpftrace programs to run on their own #3104

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

arthurshau
Copy link
Contributor

@arthurshau arthurshau commented Apr 6, 2024

The first commit fixes a couple issues with aot built binaries that was preventing runtime from functioning.

We now look in the right place in the ELF file for the custom .btaot section that contains the metadata + generated ELF, that we then restore to create BpfBytecode. Second, we now also serialize the map of BpfMaps owned by BpfBytecode into the bytecode as well, which is necessary for runtime. In order to do this, I needed to add a default constructor to BpfMap, as well as a hacky solution to be able to serialize the name of the map which is a cstring_view (which may or may not have defeated the purpose of having the cstring_view in the first place - would appreciate a review of this). Finally, I fixed maps not being printed out correctly when bpftrace exits.

The second commit allows running aot binaries directly without calling having to call runtime shim (e.g. you can now just call ./a.btaot instead of bpftrace-aotrt a.btaot. This is possible because the aot binaries already contain all the info necessary to run themselves.

Address part 1 of #2918.

Example:

$ ./bpftrace -e 'profile:hz:599 { @[tid] = count(); exit(); }' --aot a.btaot
$ ./a.btaot
Attaching 1 probe...


@[679916]: 1
@[12686]: 1
@[680061]: 1
@[0]: 4
Checklist
  • Language changes are updated in man/adoc/bpftrace.adoc
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

src/bpfmap.cpp Fixed Show fixed Hide fixed
@arthurshau arthurshau changed the title aot: Enable aot built binaries to run on their own aot: Enable aot built bpftrace programs to run on their own Apr 6, 2024
Copy link
Member

@danobi danobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks reasonable. i'm assuming this involved some nice debugging work :)

src/aot/aot.cpp Outdated Show resolved Hide resolved
src/aot/aot.cpp Outdated Show resolved Hide resolved
src/bpfmap.h Outdated Show resolved Hide resolved
src/bpfmap.h Outdated Show resolved Hide resolved
src/aot/aot_main.cpp Outdated Show resolved Hide resolved
src/aot/aot_main.cpp Outdated Show resolved Hide resolved
src/container/cstring_view.h Outdated Show resolved Hide resolved
src/aot/aot.cpp Fixed Show fixed Hide fixed
@arthurshau
Copy link
Contributor Author

Addressed comments, thanks!

@arthurshau arthurshau marked this pull request as ready for review April 10, 2024 20:02
src/aot/aot.cpp Outdated Show resolved Hide resolved
src/aot/aot.cpp Outdated
Elf_Data *data = NULL;
uint8_t *btaot_section = NULL;
const Header *hdr;
int i = 0;
Copy link
Contributor

@jordalgo jordalgo Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can we move this closer to the while loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I put it here (along with the others) because of the goto statements bypassing initialization errors. But I can split the initialization onto a second line to make it work.

src/aot/aot.cpp Outdated Show resolved Hide resolved
src/aot/aot_main.cpp Outdated Show resolved Hide resolved
@arthurshau
Copy link
Contributor Author

Address comments

src/aot/aot_main.cpp Outdated Show resolved Hide resolved
src/aot/aot.h Outdated Show resolved Hide resolved
src/aot/aot_main.cpp Outdated Show resolved Hide resolved
src/bpfmap.h Outdated Show resolved Hide resolved
Copy link
Member

@ajor ajor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that we can't store a cstring_view in the serialised data now - just got a couple of other places that we can still reduce string copying in

src/bpfmap.h Outdated Show resolved Hide resolved
src/bpfmap.h Outdated Show resolved Hide resolved
Copy link
Member

@ajor ajor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good to me, just a couple of minor points.

What's the testing situation looking like for AOT? It'd be good to have it exercised by the runtime test suite to ensure it doesn't break again

src/run_bpftrace.h Outdated Show resolved Hide resolved
src/run_bpftrace.cpp Outdated Show resolved Hide resolved
@arthurshau
Copy link
Contributor Author

Address comments and rebase.

What's the testing situation looking like for AOT? It'd be good to have it exercised by the runtime test suite to ensure it doesn't break again

The immediate next order of business to enable this PoC will be to get runtime-tests.sh --run-aot-tests to completely pass. As noted in #2918, this may require some work in portability_analyser.cpp to determine which features are/are not supported for AOT at the moment.

Copy link
Member

@ajor ajor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@arthurshau
Copy link
Contributor Author

Maybe also for a follow-up diff but the generated binary still seems rather large

How do we cut down on the file size?

Yeah, one of the items on the list in #2918 is to re-remove the large list of runtime deps from the shim, which should hopefully help a lot.

Are we going to ship bpftrace-aotrt with the normal bpftrace binary for all distros?

I believe bpftrace-aotrt is already getting shipped? At least, I found it in my /usr/bin/ alongside my normal bpftrace install. Currently, the code searches for bpftrace-aotrt in $PATH to copy the shim, so it will need to exist somewhere.

General question: With this PR, is this feature ready enough to add to the --help output?

At what point do we add documentation?

Probably not ready to add to --help yet, ideally we should get runtime-tests.sh --run-aot-tests to pass first I think.

src/aot/aot.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, sorry for the delayed review! I really like this work and hope that we'll be able to get it in soon. With that, I apologize if the following will cause any delays but I think that it's an important thing to mention.

I think that there's a major flaw in the way we serialize bytecode at the moment. The current workflow is:

  1. codegen -> ELF -> BpfBytecode -> serialization to .btaot
  2. deserialization from .btaot -> BpfBytecode -> load BPF programs

With the upcoming move to libbpf, this will not let us leverage the full power of libbpf which is able to deduce a lot of information about the BPF maps and programs from the ELF. I think that the correct workflow should be:

  1. codegen -> ELF -> serialization to .btaot
  2. deserialization from .btaot -> ELF -> BpfBytecode -> load BPF programs

This is also the reason why you had to serialize BpfMap which I specifically tried to avoid when I was implementing it. Ideally, BpfMap instances should be created by running libbpf parsing on the ELF produced by bpftrace.

In addition, I plan to drop BpfBytecode.sections_ completely and use libbpf's struct bpf_object instead (which is created from the ELF). When that happens, the current AOT approach would stop working and we'd have to move to the workflow suggested above anyways.

So, I suggest we move to serializing the produced ELF instead of BpfBytecode before merging this PR.

@arthurshau
Copy link
Contributor Author

Hi Viktor, thanks for the comment. You make good points, and I forgot to consider your work moving bpftrace to use libbpf when opening this PR. Given what you said about removing BpfBytecode.sections_, I agree that it makes sense to switch the AOT workflow now so we can leverage libbpf. I'll look into it.

@viktormalik
Copy link
Contributor

Hi Viktor, thanks for the comment. You make good points, and I forgot to consider your work moving bpftrace to use libbpf when opening this PR. Given what you said about removing BpfBytecode.sections_, I agree that it makes sense to switch the AOT workflow now so we can leverage libbpf. I'll look into it.

Thanks @arthurshau! I think that it could be as simple as serializing the ELF produced by codegen (as raw binary data) and moving creation of BpfBytecode to run_bpftrace.

@arthurshau
Copy link
Contributor Author

arthurshau commented May 21, 2024

@viktormalik Sorry for the delay, I've implemented the requested changes. Can you double check that the serialization portion in particular is done correctly? I basically just call emit and then append the vector to the end after the AOT header and RequiredResources.

Also, when running the generated binary, I'm now getting these libbpf warnings for some reason:

libbpf: elf: skipping unrecognized data section(15) .eh_frame
libbpf: elf: skipping relo section(16) .rel.eh_frame for section(15) .eh_frame

This happens when calling elf::parseBpfBytecodeFromElfObject. Might you know why?

@ajor
Copy link
Member

ajor commented May 21, 2024

Just had a quick skim, but we should try and keep the ELF file in memory until the actual serialisation step.

CodegenLLVM::emit_elf is just a tool for debugging bpftrace. Take a look at CodegenLLVM::emit to see how we can operate on the ELF file in memory instead of writing it to disk.

src/main.cpp Fixed Show fixed Hide fixed
src/main.cpp Fixed Show fixed Hide fixed
Copy link
Contributor

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of small remarks, otherwise looks great. Thanks for this work!

src/run_bpftrace.h Outdated Show resolved Hide resolved
src/aot/aot.cpp Show resolved Hide resolved
src/aot/aot.cpp Outdated Show resolved Hide resolved
src/aot/aot.cpp Outdated Show resolved Hide resolved
@arthurshau
Copy link
Contributor Author

Addressed comments, thanks! Also moved the libbpf_print and libbpf_print_level_string functions from main.cpp to the shared run_bpftrace.cpp because aot_main.cpp needs them as well - it wasn't properly setting the log callback function for libbpf warnings.

Fixes a couple issues with aot built binaries that was preventing
runtime from functioning.

Namely, we were looking in the wrong place in the ELF file for the
custom .btaot section that we added. We were also not serializing the
map of BpfMaps contained in BpfBytecode, which is necessary for runtime.
Finally, I fixed maps not being printed out correctly when bpftrace exits.
Currently, even though we make a copy of the runtime shim and then
embed the necessary metadata + bytecode into the binary, you still need
to pass that binary as an argument to the runtime shim in order to run
it (i.e. you need to do 'bpftrace-aotrt <file_name>'). Since the binary
can fully function on its own, we don't need to do this anymore.

This commit makes the executable just look inside itself for the
metadata and bytecode, skipping the need to call bpftrace-aotrt. (i.e.
you can now run the binary with just ./<file_name>.)
Copy link
Contributor

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I'm just not approving the PR, yet, as I'd like to take some time trying this out.

@viktormalik
Copy link
Contributor

Also moved the libbpf_print and libbpf_print_level_string functions from main.cpp to the shared run_bpftrace.cpp because aot_main.cpp needs them as well - it wasn't properly setting the log callback function for libbpf warnings.

Yeah, makes sense.

@viktormalik Sorry for the delay, I've implemented the requested changes. Can you double check that the serialization portion in particular is done correctly? I basically just call emit and then append the vector to the end after the AOT header and RequiredResources.

Yes, that should be correct.

Also, when running the generated binary, I'm now getting these libbpf warnings for some reason:

libbpf: elf: skipping unrecognized data section(15) .eh_frame
libbpf: elf: skipping relo section(16) .rel.eh_frame for section(15) .eh_frame

This happens when calling elf::parseBpfBytecodeFromElfObject. Might you know why?

I'm not sure why this is happening but I'm seeing this when running in non-AOT mode, too. It's certainly something that we should address but it shouldn't be caused by this PR.

@danobi
Copy link
Member

danobi commented May 23, 2024

🚀

Copy link
Contributor

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basic probes work nice, some others don't but since this is still undocumented, I'm ok with merging this as-is and fixing things in the follow-up work.

My remarks:

  1. Using --aot searches $PATH for bpftrace-aotrt which is not very convenient when doing development. Maybe we could make it also (recursively) search $PWD (or the directory of the bpftrace binary) so that developers don't have to update their $PATH manually?
  2. BTF-based probes (e.g. kfunc) currently do not work b/c the AOT main doesn't run bpftrace.parse_btf(). Here, it'll be tricky to support kernel modules as in normal mode, we parse the probes to determine which kernel modules BTF needs to be parsed. Since we don't have that information in pre-compiled binary, we'll have to figure out a different way to get the list of modules that the program may attach to.
  3. Since this will likely get merged shortly before we do the new release, I just want to make sure that we agree that we're leaving this undocumented for now.

@arthurshau
Copy link
Contributor Author

  1. Since this will likely get merged shortly before we do the new release, I just want to make sure that we agree that we're leaving this undocumented for now.

Sounds good, I agree with that.

@danobi
Copy link
Member

danobi commented May 24, 2024

Using --aot searches $PATH for bpftrace-aotrt which is not very convenient when doing development. Maybe we could make it also (recursively) search $PWD (or the directory of the bpftrace binary) so that developers don't have to update their $PATH manually?

Yeah, need to find a better way.

Since this will likely get merged shortly before we do the new release, I just want to make sure that we agree that we're leaving this undocumented for now.

Agreed. We don't wanna document / announce anything until at least #2918 is completed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants