Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Features are incorrectly detected, resulting in load errors on some older kernels #753

Open
Arignir opened this issue Aug 14, 2023 · 2 comments

Comments

@Arignir
Copy link

Arignir commented Aug 14, 2023

Hello,

I work with CO-RE C eBPFs using Aya on older systems and I came across an interesting bug.

While minimizing vmlinux.h to the very strict minimum necessary for my eBPFs, my eBPFs failed to load with an Argument too long (E2BIG) error on Debian 10 (Linux 4.19.0-24-amd64), and on this distribution/kernel only (everything was ok on Ubuntu 18 or anything more modern)

If you look at the source code of the bpf syscall, it checks its arguments early with bpf_check_uarg_tail_zero(). If the union bpf_attr given to the BPF syscall is bigger than the one known to the kernel and the extra-bytes aren't 0, -E2BIG is returned. This is to prevent userspace from relying on features that aren't yet provided by the kernel.

When looking at Aya's source code, I realized that this is exactly the problem I am facing.

On one side, Aya evaluates which features are available here. Especially, it assumes that BTF are available if the call to bpf(BPF_BTF_LOAD) succeeded. When loading the eBPF, it will fill the prog_btf_fd field of the struct attr if that check returned true.

And here's the issue: on Linux 4.19, bpf(BPF_BTF_LOAD) was already implemented but prog_btf_fd wasn't in union bpf_attr yet. Therefore, when loading the eBPF, aya first checks if BTF are supported, then loads the BTF, and finally sets prog_btf_fd to the returned file descriptor (see here). This is wrong, and where the bug originates.

A quick and easy fix would be to strengthen the check for BTF availability by also checking for bpf(BPF_PROG_LOAD) on top of the existing bpf(BPF_BTF_LOAD), but that doesn't fix the underlying issue.

Instead, aya should be much more strict about which fields of struct btf_attr it relies on. Especially, it shouldn't assume that if a field is available for one operation then another related field should also be available for another operation, nor should it assume that if a feature is detected/available then all related fields are also available unless they have all been tested. Note that as stated earlier, modern fields on older kernels aren't ignored, they make the syscall fail.

NOTE: I am working with a version a bit older than the main branch, but by looking at the source code it seems the bug is still there. If I'm wrong, I'm very sorry :D

@tamird
Copy link
Member

tamird commented Aug 23, 2023

Thanks for the detailed write-up! Would you be willing to contribute a fix? We've recently improved support for testing on multiple kernel versions, so if you write a test you'll be able to ensure it works (at least locally; not all the tests currently pass on older kernels so we only test on 6.1 in CI). cargo xtask integration-test vm --help should get you started.

@Arignir
Copy link
Author

Arignir commented Sep 6, 2023

A friend of mine, @roblabla, is willing to contribute a fix. We are using a workaround so it's low priority for us, not sure when the fix will land.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants