Does eBPF work inside container-mode? #989

Open
charmoniumQ opened this issue Feb 12, 2024 · 2 comments
Labels
container related to container mode

Comments

@charmoniumQ
Contributor

Use case: I want to benchmark an application on a normal system and on one with an eBPF filter attached to kernel tracepoints. Is this possible in container-mode?

I wrote an eBPF/bpftrace program which works as a normal user through setuid magic outside the container, but it gives the following error if I run it with containerexec:

ERROR: tracepoint not found: syscalls:sys_enter_fork

I think that is actually a permission error. If bpftrace doesn't have the root ruid and euid, /sys/kernel/tracing will not show any tracepoints. Fakeroot doesn't cut it.
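One way to test this hypothesis is to look at tracefs directly from inside and outside the container. The following is only a sketch: it assumes the usual tracefs layout of `events/<category>/<name>` under `/sys/kernel/tracing`, and the helper name `tracepoint_visible` is made up for illustration (it is not part of bpftrace or BenchExec).

```python
import os


def tracepoint_visible(spec, tracefs_root="/sys/kernel/tracing"):
    """Return True if tracepoint 'category:name' has a directory in tracefs.

    A False result only says the current process cannot see the directory;
    it does not distinguish "does not exist" from "not permitted to look".
    """
    category, _, name = spec.partition(":")
    if not category or not name:
        raise ValueError(f"expected 'category:name', got {spec!r}")
    return os.path.isdir(os.path.join(tracefs_root, "events", category, name))


if __name__ == "__main__":
    spec = "syscalls:sys_enter_fork"
    print(f"euid={os.geteuid()} {spec} visible: {tracepoint_visible(spec)}")
```

Running this once as root on the host and once under containerexec should show whether the tracepoint directory is hidden from the containerized process.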

I'm by no means an expert in Linux namespaces, but I think we would want an opt-in flag for benchexec that adds a mapping from root (uid=0) outside the container to root (uid=0) inside the container in /proc/$benchexec/uid_map. I can implement it on my own, but I wanted to hear from someone who understands namespaces better whether I am on the right path.
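For reference, each line of a uid_map file has three fields: the first ID inside the namespace, the first ID outside it, and the length of the range. Below is a minimal sketch for inspecting which outside uid ends up as which inside uid. The helper names `parse_id_map` and `inside_id_for` are invented here for illustration and are not BenchExec API.

```python
def parse_id_map(text):
    """Parse the contents of a uid_map/gid_map file into (inside, outside, length) tuples."""
    mappings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        mappings.append(tuple(int(field) for field in fields))
    return mappings


def inside_id_for(mappings, outside_id):
    """Return the ID inside the namespace that outside_id maps to, or None if unmapped."""
    for inside_start, outside_start, length in mappings:
        if outside_start <= outside_id < outside_start + length:
            return inside_start + (outside_id - outside_start)
    return None


# Example usage on Linux (reads the map of the current process):
#   with open("/proc/self/uid_map") as f:
#       print(inside_id_for(parse_id_map(f.read()), 0))
```

With a mapping such as `0 1000 1` (container root backed by an unprivileged host user), `inside_id_for(..., 0)` returns `None`, i.e. host root is not mapped into the container at all. The flag proposed above would amount to a `0 0 1` entry instead.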

@PhilippWendler added the "container" label (related to container mode) on Feb 12, 2024
@PhilippWendler
Member

I don't know about eBPF. But if it requires full root, i.e., the same as being uid 0 outside the container, then it will not work.

If it requires only root inside the container (or some capability like CAP_SYS_ADMIN), then it may work with containerexec --root. If it is supposed to work inside containers but does not work even with containerexec --root, then we could investigate what it actually needs and what is preventing it from working.
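To see which case applies, one could check the effective capabilities of a process started with containerexec --root. The sketch below parses the `CapEff` line of `/proc/<pid>/status`; the function names are made up for illustration. Keep in mind that a capability held inside a user namespace only applies to resources governed by that namespace, so a positive result here does not by itself mean the tracing setup will work.

```python
CAP_SYS_ADMIN = 21  # bit number as defined in linux/capability.h


def effective_caps(status_text):
    """Extract the effective capability bitmask from /proc/<pid>/status contents."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap(status_text, cap=CAP_SYS_ADMIN):
    """Return True if the given capability bit is set in the effective set."""
    return bool((effective_caps(status_text) >> cap) & 1)


# Example usage on Linux, run inside the container:
#   with open("/proc/self/status") as f:
#       print("CAP_SYS_ADMIN in effective set:", has_cap(f.read()))
```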

If you know that it requires full root, giving root inside the container access to the full root outside the container using uid_map would technically work, but opens up problems.

Using uid_map in this way would require executing BenchExec as root. But it was written with the intention of running as a regular user, and in particular the containerization used by BenchExec assumes that. I do not know whether running BenchExec as root would keep its isolation promises or whether it would open up security holes.

Giving processes inside the container full root access would of course completely eliminate any isolation promises.

So I am hesitant to consider this.

Are there no other solutions for you? For example, set up tracing outside the container and then run BenchExec?

@charmoniumQ
Contributor Author

> So I am hesitant to consider this.

Understood.

> For example, set up tracing outside the container and then run BenchExec?

Yeah, I would just need to know the PID of the grandchild as seen from outside BenchExec's PID namespace (the PID inside BenchExec's namespace is always 2). I think I could change parent_setup_fn to take a kwarg specifying that PID. I would change ContainerExecutor and BaseExecutor so that both pass a pid to parent_setup_fn, for consistency's sake. As in ContainerExecutor's case, BaseExecutor should wait for a byte signalling that parent_setup_fn is complete before launching the tool. The PID would be passed to parent_setup_fn as a kwarg, so existing callbacks may have to change a little, but they would be more future-proof if they accept and ignore extra **kwargs.
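Roughly, the handshake I have in mind looks like the sketch below. This is not BenchExec's actual code: `run_with_parent_setup` and `child_fn` are hypothetical stand-ins for the executor internals and the tool launch, and only the idea (pass `pid=` as a kwarg, release the child with one byte over a pipe) is what I am proposing.

```python
import os


def run_with_parent_setup(child_fn, parent_setup_fn):
    """Fork, call parent_setup_fn(pid=...) in the parent, then release the child.

    Returns (child_pid, exit_code), where child_pid is the PID as seen by
    the parent, i.e. in the parent's PID namespace.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: block until the parent signals that its setup is complete.
        exit_code = 1
        try:
            os.close(write_fd)
            if os.read(read_fd, 1) == b"\0":
                exit_code = int(child_fn() or 0)
        finally:
            os._exit(exit_code)  # never return into the parent's code path
    # Parent: run the setup callback, then send the go-ahead byte.
    os.close(read_fd)
    try:
        parent_setup_fn(pid=pid)
        os.write(write_fd, b"\0")
    finally:
        # If setup failed, closing the pipe makes the child read EOF and exit 1.
        os.close(write_fd)
        _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return pid, exit_code


def attach_tracer(**kwargs):
    """Example callback: uses 'pid' if given and ignores any other kwargs."""
    pid = kwargs.get("pid")
    if pid is not None:
        print(f"would attach the tracer to PID {pid} here")


if __name__ == "__main__":
    print(run_with_parent_setup(lambda: 0, attach_tracer))
```

Because the callback takes `**kwargs`, adding further keyword arguments later would not break it.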

What do you think of that?
