[DRAFT Proposal] bcc / tools repo evolution #3976

Open
davemarchevsky opened this issue May 4, 2022 · 7 comments

@davemarchevsky
Collaborator

davemarchevsky commented May 4, 2022

Status: Brainstorming, no decisions made. Community input very welcome.

Summarizing a discussion between @yonghong-song, @brendangregg, and me. We made a few observations about the current state of things:

  1. The bcc repo contains bcc the framework, libbpf tools, and bcc tools. Folks who come to the repo to use the tools may think that bcc the framework is a recommended way to write BPF programs in 2022.

  2. bcc and libbpf-tools are maintained by folks who are neither subject matter experts nor power users of most tools. As a result, we don't do a good job of keeping tools up to date with changes in the subsystems they trace. Furthermore, we tend to accept contributions that add functionality without much pushback, which will turn the tools into a messy 'dumping ground' in the long term.

  3. It's nice to have a central repository of tools for a few reasons:

    • discoverability of "tools that dig into X" for folks just beginning to dive into BPF observability
    • easy to find "practical use of BPF feature Y" for folks writing their own programs
    • for core BPF developers, provides a corpus of real-world BPF programs to analyze (identify common patterns, see how proposed changes will affect tools, etc.)

To address (1) and (2) without breaking (3), a proposal:

To keep the nice properties of (3), let's keep the bcc and libbpf tools in the iovisor/bcc repo. To clarify (1), let's move bcc the framework into a separate repo (iovisor/bcc-framework or similar). Let's also make it clear that bcc-framework should not be used for new program development unless the program writer has a good reason, and encourage the program writer to reach out to us if they do have such a reason, so we can improve the libbpf ecosystem.

For (2), let's adopt a toolkit vs toolshed distinction for tools. Tools in the toolkit will be actively maintained, ideally by an opinionated power user of the tool or someone familiar with the kernel bits the tool is tracing. Users of such tools can expect the tool to work on a reasonable variety of kernels and output meaningful data usable for production analysis.

The toolshed, on the other hand, will contain tools which are not actively maintained and thus may not work at all or, if they do work, may not output correct or meaningful data, as kernel implementations of whatever they're tracing may have shifted over the years. These are not to be considered prod-ready, but can make their way into the toolkit if someone finds them useful enough to polish up and maintain.

The toolkit vs toolshed distinction is inspired by Brendan's experience with DTraceToolkit [0].

Thoughts? Comments? Brendan, Yonghong, please feel free to correct any of this if I'm misrepresenting our convo.

[0]: https://www.brendangregg.com/blog/2013-09-05/dtracetoolkit-0xx-mistakes.html . Specifically "Mistake 2. Too Many Scripts"


Edit history:

  • s/toolbox/toolkit to match Brendan's blog post. Add link to Brendan's blog post.
@frisso

frisso commented May 4, 2022

Completely agreed.

@chenhengqi
Collaborator

Does this mean BCC is reaching its EOL?

@davemarchevsky
Collaborator Author

davemarchevsky commented May 4, 2022

> Does this mean BCC is reaching its EOL?

@chenhengqi, maybe best to split my thoughts here:

  • libbpf tools: We should encourage folks to maintain + improve existing tools, convert bcc tools to libbpf, and add net-new tools where it makes sense. These should be the 'gold standard' tools that users new to the BPF observability ecosystem can use to answer questions in production and learn best practices for writing BPF applications.

  • bcc tools: We should discourage the addition of new tools using the bcc framework and push people to focus on libbpf-tools instead. The rare tool that is not yet possible or reasonable to write using libbpf should be considered a feature request for libbpf. 99% of tools don't need to ship LLVM around with them.

  • bcc the framework:

    • We should strongly discourage the use of bcc the framework for new BPF applications. We should have a concise explanation of why libbpf is better.
    • We (me, specifically) should finish the work to use libbpf's loader in lieu of custom loading functionality. This will make it easy to enable new features (e.g. global vars, bpf_loop) with minimal code/maintenance overhead.
    • We should not seek to stop maintaining the bcc framework at this time.

For bcc tools vs libbpf tools, I probably have a blind spot in my logic when it comes to writing the userspace side of BPF applications in Python and Lua. Maybe more official libbpf bindings, as a migration path from the bcc bindings, would satisfy this.

In general, though, I don't think we should consider this an EOL discussion; it is more a clarification of priorities and best practices.

@chenhengqi
Collaborator

OK, that's clear.

Though discouraged, BCC is still very useful for prototyping and PoC.

@yonghong-song
Collaborator

@chenhengqi sorry for the confusion. In short:

  1. The bcc repo (the infrastructure and all existing tools) will continue to be maintained.

  2. New bcc tools will need more scrutiny for real, impactful use cases, as we already have so many tools that they are hard to manage.

  3. New Python tools are discouraged; going forward, libbpf-tools are favored and Python-to-libbpf tool conversions are encouraged.

  4. If some tools are useful but only in limited use cases, a separate directory may be created to avoid polluting the main tools directory, separating the popular, well-maintained tools from the limited-use-case, less-maintained ones. A limited-use-case or less-maintained tool can be promoted to the tools directory once it has shown its usefulness through impactful use cases and good maintenance. A well-maintained tool needs to work properly with new kernel versions, not merely run on them, since some kernel logic may change in newer kernels.

@chenhengqi
Collaborator

Thanks for the clarification. :)

@vladd12

vladd12 commented Feb 9, 2023

Separating the repo is a good idea; it is the logically correct split. May I help?
