New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support BTF-based pretty printing #2651
base: master
Are you sure you want to change the base?
Conversation
add the keyword `printb` which means BTF-based pretty printing. Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Add `struct BTFDump` to save `struct btf_dump` information. In general, one BTF corresponds to one `struct BTFDump`. In addition, a new `MessageType::printb` has been added for printing formatted structure data. Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
`Type::printb` is the type of the printb function. `PrintBTF` is the private information of this type, used to store dump_id, type_id and size information. - dump_id: indicates the BTFDump corresponding to the printed structure - type_id: indicates the id of the structure in BTF - size: indicates the size of the structure Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
The main checks are: - Refer to non-map print, printb also only accepts one parameter - The parameter type received by printb must be a pointer type, and the pointer points to the record type In addition, the maximum value size of the percpu map that needs to be created is also recorded. Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
The resources involved are: - save the parameter of printb - create percpu map Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
- Refer to CreateGetJoinMap, add CreateGetPrintBTFMap to get percpu memory. - Created new async event type `struct PrintBTF`. Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
This seems closely related to #1834. That one also suggest to use the |
I use the current approach for several reasons:
Thanks! |
Good point, I didn't realize that.
These are all valid reasons, thanks! It seems that even the ksnoop tool, which was the original use-case for I'll try to get to review this soon. Thanks for the work! |
return; | ||
|
||
Expression *arg = call.vargs->at(0); | ||
if (arg->type.type == Type::pointer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we directly accept a record here? It would require to update the codegen but would be quite a nice feature. In addition, codegen could end up being even simpler than now as printb
could directly use any data that it receives (e.g. from the dereference operator).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel pointer could be better. Because from the scripting point of view, it will be a little simpler. For example printb($skb->sk) instead of printb(*($skb->sk)). What do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you go with structs, it will allow to use printb
in more use-cases, e.g. @ = *($skb->sk); ... printb(@)
. Also, it'll be easier implementation-wise b/c now you're always doing proberead to access the memory but that's not always the correct way (e.g. BTF-based probes don't need it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you go with structs, it will allow to use printb in more use-cases, e.g. @ = *($skb->sk); ... printb(@).
In fact, it is now supported. `@=$skb->sk; ... printb(@);.
Also, it'll be easier implementation-wise b/c now you're always doing proberead to access the memory but that's not always the correct way (e.g. BTF-based probes don't need it).
I suddenly remembered why I can't dereference directly, because the general kernel structure is relatively large, and if I dereference it directly, it will occupy the eBPF stack space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, it is now supported. `@=$skb->sk; ... printb(@);.
True, but that requires kernel to be able to store pointers into maps which hasn't been around for long.
I suddenly remembered why I can't dereference directly, because the general kernel structure is relatively large, and if I dereference it directly, it will occupy the eBPF stack space.
IIUC, proberead will also read the data onto BPF stack so there's no difference on that front.
Since printb
is printing the structure, it still feels natural to me to pass directly the structure and not a pointer to the call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, it is now supported. `@=$skb->sk; ... printb(@);.
True, but that requires kernel to be able to store pointers into maps which hasn't been around for long.
I suddenly remembered why I can't dereference directly, because the general kernel structure is relatively large, and if I dereference it directly, it will occupy the eBPF stack space.
IIUC, proberead will also read the data onto BPF stack so there's no difference on that front.
We can directly use proberead to read the data into the map. The prototype of proberead is long bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)
, we can set dst as the value pointer of map.
Since
printb
is printing the structure, it still feels natural to me to pass directly the structure and not a pointer to the call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can directly use proberead to read the data into the map. The prototype of proberead is
long bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)
, we can set dst as the value pointer of map.
Fair point, I didn't realize that. Could we do the same for BTF-based probes by directly loading into the map?
Also, I'm wondering if LLVM would be able to optimize away the stack usage even if we had it in the codegen. If not, perhaps it'd make sense to support both variants (pointer and non-pointer)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can directly use proberead to read the data into the map. The prototype of proberead is
long bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)
, we can set dst as the value pointer of map.Fair point, I didn't realize that. Could we do the same for BTF-based probes by directly loading into the map?
I understand it should be possible, we can even use percpu map instead of eBPF stack. That is, store all the data in the percpu map.
Also, I'm wondering if LLVM would be able to optimize away the stack usage even if we had it in the codegen. If not, perhaps it'd make sense to support both variants (pointer and non-pointer)?
Sorry, I don't quite understand the meaning here.:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question was that if we have something like this in codegen:
get pointer to skb
proberead skb to BPF stack
store the data (from BPF stack) into printb map
if LLVM optimizations are able to turn it into
get pointer to skb
proberead skb to printb map
If not, then, we'll have to do the optimization on the level of the printb
call and then, passing a pointer seems a better option implementation-wise. On the other hand, user-wise, it seems more appropriate to pass a structure so I'm thinking if we could allow both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand what you mean, llvm should not have this optimization, because the two of them are completely independent memory spaces.
if (is_final_pass()) | ||
{ | ||
auto ids = bpftrace_.btf_->get_dumpid_typeid(ty->GetName()); | ||
if (ids.first < 0 || ids.second < 0) | ||
{ | ||
LOG(ERROR, call.loc, err_) | ||
<< "Failed to get BTFDump for " << ty->GetName(); | ||
return; | ||
} | ||
// extra space for async event header, see `struct PrintBTF`'s | ||
// definition in async_event_types.h | ||
bpftrace_.max_printb_map_size_ = std::max( | ||
bpftrace_.max_printb_map_size_, ty->GetSize() + 8 + 8); | ||
call.type = CreatePrintBTF(ids.first, ids.second, ty->GetSize()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this should be done in ResourceAnalyser
instead of here. Also, we don't need a new type as printb
doesn't return anything and it cannot be used in expressions, so we should just use the none
type here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will do, thanks.
@@ -72,6 +79,12 @@ class BTF | |||
|
|||
std::pair<int, int> get_btf_id_fd(const std::string& func, | |||
const std::string& mod) const; | |||
// get dump id and type id | |||
std::pair<int, int> get_dumpid_typeid(const std::string& type_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dump id/type id pair is also used in other places outside of this class. Maybe it'd be worth defining a type for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will do, thanks!
Is it necessary to have a new It would feel quite natural to have a specifier for raw struct dump as well as pretty printed dump. |
Using printf is also acceptable. We can convert it into a string through other builtins, such as |
Having new
Good question, wdyt @danobi? Maybe |
The kernel seems to dump most of their pointer-y printf extensions under Perhaps we can try to follow that? A lot of kernel devs use bpftrace so it would probably feel natural for them. |
Now printf uses stack memory, do we need to change printf to use percpu map memory first? Because the stack memory is only 512 bytes, it is definitely not enough. |
@chengshuyi good point. Improving printf() to use percpu map memory would be a nice change. I don't know off the top of my head how much work it'd be, though. I'd rather not block this PR on a big refactor, as Mind taking a look at how much effort a printf() refactor is? If it's an obnoxiously large amount of work, maybe we should merge this first. |
A short example:
bpftrace -e 'kprobe:kfree_skb {printb((struct sk_buff *)arg0)}'
. It will display complete struct sk_buff information, which can improve debugging efficiency. Part of the output is shown below:Checklist
man/adoc/bpftrace.adoc
and if needed indocs/reference_guide.md
CHANGELOG.md