Processes name stop displaying after some time #312

Open
qmonnet opened this issue Jan 18, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@qmonnet
Member

qmonnet commented Jan 18, 2024

I've got a pwru log file that includes information on the processes that created the SKBs. At the beginning of the log, the name and PID of each process are displayed as expected. Starting with line 74415 (record 74414), the name of the process that created the SKB is no longer displayed and shows as <empty>. The PID still shows up as expected.

$ pwru --version
pwru v1.0.4-1-g1e10c197d1c0

$ sudo pwru \
    --output-tuple \
    --output-meta \
    --output-file=/tmp/pwru.log \
    --ready-file=/tmp/pwru.ready \
    'host 10.244.2.78 or port 60154'

pwru.log

@jschwinger233
Member

This is because pwru fetches the PID in kernel space but converts the PID to a process name in user space. If the process that sent the skb terminates before pwru starts the pid->pname conversion, no process can be found to match the PID.

Indeed we can fetch pname in bpf kernel space via task->comm or task->mm->arg_start, but that approach might impact performance.

Another easy fix is to cache pname in pwru user space during an skb's lifetime.

@qmonnet
Member Author

qmonnet commented Jan 22, 2024

> Indeed we can fetch pname in bpf kernel space via task->comm or task->mm->arg_start, but that approach might impact performance.

Probably not worth the trade-off. If this happens only when the process is terminated early, then I'd say it's fine to miss info like the process name, rather than slow down the whole execution.

> Another easy fix is to cache pname in pwru user space during an skb's lifetime.

Sounds like a better option to me, for what it's worth.

brb added the bug label on Feb 14, 2024
@brb
Member

brb commented Feb 14, 2024

> Another easy fix is to cache pname in pwru user space during an skb's lifetime.

Any TL;DR on how it would work?

@jschwinger233
Member

I was thinking of using a user-space map[skb]pname to cache the pname conversion, but that doesn't actually make sense because an skb could be transferred between processes (e.g. from a client process to a server process).

Then I thought we could cache map[pid]pname, but entry eviction is tricky: we should find an appropriate event to delete a pid->pname entry from the map, instead of caching the entry forever. LRU? Timer? Tracing the lifetime of a process?
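
For the LRU option, a rough sketch of a bounded pid->pname cache could look like the following. The type and method names (`pnameCache`, `Put`, `Get`) are made up for illustration and nothing here is taken from pwru; it only uses `container/list` from the Go standard library.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one cached pid->pname pair.
type entry struct {
	pid   int
	pname string
}

// pnameCache is a hypothetical bounded pid->pname cache with LRU eviction.
// capacity is assumed to be at least 1.
type pnameCache struct {
	capacity int
	order    *list.List            // front = most recently used
	items    map[int]*list.Element // pid -> element holding an *entry
}

func newPnameCache(capacity int) *pnameCache {
	return &pnameCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[int]*list.Element),
	}
}

// Put stores or refreshes a name and evicts the least recently used
// entry once the cache grows past its capacity.
func (c *pnameCache) Put(pid int, pname string) {
	if el, ok := c.items[pid]; ok {
		el.Value.(*entry).pname = pname
		c.order.MoveToFront(el)
		return
	}
	c.items[pid] = c.order.PushFront(&entry{pid: pid, pname: pname})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).pid)
	}
}

// Get returns the cached name and marks the entry as recently used.
func (c *pnameCache) Get(pid int) (string, bool) {
	el, ok := c.items[pid]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).pname, true
}

func main() {
	cache := newPnameCache(2)
	cache.Put(100, "curl")
	cache.Put(200, "nginx")
	cache.Get(100)         // touch 100 so that 200 becomes the oldest entry
	cache.Put(300, "sshd") // over capacity: evicts 200
	_, ok := cache.Get(200)
	fmt.Println(ok)
}
```

A fixed capacity avoids having to trace process lifetimes, at the cost that a reused PID can return a stale name until the entry is refreshed or evicted.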

@brb
Member

brb commented Feb 15, 2024

Not sure about others, but TBH for me a process name is of very little use when debugging.

@qmonnet
Member Author

qmonnet commented Feb 15, 2024

> Then I thought we could cache map[pid]pname, but entry eviction is tricky: we should find an appropriate event to delete a pid->pname entry from the map, instead of caching the entry forever. LRU? Timer? Tracing the lifetime of a process?

LRU: don't evict explicitly, and just read from the cache when you cannot get the pname from the regular conversion method?
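
As a sketch of that fallback idea: try the regular lookup first, remember every name it returns, and consult the remembered names only when the lookup fails. The `resolver` type and its `lookup` callback are hypothetical stand-ins, and the plain map here is unbounded for brevity, so a real version would want a size limit.

```go
package main

import "fmt"

// resolver is a hypothetical wrapper around the regular pid->pname lookup.
type resolver struct {
	// lookup stands in for the regular conversion; it reports false
	// when no process matches the PID.
	lookup func(pid int) (string, bool)
	// seen holds the last name successfully resolved for each PID.
	seen map[int]string
}

func newResolver(lookup func(pid int) (string, bool)) *resolver {
	return &resolver{lookup: lookup, seen: make(map[int]string)}
}

// Name prefers the live lookup and falls back to the cache only when
// the live lookup fails.
func (r *resolver) Name(pid int) string {
	if name, ok := r.lookup(pid); ok {
		r.seen[pid] = name // refresh, so a reused PID replaces the old name
		return name
	}
	if name, ok := r.seen[pid]; ok {
		return name
	}
	return "<empty>"
}

func main() {
	// alive simulates the set of running processes.
	alive := map[int]string{4242: "curl"}
	r := newResolver(func(pid int) (string, bool) {
		name, ok := alive[pid]
		return name, ok
	})

	fmt.Println(r.Name(4242)) // resolved by the live lookup
	delete(alive, 4242)       // the process exits
	fmt.Println(r.Name(4242)) // served from the cache
	fmt.Println(r.Name(9999)) // never seen alive, so nothing to fall back on
}
```

This only helps for processes that were resolved at least once while alive; a process that exits before its first lookup would still show up as <empty>.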

> Not sure about others, but TBH for me a process name is of very little use when debugging.

Agreed, it's nice to have, but not a big issue if it's occasionally broken.
