New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
processes: collect PSS memory usage for each process #4122
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Stephen Foulds <stephen.foulds@consult.red>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine except for few nit-picks, but I would suggest parsing also (at least) Private_Dirty
from Smaps file at the same time, as all this functionality just for Pss
value seem a bit waste.
I assume this to (at least) double data parsing overhead (both user-space + kernel side), but Pss
and Private_Dirty
are much more useful data than the earlier provided values, so it's a good compromise.
errno = 0; | ||
endptr = NULL; | ||
tmp = strtoul(fields[1], &endptr, 10); | ||
if ((errno == 0) && endptr != fields[1]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should check endptr
for 'nil' character, as according to strtoul()
manual page:
if *nptr is not '\0' but **endptr is '\0' on return, the entire string is valid
And apparently all implementations might not set errno:
If there were no digits at all, strtoul() stores the original value of nptr in *endptr (and returns 0).
...
The implementation may also set errno to EINVAL in case no conversion was performed (no digits seen, and 0 returned).
// smaps_rollup requires kernel >= 4.14 | ||
struct stat sb; | ||
if (stat("/proc/1/smaps_rollup", &sb) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could add link to doc to the comment:
https://www.kernel.org/doc/Documentation/ABI/testing/procfs-smaps_rollup
And skip dummy variable use by using access()
instead of stat()
.
In case this gets run in setup where init process proc entries are inaccessible, it could check /proc/self/
dir instead.
tmp = strtoul(fields[1], &endptr, 10); | ||
if ((errno == 0) && endptr != fields[1]) { | ||
pss += tmp; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rollup case, loop can be once the value has been parsed.
@@ -1226,6 +1244,55 @@ static int ps_count_fd(int pid) { | |||
return (count >= 1) ? count : 1; | |||
} /* int ps_count_fd (pid) */ | |||
|
|||
static int ps_get_pss(process_entry_t *ps) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#if KERNEL_LINUX
?
@@ -1411,6 +1478,12 @@ static int ps_read_process(long pid, process_entry_t *ps, char *state) { | |||
return 0; | |||
} | |||
|
|||
if (ps_get_pss(ps) != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#if KERNEL_LINUX
?
Alternatively, ps_get_pss()
could be dummy on other OSes:
static int ps_get_pss(process_entry_t *ps) { return 0; }
FILE *fh; | ||
char buffer[1024]; | ||
char filename[64]; | ||
char *fields[4]; | ||
int numfields; | ||
unsigned long pss = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please define variables as late as possible.
https://github.com/collectd/collectd/blob/main/docs/review_comments.md#define-variables-on-first-use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @TeknoVenus, thank you very much for your PR!
I think this looks quite good overall and with @eero-t and my comments addressed this should be good to go.
Thanks and best regards
—octo
@@ -1411,6 +1478,12 @@ static int ps_read_process(long pid, process_entry_t *ps, char *state) { | |||
return 0; | |||
} | |||
|
|||
if (ps_get_pss(ps) != 0) { | |||
/* no pss data */ | |||
ps->vmem_pss = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vmem_pss
is an unsigned long
. Assigning -1 here will cause problems with compilers. I suggest to use zero here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I first thought there could be some tasks with zero PSS, but kernel tasks apparently do not have smaps_rollup
files, and their smaps
files are empty, so zero PSS seems OK indicator for missing data (otherwise I would have suggested ULONG_MAX
).
@@ -843,6 +856,11 @@ static void ps_submit_proc_list(procstat_t *ps) { | |||
vl.values_len = 1; | |||
plugin_dispatch_values(&vl); | |||
|
|||
sstrncpy(vl.type, "ps_pss", sizeof(vl.type)); | |||
vl.values[0].gauge = ps->vmem_pss; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below, vmem_pss
is set to a well-known value on error. Add a check for this well-known value here and set the gauge to NAN
if there was an error:
vl.values[0].gauge = ps->vmem_pss;
if (ps->vmem_pss == 0) {
vl.values[0].gauge = NAN;
}
ChangeLog: Processes: Collect PSS memory usage per process
If available, will use the smaps_rollup file for performance reasons. Otherwise will use the smaps file and sum up all the PSS fields (which is just what the smaps_rollup file is doing internally).