Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CSV Output from pidstat #174

Open
tst-ppenev opened this issue Feb 28, 2018 · 5 comments · May be fixed by #175
Open

Support CSV Output from pidstat #174

tst-ppenev opened this issue Feb 28, 2018 · 5 comments · May be fixed by #175

Comments

@tst-ppenev
Copy link

As a pidstat user, I would like to be able to receive the statistics for each time period as a CSV file. That makes it easier to graph results, and correlate them with data from profiling tools.

Note from issue creator:

I'm not familiar with conventions for contributing to the SysStat project, and I didn't see a developer guide, so this is my best effort to open a ticket and submit a pull request. Advice is appreciated.

tst-ppenev added a commit to tst-ppenev/sysstat that referenced this issue Feb 28, 2018
@tst-ppenev tst-ppenev linked a pull request Feb 28, 2018 that will close this issue
tst-ppenev added a commit to tst-ppenev/sysstat that referenced this issue Feb 28, 2018
tst-ppenev added a commit to tst-ppenev/sysstat that referenced this issue Feb 28, 2018
tst-ppenev added a commit to tst-ppenev/sysstat that referenced this issue Mar 1, 2018
* Bug fix: The 'bytes written', 'bytes read' and 'write bytes cancelled'
  statistics that were output to CSV were calculated improperly, reporting the
  cumulative number of bytes since process start, rather than the number of
  bytes for each sample interval.

* Slight refactor to output CSV stats for children, or tasks, based on the
  `-T` command line option.
@sysstat
Copy link
Owner

sysstat commented Mar 3, 2018

Thank you for your contribution.
First, I have to tell you that I been unable to pull any changes from your development branch:

$ git checkout -b tst-ppenev-#174-pidstat-csv-output master
Switched to a new branch 'tst-ppenev-#174-pidstat-csv-output'

$ git pull https://github.com/tst-ppenev/sysstat.git #174-pidstat-csv-output
From https://github.com/tst-ppenev/sysstat
 * branch            HEAD       -> FETCH_HEAD
Already up-to-date.

I didn't have time to google for an explanation. Anyway I mean that I may have other remarks once I know how to fix the problem. Some remarks below may even don't apply if I didn't understand properly how your patch works.

So here are several things that you should take into account before this feature can get included in sysstat/pidstat. The idea is to keep consistent with how other options from other sysstat commands work:

  • Don’t create a CSV file in addition to the report already displayed on standard output. If the user selects the CSV output format then pidstat will display its statistics only in this format to the standard output. To create a CSV file, the user will need to redirect the output to a file. See how things are done for other sysstat commands that already support a different output format (e.g. iostat and mpstat with their JSON output format).
  • This is not pidstat’s goal to give the user access to the kernel’s counters value. Pidstat calculates metrics for tasks (number of I/O per second, CPU utilization, etc.) and display them. The fact that the output will be a CSV format should not change this. Display the metrics, not the counters value (or the number of clock ticks...)
  • Keep consistent with the option’s name. iostat and mpstat use “-o JSON” to output their data in JSON format. I would like you to use “-o CSV” here in pidstat.
  • I know that CSV actually means Comma-Separated Values, but please use a semi-colon instead of a comma as a delimiter since the comma may be used as the decimal point in certain countries.
  • Please also comply with sysstat’s coding standards. In particular your comments should look like this:

/* This is a comment on a single line */
Or

/*
 * This is a comment written on
 * several lines.
 */

But not:
// This is your comment:

@tst-ppenev
Copy link
Author

Hi, Sebastien.

Thanks for the comments. I can address some of them. However, my team is trying to use pidstat for profiling running executables. Having detailed information like clock tick counts for each thread can be very useful for that purpose. However, if that is not a use case for pidstat, we can fork off the collection of that type of data into a separate tool.

Here's what I have been able to use to check out my feature branch. (I believe your git checkout command is wrong. It creates a new branch called tst-ppenev-#174-pidstat-csv-output master, rather than checking out an existing branch called #174-pidstat-csv-output.)

git clone https://github.com/tst-ppenev/sysstat.git
cd sysstat
git checkout '#174-pidstat-csv-output'

Q: Are you able to comment on the individual code lines of the pull request at https://github.com/sysstat/sysstat/pull/175/files?

@tst-ppenev
Copy link
Author

@sysstat: The reason I added an output file path to the CSV format switch is that I didn't find an easy way to separate the output of pidstat. How do you separate the standard output of pidstat from the standard output of the command you're getting statistics for.

E.g., if you create

/tmp/sample-command.sh:

! /bin/bash

echo "$0:" "The output is standard."
sleep 2
echo "$0:" "The standard is output."

and run:

pidstat 1 -e /tmp/sample-command.sh > sample-command.pidstat-output.txt

You get:

sample-command.pidstat-output.txt:

Linux 4.13.0-36-generic (pav-ThinkPad-P51s) 	03/05/2018 	_x86_64_	(4 CPU)
/tmp/sample-command.sh: The output is standard.

05:56:24 PM   UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
05:56:25 PM  1000     17304    0.00    0.00    0.00    0.00    0.00     3  sample-command.
05:56:26 PM  1000     17304    0.00    0.00    0.00    0.00    0.00     3  sample-command.
/tmp/sample-command.sh: The standard is output.

Average:     1000     17304    0.00    0.00    0.00    0.00    0.00     -  sample-command.

@sysstat
Copy link
Owner

sysstat commented Mar 6, 2018

Yes, that's why pidstat should display either its standard report or its CSV output, not both.
I hope to get some time to test the latest version of your patch at the end of this week. I will then get back in touch with you.

tst-ppenev added a commit to tst-ppenev/sysstat that referenced this issue Mar 14, 2018
@reikred
Copy link

reikred commented Apr 23, 2019

@tst-ppenev @sysstat Hi, it sounds good to have CSV output option for pidstat, but how about getting an option also for binary output so that I can postsprocess and make graphs with sadf?

My version is 11.3.5-1.fc25. UPDATE: I compled 12.1.4, seems no binary output from pidstat there either. UPDATE2: I'll post a separate issue for pidstat binary output.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants