
More information about resource utilization is needed? #111

Open
lee212 opened this issue Feb 4, 2020 · 4 comments

Comments

@lee212
Contributor

lee212 commented Feb 4, 2020

Resource utilization is shown in detail, e.g. per resource slot (index), in the matplotlib figure (PNG file), but the stats file doesn't seem to contain enough information. Values in the stats file are elapsed seconds for particular metrics, e.g. Execution Cmd and Draining, which are important for the TTX calculation. What I am interested in, however, is how many resources, e.g. CPU cores, are busy versus idle at a given time. Right now, I manually divide the core-seconds of Execution Cmd from the stats file by the allocated number of cores to produce a percentage. I believe adding provided and consumed to the stats file would be sufficient to give more information on resource utilization.
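For illustration, the manual calculation could look roughly like the sketch below. The function name, its arguments, and the numbers are hypothetical, and it assumes the percentage is normalized by the total provided core-seconds (allocated cores times runtime); the runtime factor is an assumption on top of what is described above.

```python
def busy_percent(exec_core_seconds, n_cores, runtime_seconds):
    """Percentage of provided core-seconds spent in 'Execution Cmd'.

    All three inputs are hypothetical values read off by hand,
    e.g. from the stats file and the pilot description.
    """
    provided_core_seconds = n_cores * runtime_seconds
    if provided_core_seconds <= 0:
        raise ValueError("n_cores and runtime_seconds must be positive")
    return 100.0 * exec_core_seconds / provided_core_seconds


# Made-up numbers: 4 cores allocated for 500 s, 1400 core-seconds executing.
print(busy_percent(1400.0, 4, 500.0))
```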

@andre-merzky
Member

Hi @lee212 ,

can you have a look at the dictionaries returned in https://github.com/radical-cybertools/radical.analytics/blob/devel/bin/rp_inspect/plot_util.py#L115 (from ra.Session.utilization) - does that help?

Those dicts contain rather fine-grained information about which unit or pilot utilized which core, and for what reason. This is an internal data structure, so it's not well documented, but you may want to dump them with pprint and have a look. If that is what you are looking for, I can add some documentation (the method needs that anyway...).

@lee212
Contributor Author

lee212 commented Feb 4, 2020

Yes, I was looking at the line

return provided, consumed, stats_abs, stats_rel, info
and I thought the two return values provided and consumed, from get_provided_resources and get_consumed_resources, might be good to add to the stats file.

@andre-merzky
Member

The information in them is usually too voluminous to be printed in detail - those basically contain (IIRC) tuples like [core_start, core_end, time_start, time_end] for each and every activity in the pilot, so far too much to present in full. But from those numbers, you should be able to write a script which does the additions for the data you are interested in?
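A minimal sketch of such a script, assuming the records really are flat lists of [core_start, core_end, time_start, time_end] and that the core range is inclusive. Both are assumptions: the real structure returned by ra.Session.utilization may be nested differently and should be checked with pprint first. The function names and sample data are hypothetical.

```python
def core_seconds(records):
    """Sum core-seconds over [core_start, core_end, time_start, time_end] records."""
    total = 0.0
    for core_start, core_end, time_start, time_end in records:
        n_cores = core_end - core_start + 1  # assumption: inclusive core range
        total += n_cores * (time_end - time_start)
    return total


def utilization_percent(provided, consumed):
    """Consumed core-seconds as a percentage of provided core-seconds."""
    total_provided = core_seconds(provided)
    if total_provided == 0:
        return 0.0
    return 100.0 * core_seconds(consumed) / total_provided


# Made-up sample data, not taken from a real session.
provided = [[0, 3, 0.0, 100.0]]                      # 4 cores for 100 s
consumed = [[0, 1, 10.0, 60.0], [2, 3, 20.0, 40.0]]  # 2 cores x 50 s, 2 cores x 20 s

print(core_seconds(provided), core_seconds(consumed))
print(utilization_percent(provided, consumed))
```

The same summation can be restricted to a single metric or a time window by filtering the records before passing them in.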

@mturilli
Contributor

@lee212, should we close this ticket?
