Question: when running likwid-powermeter a.out, does it measure the power consumption of the whole CPU or just of a.out? #617

Open
Sunt-ing opened this issue Mar 25, 2024 · 9 comments


@Sunt-ing

I did not find the answer in the docs or the issue list. Thanks in advance for answering!

@TomTheBear
Member

I updated the likwid-powermeter page in the Wiki. Specifically, I added a section describing what each RAPL domain measures: https://github.com/RRZE-HPC/likwid/wiki/Likwid-Powermeter#common-domains

I also updated the text for this command:

Next you can use likwid-powermeter as a wrapper to output the energy consumed by all domains for the runtime of an application:
likwid-powermeter a.out

Does this answer your question? If not, please provide feedback and I will update the docs further.

@Sunt-ing
Author

Thanks for your reply!

I got the following output, and I guess it means: my application used two CPUs (CPU 0 and CPU 10); during the running, CPU 0 consumed 3364.62 J; CPU 10 consumed 5068.51 Joules.

Is my understanding correct? Thanks for your help!

Machine configuration: Two Intel Xeon Silver 4114 10-core CPUs at 2.20 GHz.

Runtime: 100.139 s
Measure for socket 0 on CPU 0
Domain PKG:
Energy consumed: 3364.62 Joules
Power consumed: 33.5996 Watt
Domain PP0:
Energy consumed: 0 Joules
Power consumed: 0 Watt
Domain DRAM:
Energy consumed: 1378.37 Joules
Power consumed: 13.7646 Watt
Domain PLATFORM:
Energy consumed: 0 Joules
Power consumed: 0 Watt

Measure for socket 1 on CPU 10
Domain PKG:
Energy consumed: 5068.51 Joules
Power consumed: 50.615 Watt
Domain PP0:
Energy consumed: 0 Joules
Power consumed: 0 Watt
Domain DRAM:
Energy consumed: 605.982 Joules
Power consumed: 6.05144 Watt
Domain PLATFORM:
Energy consumed: 0 Joules
Power consumed: 0 Watt

@TomTheBear
Member

Your system has two CPU sockets. Since the energy counters exist only once per socket, one CPU per socket is selected to read them. In your case, that's CPU 0 and CPU 10. The CPU IDs are less important; what matters is socket 0 and socket 1.

my application used two CPUs (CPU 0 and CPU 10); during the running, CPU 0 consumed 3364.62 J; CPU 10 consumed 5068.51 Joules.

My system has two CPUs (socket 0 and socket 1); during the run, all cores of socket 0 consumed 3364.62 J; all cores of socket 1 consumed 5068.51 J. Moreover, the memory DIMMs attached to socket 0 consumed 1378.37 J, and the memory DIMMs attached to socket 1 consumed 605.982 J.

(PKG domain = all cores of the socket, DRAM domain = all memory DIMMs of the socket)
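As a sanity check on the numbers above, the reported power values are simply the energy divided by the runtime. A small Python sketch using only the figures from the posted output; the variable names are illustrative, and adding PKG and DRAM together assumes the two domains do not overlap on this machine:

```python
# Figures copied from the likwid-powermeter output posted in this thread.
runtime_s = 100.139

# (socket, domain) -> energy in Joules; PP0 and PLATFORM reported 0 and are omitted.
energy_j = {
    (0, "PKG"): 3364.62,
    (0, "DRAM"): 1378.37,
    (1, "PKG"): 5068.51,
    (1, "DRAM"): 605.982,
}

# Average power per domain = energy / runtime.
avg_power_w = {key: e / runtime_s for key, e in energy_j.items()}
for (socket, domain), p in avg_power_w.items():
    print(f"socket {socket} {domain}: {p:.2f} W")

# System-wide totals over the measured domains (assumes PKG and DRAM are disjoint).
total_energy_j = sum(energy_j.values())
total_power_w = total_energy_j / runtime_s
print(f"total: {total_energy_j:.2f} J, {total_power_w:.2f} W")
```

These values cover each whole socket for the duration of the run, not just the wrapped application.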

@TomTheBear
Member

I added more text to the likwid-powermeter wiki page. Can you please check whether it explains it now? If not, feedback is appreciated.

@Sunt-ing
Author

Thanks for your answer! Now I know the meaning exactly.

Another question: I now want to do scheduling based on the output information. My scheduler is supposed to be written in Python, so it looks like the best way is to read the output text and parse it manually, right? Thanks for your help!
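For illustration, the manual-parsing approach could look roughly like the following. This is a minimal sketch that assumes the output keeps the line layout shown earlier in this thread; `parse_powermeter` and `SAMPLE` are hypothetical names, not part of LIKWID, and the regular expressions would need updating if the tool's output format changes.

```python
import re

# Shortened copy of the output posted above (PP0/PLATFORM omitted for brevity).
SAMPLE = """\
Runtime: 100.139 s
Measure for socket 0 on CPU 0
Domain PKG:
Energy consumed: 3364.62 Joules
Power consumed: 33.5996 Watt
Domain DRAM:
Energy consumed: 1378.37 Joules
Power consumed: 13.7646 Watt

Measure for socket 1 on CPU 10
Domain PKG:
Energy consumed: 5068.51 Joules
Power consumed: 50.615 Watt
Domain DRAM:
Energy consumed: 605.982 Joules
Power consumed: 6.05144 Watt
"""


def parse_powermeter(text):
    """Parse likwid-powermeter-style text into a nested dict (hypothetical helper)."""
    result = {"runtime_s": None, "sockets": {}}
    socket = domain = None
    for raw in text.splitlines():
        line = raw.strip()
        if m := re.match(r"Runtime:\s*([\d.]+)\s*s", line):
            result["runtime_s"] = float(m.group(1))
        elif m := re.match(r"Measure for socket (\d+) on CPU (\d+)", line):
            # A new socket section starts; remember which CPU read the counters.
            socket = int(m.group(1))
            result["sockets"][socket] = {"cpu": int(m.group(2)), "domains": {}}
            domain = None
        elif m := re.match(r"Domain (\w+):", line):
            domain = m.group(1)
            if socket is not None:
                result["sockets"][socket]["domains"][domain] = {}
        elif m := re.match(r"(Energy|Power) consumed:\s*([\d.eE+-]+)", line):
            if socket is not None and domain is not None:
                key = "energy_j" if m.group(1) == "Energy" else "power_w"
                result["sockets"][socket]["domains"][domain][key] = float(m.group(2))
    return result


parsed = parse_powermeter(SAMPLE)
print(parsed["sockets"][1]["domains"]["PKG"]["energy_j"])
```

In a real scheduler the text would come from running the tool, for example via `subprocess.run(["likwid-powermeter", "./a.out"], capture_output=True, text=True).stdout`. Lines printed by the application itself are ignored by the parser unless they happen to match one of the patterns.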

@TomTheBear
Member

Scheduling is a broad term, so I'm not 100% sure what you mean by it. But generally, if you want to integrate LIKWID into something else, I would use the library directly. All operations done by the command line tools are available in the LIKWID API. And luckily for you, there is a Python interface to this API: https://github.com/RRZE-HPC/pylikwid#energy . It might not provide the latest features, but it should work (it's currently in use on the system I'm typing on). If you have questions or problems with the Python LIKWID API, please open an issue in the pylikwid repository.

@Sunt-ing
Author

My scheduling changes knobs such as the power limit based on the energy consumption of this computer (2 CPUs + 1 GPU). Thanks for your kind help again!

@TomTheBear
Member

For changing knobs, I recommend using the experimental sysfeatures component introduced with 5.3.0. The old/current APIs do not provide knobs for setting a power limit. You have to explicitly enable sysfeatures before the build (BUILD_SYSFEATURES=true in config.mk). Unfortunately, there is no support in pylikwid yet, but creating it wouldn't be much work. Of course, you could also use the corresponding CLI app likwid-sysfeatures, but changes in its output might require updates to your parser.

The sysfeatures component is the new way to get the energy data. Additionally, it provides knobs such as power limits. likwid-powermeter, likwid-setFrequencies and likwid-features will be deprecated in the future because all of their features will be provided by likwid-sysfeatures.

The sysfeatures component is still under development, so if you want to join the effort, let me know.

@Sunt-ing
Author

I will take a look and use them. Thanks for your careful and informative explanation!
