[DRAFT]: add support for some Tegra NVDEC/NVENC/VIC monitoring, generic GPU usage, and EMC usage #1322

Draft: wants to merge 4 commits into master
Conversation

@theofficialgman commented May 10, 2024

I am opening this as a draft as it is not mergeable in its current state.

This PR adds support for some of the Nvidia Tegra system-on-chips that expose NVDEC/NVENC/VIC stats via /sys/kernel/debug interfaces. These are the same stats used by tegrastats. They require that the current user be able to read from that directory (which is not the default), so I am looking for advice on implementing some functionality that would allow MangoHud to read these files without needing to modify the directory's permissions on every reboot.
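
For the permission problem above, one option (a rough sketch, not this PR's code; the debugfs root is the only path taken from the description, everything else is an assumption) is to probe readability once and silently skip these stats when debugfs is locked down:

```cpp
#include <unistd.h>

#include <fstream>
#include <optional>
#include <string>

// Root of the debugfs tree the Tegra NVDEC/NVENC/VIC stats live under.
// The concrete node names beneath it depend on the kernel and are not
// spelled out in this PR description, so treat them as placeholders.
static const char* kTegraDebugfsRoot = "/sys/kernel/debug";

// debugfs is root-only by default; only enable the Tegra engine stats
// when the current user can actually traverse and read the tree.
static bool tegra_debugfs_readable() {
    return access(kTegraDebugfsRoot, R_OK | X_OK) == 0;
}

// Generic helper: read a single integer from a sysfs/debugfs node.
static std::optional<long long> read_node_int(const std::string& path) {
    std::ifstream f(path);
    long long value;
    if (f >> value)
        return value;
    return std::nullopt;
}
```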

This PR also adds generic support for reading the CPU-therm temperature sensor (also used on the Tegra chips and potentially other SoCs).
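
A generic way to pick that sensor up, sketched here under the assumption that it is exposed through the standard Linux thermal-zone interface rather than a Tegra-specific node, is to scan /sys/class/thermal for a zone whose type is CPU-therm:

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>

// Find the thermal zone whose type is "CPU-therm" and return its
// temperature in degrees Celsius (sysfs reports millidegrees).
static std::optional<float> read_cpu_therm_celsius() {
    namespace fs = std::filesystem;
    for (const auto& entry : fs::directory_iterator("/sys/class/thermal")) {
        std::ifstream type_file(entry.path() / "type");
        std::string type;
        if (!std::getline(type_file, type) || type != "CPU-therm")
            continue;
        std::ifstream temp_file(entry.path() / "temp");
        long millidegrees = 0;
        if (temp_file >> millidegrees)
            return millidegrees / 1000.0f;
    }
    return std::nullopt;
}
```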

Finally, this PR also adds support for reading the GPU load from /sys/devices/gpu.0/load and the GPU frequency from the hardware-specific path /sys/devices/gpu.0/devfreq/57000000.gpu/cur_freq (a loop could be added to find the required folder, similar to https://github.com/hakandundar34coding/system-monitoring-center/blob/e0c6246d38af487a378adf486eed61485eea7566/src/Gpu.py#L657-L663).
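
Both reads could look roughly like this (a sketch built on the paths named above, not the exact patch; the 0-1000 scaling of the load node is an assumption worth double-checking per SoC):

```cpp
#include <filesystem>
#include <fstream>
#include <optional>

struct TegraGpuSample {
    float load_percent  = 0.0f;  // from /sys/devices/gpu.0/load
    long  core_clock_hz = 0;     // from the first devfreq entry's cur_freq
};

static std::optional<TegraGpuSample> read_tegra_gpu() {
    namespace fs = std::filesystem;
    TegraGpuSample sample;

    // gpu.0/load typically reports 0-1000, i.e. tenths of a percent
    // (assumption -- verify against tegrastats on the target SoC).
    std::ifstream load_file("/sys/devices/gpu.0/load");
    long raw_load = 0;
    if (!(load_file >> raw_load))
        return std::nullopt;
    sample.load_percent = raw_load / 10.0f;

    // Instead of hard-coding 57000000.gpu, loop over gpu.0/devfreq and
    // take the first entry that yields a readable cur_freq.
    for (const auto& entry : fs::directory_iterator("/sys/devices/gpu.0/devfreq")) {
        std::ifstream freq_file(entry.path() / "cur_freq");
        if (freq_file >> sample.core_clock_hz)
            break;  // exit on first entry, as in the follow-up commit
    }
    return sample;
}
```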

As previously explained, this PR is not mergeable in its current state, and I am looking for feedback on its contents and on where each piece of functionality would best be placed. I can split it into separate PRs based on feedback if desired.

closes #864

added mem clock and controller load (if current permissions permit reading them), gpu clock, gpu load, and cpu temp
gpu_core_clock
gpu_nvdec_clock
gpu_nvenc_clock
gpu_vic_clock
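
Assuming the commits expose these as ordinary MangoHud config parameters (the names above come from the commit message; the fragment below is only illustrative), a user could enable just the engines they care about:

```
# Illustrative MangoHud.conf fragment -- param names from this PR's commits
gpu_core_clock
gpu_nvdec_clock
gpu_vic_clock
# gpu_nvenc_clock left disabled to keep the overlay compact
```
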
Owner

can't these just fit under the gpu_core_clock param?

Author

wdym? like enable all of them if only the gpu_core_clock param is enabled? I thought the granularity would be nice since not all users will want to waste space showing all of them if they only want one.

Owner

I might not be understanding the structure correctly.
Can all 3-4 of these be monitored at the same time?

Author

these are different hardware units. their clocks are different and independent

Owner

So I fully understand, one device will only have one of these?
In that case I don't see a point in the granularity

Author

you do not understand. no

these are different hardware engines on the same SOC. nvdec (nvidia decode engine), nvenc (nvidia encode engine), and vic (nvidia video image compositor) are all present on the same SOC.

example during chromium hardware accelerated video decode and playback:

[Screenshot: MangoHud overlay showing the per-engine stats during Chromium hardware-accelerated video decode and playback]

Owner

Thank you for clarifying. It seems fine to have different params for this.
As for how it's displayed, this seems a bit clunky but I also don't know a better way to do it

also add loop for gpu.0/devfreq and exit on first entry
Successfully merging this pull request may close these issues.

GPU % load is always 0 (nintendo switch, nvidia jetson)