phoronix-test-suit not monitoring GPU Power Consumption in linux 6.5+ #754

Open
ajoathan opened this issue Nov 4, 2023 · 1 comment

ajoathan commented Nov 4, 2023

Recently I upgraded my system and noticed that phoronix stopped reporting the GPU Frequency and GPU Power Consumption for my AMD RX 580.

[alvaro@alvaro-gaming phoronix-test-suite]$ ./phoronix-test-suite system-sensors


Phoronix Test Suite v10.8.4
Supported Sensors For This System

cpu.fan-speed    CPU Fan Speed:                               774     RPM       
cpu.freq         CPU Frequency (CPU0):                        3553.08 Megahertz 
cpu.freq         CPU Frequency (CPU1):                        400.00  Megahertz 
cpu.freq         CPU Frequency (CPU2):                        400.00  Megahertz 
cpu.freq         CPU Frequency (CPU3):                        400.00  Megahertz 
cpu.freq         CPU Frequency (CPU4):                        400.00  Megahertz 
cpu.freq         CPU Frequency (CPU5):                        3632.00 Megahertz 
cpu.freq         CPU Frequency (CPU6):                        400.00  Megahertz 
cpu.freq         CPU Frequency (CPU7):                        3552.72 Megahertz 
cpu.freq         CPU Frequency (CPU8):                        3906.27 Megahertz 
cpu.freq         CPU Frequency (CPU9):                        4198.66 Megahertz 
cpu.freq         CPU Frequency (CPU10):                       3550.58 Megahertz 
cpu.freq         CPU Frequency (CPU11):                       400.00  Megahertz 
cpu.peak-freq    CPU Peak Freq (Highest CPU Core Frequency):  4441    Megahertz 
cpu.temp         CPU Temperature:                             35.75   Celsius   
cpu.usage        CPU Usage (CPU0):                            0.00    Percent   
cpu.usage        CPU Usage (CPU1):                            0.00    Percent   
cpu.usage        CPU Usage (CPU2):                            0.00    Percent   
cpu.usage        CPU Usage (CPU3):                            0.00    Percent   
cpu.usage        CPU Usage (CPU4):                            0.00    Percent   
cpu.usage        CPU Usage (CPU5):                            0.00    Percent   
cpu.usage        CPU Usage (CPU6):                            0.00    Percent   
cpu.usage        CPU Usage (CPU7):                            0.00    Percent   
cpu.usage        CPU Usage (CPU8):                            0.00    Percent   
cpu.usage        CPU Usage (CPU9):                            0.00    Percent   
cpu.usage        CPU Usage (CPU10):                           0.00    Percent   
cpu.usage        CPU Usage (CPU11):                           0.00    Percent   
cpu.usage        CPU Usage (Summary):                         0.00    Percent   
gpu.fan-speed    GPU Fan Speed:                               100     Percent   
gpu.temp         GPU Temperature:                             45.00   Celsius   
hdd.read-speed   Drive Read Speed (sda):                      0.00    MB/s      
hdd.read-speed   Drive Read Speed (sdb):                      0.00    MB/s      
hdd.write-speed  Drive Write Speed (sda):                     0.00    MB/s      
hdd.write-speed  Drive Write Speed (sdb):                     0.00    MB/s      
memory.usage     Memory Usage:                                1519    Megabytes 
swap.usage       Swap Usage:                                  0       Megabytes 
sys.fan-speed    System Fan Speed:                            774     RPM       
sys.iowait       System Iowait:                               0.00    Percent   

Unsupported Sensors For This System

- Ambient Temperature
- Cgroup Cpu Usage
- CPU Power Consumption
- CPU Voltage
- GPU Frequency
- GPU Memory Usage
- GPU Power Consumption
- GPU Usage
- GPU Voltage
- Drive Temperature
- Memory Temperature
- System Power Consumption
- System Temperature
- System Voltage

To test whether the upgrade caused the problem, I rebooted the system with the 6.1 LTS kernel from my distribution, and with it phoronix is back to monitoring these values:

[alvaro@alvaro-gaming phoronix-test-suite]$ ./phoronix-test-suite system-sensors


Phoronix Test Suite v10.8.4
Supported Sensors For This System

cpu.fan-speed     CPU Fan Speed:                               2068    RPM       
cpu.freq          CPU Frequency (CPU0):                        4141.71 Megahertz 
cpu.freq          CPU Frequency (CPU1):                        1400.00 Megahertz 
cpu.freq          CPU Frequency (CPU2):                        1400.00 Megahertz 
cpu.freq          CPU Frequency (CPU3):                        1400.00 Megahertz 
cpu.freq          CPU Frequency (CPU4):                        3544.86 Megahertz 
cpu.freq          CPU Frequency (CPU5):                        3900.00 Megahertz 
cpu.freq          CPU Frequency (CPU6):                        3554.10 Megahertz 
cpu.freq          CPU Frequency (CPU7):                        4284.21 Megahertz 
cpu.freq          CPU Frequency (CPU8):                        3788.11 Megahertz 
cpu.freq          CPU Frequency (CPU9):                        4438.82 Megahertz 
cpu.freq          CPU Frequency (CPU10):                       3663.93 Megahertz 
cpu.freq          CPU Frequency (CPU11):                       3549.66 Megahertz 
cpu.peak-freq     CPU Peak Freq (Highest CPU Core Frequency):  4217    Megahertz 
cpu.temp          CPU Temperature:                             37.00   Celsius   
cpu.usage         CPU Usage (CPU0):                            0.00    Percent   
cpu.usage         CPU Usage (CPU1):                            0.00    Percent   
cpu.usage         CPU Usage (CPU2):                            0.00    Percent   
cpu.usage         CPU Usage (CPU3):                            0.00    Percent   
cpu.usage         CPU Usage (CPU4):                            0.00    Percent   
cpu.usage         CPU Usage (CPU5):                            0.00    Percent   
cpu.usage         CPU Usage (CPU6):                            0.00    Percent   
cpu.usage         CPU Usage (CPU7):                            0.00    Percent   
cpu.usage         CPU Usage (CPU8):                            1.96    Percent   
cpu.usage         CPU Usage (CPU9):                            0.00    Percent   
cpu.usage         CPU Usage (CPU10):                           0.00    Percent   
cpu.usage         CPU Usage (CPU11):                           0.00    Percent   
cpu.usage         CPU Usage (Summary):                         0.00    Percent   
gpu.fan-speed     GPU Fan Speed:                               55.89   Percent   
gpu.freq          GPU Frequency:                               300     Megahertz 
gpu.memory-usage  GPU Memory Usage:                            159     Megabytes 
gpu.power         GPU Power Consumption:                       7.163   Watts     
gpu.temp          GPU Temperature:                             43.00   Celsius   
gpu.usage         GPU Usage:                                   0       Percent   
hdd.read-speed    Drive Read Speed (sda):                      0.00    MB/s      
hdd.read-speed    Drive Read Speed (sdb):                      0.00    MB/s      
hdd.write-speed   Drive Write Speed (sda):                     0.00    MB/s      
hdd.write-speed   Drive Write Speed (sdb):                     0.00    MB/s      
memory.usage      Memory Usage:                                940     Megabytes 
swap.usage        Swap Usage:                                  0       Megabytes 
sys.fan-speed     System Fan Speed:                            2068    RPM       
sys.iowait        System Iowait:                               0.00    Percent   

Unsupported Sensors For This System

- Ambient Temperature
- Cgroup Cpu Usage
- CPU Power Consumption
- CPU Voltage
- GPU Voltage
- Drive Temperature
- Memory Temperature
- System Power Consumption
- System Temperature
- System Voltage

Please note that the sensors for GPU Frequency, GPU Memory Usage, GPU Power Consumption and GPU Usage were listed as unsupported with 6.5.5, and as supported with 6.1.55.

The only difference I see is the index that the GPU gets in DRM. With 6.1.55:

[alvaro@alvaro-gaming ~]$ ls /sys/class/drm
card0  card0-DP-1  card0-DP-2  card0-DP-3  card0-DVI-D-1  card0-HDMI-A-1  renderD128  version

And with 6.5.5:

[alvaro@alvaro-gaming phoronix-test-suite]$ ls /sys/class/drm
card1  card1-DP-1  card1-DP-2  card1-DP-3  card1-DVI-D-1  card1-HDMI-A-1  renderD128  version

But I don't know how these are defined or how to force them in order to test. Also, I don't think this should influence the sensor readings.

I also didn't try to run this on other kernels, so I don't know when it started. I just tried what I had easily available from my distro.
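
One way to check whether the card index matters is to ask sysfs which driver is bound to each `cardN` entry, instead of assuming the GPU is always `card0`. The following is only a minimal sketch: the function names `list_drm_cards` and `find_card_by_driver` and the `drm_root` parameter are made up for illustration, and nothing here is taken from phoronix-test-suite's own code. It relies on the `device/driver` symlink that sysfs normally provides under each DRM card directory.

```python
import os
import re

# Matches "card0", "card1", ...; skips connectors such as "card1-DP-1"
# and other entries such as "renderD128" or "version".
CARD_RE = re.compile(r"^card(\d+)$")


def list_drm_cards(drm_root="/sys/class/drm"):
    """Return {card_name: driver_name or None} for every card under drm_root."""
    cards = {}
    for entry in sorted(os.listdir(drm_root)):
        if not CARD_RE.match(entry):
            continue
        driver_link = os.path.join(drm_root, entry, "device", "driver")
        if os.path.exists(driver_link):
            # "driver" is a symlink to the bound driver's directory;
            # the last path component is the driver name.
            cards[entry] = os.path.basename(os.path.realpath(driver_link))
        else:
            cards[entry] = None
    return cards


def find_card_by_driver(driver, drm_root="/sys/class/drm"):
    """Return the first card bound to `driver` (e.g. "amdgpu"), or None."""
    for card, bound in list_drm_cards(drm_root).items():
        if bound == driver:
            return card
    return None


if __name__ == "__main__":
    if os.path.isdir("/sys/class/drm"):
        for card, bound in list_drm_cards().items():
            print(card, bound)
```

Running this on both kernels should show whether another driver has claimed `card0` on 6.5.5, which would explain why the amdgpu device moved to `card1`.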

Some extra info:

[alvaro@alvaro-gaming phoronix-test-suite]$ ./phoronix-test-suite system-info


Phoronix Test Suite v10.8.4
System Information

egrep: warning: egrep is obsolescent; using grep -E

  PROCESSOR:              AMD Ryzen 5 5600G @ 4.46GHz
    Core Count:           6                                                   
    Thread Count:         12                                                  
    Extensions:           SSE 4.2 + AVX2 + AVX + RDRAND + FSGSBASE            
    Cache Size:           0.5 MB                                              
    Microcode:            0xa50000d                                           
    Core Family:          Zen 3                                               
    Scaling Driver:       amd-pstate-epp powersave (EPP: balance_performance) 

  GRAPHICS:               Gigabyte AMD Radeon RX 580 4GB
    BAR1 / Visible vRAM:  256 MB                                            
    OpenGL:               4.6 Mesa 23.1.9-manjaro1.1 (LLVM 16.0.6 DRM 3.54) 
    Monitor:              LG ULTRAGEAR                                      
    Screen:               1920x1080                                         

  MOTHERBOARD:            ASUS PRIME B450M-GAMING/BR
    BIOS Version:         4203                      
    Chipset:              AMD Renoir/Cezanne        
    Audio:                AMD Ellesmere HDMI Audio  
    Network:              Realtek RTL8111/8168/8411 

  MEMORY:                 16GB

  DISK:                   240GB SanDisk SDSSDA24 + 256GB HS-SSD-E100N 256
    File-System:          ext4             
    Mount Options:        noatime rw       
    Disk Scheduler:       MQ-DEADLINE      
    Disk Details:         Block Size: 4096 

  OPERATING SYSTEM:       ManjaroLinux 23.0.4
    Kernel:               6.5.5-1-MANJARO (x86_64)                                                                                                
    Desktop:              Xfce 4.18                                                                                                               
    Display Server:       X Server 1.21.1.8                                                                                                       
    Compiler:             GCC 13.2.1 20230801 + Clang 16.0.6 + LLVM 16.0.6                                                                        
    Security:             gather_data_sampling: Not affected                                                                                      
                          + itlb_multihit: Not affected                                                                                           
                          + l1tf: Not affected                                                                                                    
                          + mds: Not affected                                                                                                     
                          + meltdown: Not affected                                                                                                
                          + mmio_stale_data: Not affected                                                                                         
                          + retbleed: Not affected                                                                                                
                          + spec_rstack_overflow: Mitigation of safe RET no microcode                                                             
                          + spec_store_bypass: Mitigation of SSB disabled via prctl                                                               
                          + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization                                    
                          + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected 
                          + srbds: Not affected                                                                                                   
                          + tsx_async_abort: Not affected                                                                                         

The system info for kernel 6.1.55 is almost identical, but it shows one extra line with the graphics frequency. I can post it here if anyone thinks it is relevant, along with any other data you may need to figure this out.

Any help is greatly appreciated.

Thanks in advance.

@ajoathan

After some digging in dmesg today, I discovered that part of the problem was the SimpleDRM driver, which is enabled by default in my distribution's kernel. It then gets DRM index 0 (minor 0, as shown in dmesg), and amdgpu is assigned index 1 (minor 1).

After some searching on the internet, I discovered the boot parameter initcall_blacklist=simpledrm_platform_driver_init, which can be used to inhibit SimpleDRM at boot time. With it, most sensors are back on my system, except GPU power:

[alvaro@alvaro-gaming phoronix-test-suite]$ ./phoronix-test-suite system-sensors


Phoronix Test Suite v10.8.4
Supported Sensors For This System

cpu.fan-speed     CPU Fan Speed:                               2057    RPM       
cpu.freq          CPU Frequency (CPU0):                        3546.34 Megahertz 
cpu.freq          CPU Frequency (CPU1):                        400.00  Megahertz 
cpu.freq          CPU Frequency (CPU2):                        400.00  Megahertz 
cpu.freq          CPU Frequency (CPU3):                        3553.66 Megahertz 
cpu.freq          CPU Frequency (CPU4):                        3857.20 Megahertz 
cpu.freq          CPU Frequency (CPU5):                        4384.18 Megahertz 
cpu.freq          CPU Frequency (CPU6):                        400.00  Megahertz 
cpu.freq          CPU Frequency (CPU7):                        4203.22 Megahertz 
cpu.freq          CPU Frequency (CPU8):                        4239.17 Megahertz 
cpu.freq          CPU Frequency (CPU9):                        400.00  Megahertz 
cpu.freq          CPU Frequency (CPU10):                       400.00  Megahertz 
cpu.freq          CPU Frequency (CPU11):                       400.00  Megahertz 
cpu.peak-freq     CPU Peak Freq (Highest CPU Core Frequency):  4246    Megahertz 
cpu.temp          CPU Temperature:                             49.38   Celsius   
cpu.usage         CPU Usage (CPU0):                            0.00    Percent   
cpu.usage         CPU Usage (CPU1):                            0.00    Percent   
cpu.usage         CPU Usage (CPU2):                            0.00    Percent   
cpu.usage         CPU Usage (CPU3):                            0.00    Percent   
cpu.usage         CPU Usage (CPU4):                            0.00    Percent   
cpu.usage         CPU Usage (CPU5):                            0.00    Percent   
cpu.usage         CPU Usage (CPU6):                            0.00    Percent   
cpu.usage         CPU Usage (CPU7):                            0.00    Percent   
cpu.usage         CPU Usage (CPU8):                            0.00    Percent   
cpu.usage         CPU Usage (CPU9):                            0.00    Percent   
cpu.usage         CPU Usage (CPU10):                           0.00    Percent   
cpu.usage         CPU Usage (CPU11):                           0.00    Percent   
cpu.usage         CPU Usage (Summary):                         0.00    Percent   
gpu.fan-speed     GPU Fan Speed:                               55.59   Percent   
gpu.freq          GPU Frequency:                               300     Megahertz 
gpu.memory-usage  GPU Memory Usage:                            124     Megabytes 
gpu.temp          GPU Temperature:                             39.00   Celsius   
gpu.usage         GPU Usage:                                   0       Percent   
hdd.read-speed    Drive Read Speed (sda):                      0.00    MB/s      
hdd.read-speed    Drive Read Speed (sdb):                      0.00    MB/s      
hdd.write-speed   Drive Write Speed (sda):                     0.00    MB/s      
hdd.write-speed   Drive Write Speed (sdb):                     0.00    MB/s      
memory.usage      Memory Usage:                                935     Megabytes 
swap.usage        Swap Usage:                                  0       Megabytes 
sys.fan-speed     System Fan Speed:                            2057    RPM       
sys.iowait        System Iowait:                               0.00    Percent   

Unsupported Sensors For This System

- Ambient Temperature
- Cgroup Cpu Usage
- CPU Power Consumption
- CPU Voltage
- GPU Power Consumption
- GPU Voltage
- Drive Temperature
- Memory Temperature
- System Power Consumption
- System Temperature
- System Voltage
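
To confirm that the `initcall_blacklist=simpledrm_platform_driver_init` parameter mentioned above actually reached the running kernel, the command line in `/proc/cmdline` can be inspected. This is a rough sketch; the helper names `blacklisted_initcalls` and `simpledrm_is_blacklisted` are hypothetical. It assumes only that `initcall_blacklist` takes a comma-separated list of initcall function names, as the kernel parameter documentation describes.

```python
import os


def blacklisted_initcalls(cmdline):
    """Return the function names from every initcall_blacklist= argument."""
    names = []
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if key == "initcall_blacklist" and sep:
            # The value is a comma-separated list of initcall function names.
            names.extend(name for name in value.split(",") if name)
    return names


def simpledrm_is_blacklisted(cmdline):
    """True if the SimpleDRM platform driver init function is blacklisted."""
    return "simpledrm_platform_driver_init" in blacklisted_initcalls(cmdline)


if __name__ == "__main__":
    if os.path.exists("/proc/cmdline"):
        with open("/proc/cmdline") as f:
            print(simpledrm_is_blacklisted(f.read()))
```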

The only one I need that is still not available is "GPU Power Consumption". But I will keep digging.
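
For the remaining GPU power reading, a quick check independent of phoronix-test-suite is to look at the hwmon directory under the card's device node. The sketch below tries the two standard hwmon power attributes, `power1_average` and `power1_input`, both of which report microwatts. The function name `read_gpu_power_watts` is hypothetical, and which of the two files the amdgpu driver exposes on a given kernel is an assumption to verify on the affected system; this thread does not establish it.

```python
import glob
import os


def read_gpu_power_watts(card, drm_root="/sys/class/drm"):
    """Return the power draw in watts for `card` (e.g. "card1"), or None."""
    pattern = os.path.join(drm_root, card, "device", "hwmon", "hwmon*")
    for hwmon in sorted(glob.glob(pattern)):
        # Try the averaged value first, then the instantaneous one.
        for attr in ("power1_average", "power1_input"):
            path = os.path.join(hwmon, attr)
            try:
                with open(path) as f:
                    microwatts = int(f.read().strip())
            except (OSError, ValueError):
                continue  # attribute missing, unreadable, or not a number
            return microwatts / 1_000_000
    return None


if __name__ == "__main__":
    print(read_gpu_power_watts("card0"))
```

If this returns a value on 6.5.5 only through one of the two attributes, that would suggest the tool is looking for a file name the newer kernel no longer provides for this GPU.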

@ajoathan ajoathan changed the title phoronix-test-suit not monitoring GPU Frequency in linux 6.5.5 phoronix-test-suit not monitoring GPU Power Consumption in linux 6.5.5 Nov 11, 2023
@ajoathan ajoathan changed the title phoronix-test-suit not monitoring GPU Power Consumption in linux 6.5.5 phoronix-test-suit not monitoring GPU Power Consumption in linux 6.5+ Nov 11, 2023