Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dxorg.cpp:getValueFromSysFsFile() truncates file content containing \0 chars – leading to segfault in dXorg::parseOcTable() with vega20 #279

Open
emerge-e-world opened this issue Nov 24, 2021 · 6 comments

Comments

@emerge-e-world
Copy link

After a recent system upgrade radeon-profile segfaults on startup. I don't know if the output of amdgpu in sysfs changed or if this is caused by a change of behavior of QString after a Qt5 upgrade.

I think I tracked down the cause of the issue though:

getValueFromSysFsFile() in dxorg.cpp truncates sysfs file content after the first \0 character. This happens when the QByteArray returned by QFile::readAll() is converted to a QString. As it so happens, my pp_od_clk_voltage separates the different sections of the table with a bunch of null characters.
In my case (with a Radeon VII / vega20) this leads to dXorg::parseOcTable() only reading the first section (OD_SCLK). getValueFromSysFsFile() returns:

OD_SCLK:
0:        808Mhz
1:       1801Mhz

while the actual content of /sys/class/drm/card0/device/pp_od_clk_voltage is:

OD_SCLK:
0:        808Mhz
1:       1801Mhz
OD_MCLK:
1:       1000Mhz
OD_VDDC_CURVE:
0:        808Mhz        727mV
1:       1304Mhz        798mV
2:       1801Mhz       1076mV
OD_RANGE:
SCLK:     808Mhz       2200Mhz
MCLK:     800Mhz       1200Mhz
VDDC_CURVE_SCLK[0]:     808Mhz       2200Mhz
VDDC_CURVE_VOLT[0]:     738mV        1218mV
VDDC_CURVE_SCLK[1]:     808Mhz       2200Mhz
VDDC_CURVE_VOLT[1]:     738mV        1218mV
VDDC_CURVE_SCLK[2]:     808Mhz       2200Mhz
VDDC_CURVE_VOLT[2]:     738mV        1218mV

This then leads to the bool variable vega20Mode in dXorg::parseOcTable() be set to false, as there is no OD_VDDC_CURVE section present. Inside the main for loop of that function the wrong branch is taken, resulting in an out of range exception when trying to read the third column of the table (while OD_SCLK only contains two) in the line:
831 fvt.insert(state[0].toUInt(), FreqVoltPair(state[1].toUInt(), state[2].toUInt()));
(reading state[2] leads to the segfault).

The fix for me was to handle null characters in sysfs files in getValueFromSysFsFile() by removing all '\0' chars before converting the QByteArray returned from QFile::readAll() to a QString. Here is a simple patch:

--- a/radeon-profile/dxorg.cpp
+++ b/radeon-profile/dxorg.cpp
@@ -61,7 +61,7 @@ QString getValueFromSysFsFile(QString fileName) {
     QString value("-1");
 
     if (f.open(QIODevice::ReadOnly))
-        value = QString(f.readAll()).trimmed();
+        value = QString(f.readAll().replace('\0',"")).trimmed();
 
     f.close();
     return value;
@whit-colm
Copy link

I was having an identical issue and this fixed it completely! Absolutely brilliant! I know the devs aren't really maintaining this much anymore but maybe try making a PR with this?

@emerge-e-world
Copy link
Author

Glad that it was of help to you as well! I created a pull request for the fix.

@dadaas
Copy link

dadaas commented Dec 13, 2021

Ok how do we fix this in our systems? How can we edit dxorg.cpp? I cant find it, where is the location of it? I understand the fix, but cant find dxorg.cpp

I found it, git cloned, edited dxorg.cpp file, I use qmake and make to build it, I build demon also. And I have same problem as before. Radeon-Profile is not reading my manual clocks...

@emerge-e-world
Copy link
Author

Glad you found everything. You can now also directly clone from my forked repository (git clone https://github.com/emerge-e-world/radeon-profile), then you don't have to manually patch the file. I am working on a couple more bug fixes / improvement in there.
What exactly do you mean not reading your manual clocks? Is it still crashing on startup while trying? Or are the clock speeds just wrong?

@MasterCATZ
Copy link

MasterCATZ commented Sep 27, 2022

THANK YOU !!! Works with my 6800 XT

however the overclock tab seems to have issues
running as normal user I have state's running as root user it does not collect any states

and when trying to change memory speed it adds an insane voltage and does not accept any voltage inputs
Screenshot from 2022-09-28 09-35-00
Screenshot from 2022-09-28 09-33-58
Screenshot from 2022-09-28 09-33-34

@Oxalin
Copy link
Contributor

Oxalin commented Jan 27, 2023

I merged the patches from emerge-e-world in my own fork and I'm building toward a new release from there. If you want to test Radeon Profile and Radeon Profile daemon (you need both up to date), have a go over here:

Radeon Profile: https://github.com/Oxalin/radeon-profile
Radeon Profile daemon: https://github.com/Oxalin/radeon-profile-daemon

That being said, your problem with your 6800XT with the overclock tab is probably not fixed yet and I'd like to work with you to fix it, since I don't have access to your card (it seems related to NAVI 2X and above).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants