Update broke sensor detection. #2465

Closed
iDumle opened this issue Apr 19, 2024 · 88 comments

Comments

@iDumle

iDumle commented Apr 19, 2024

Describe the bug

Update V187 broke sensor detection for GPU Core for my AMD Radeon 6650 XT graphics card and it's now showing an incorrect/frozen value.
Screenshot fancontrol setup2

On version 186 the setup was working fine, and the software had no problem detecting the GPU Core sensor. I'm now using GPU Hot Spot instead, as that seems to be updating its values.

Right after the update the value was stuck at 468 °C. I then restarted twice, and the error persisted after reboot, though GPU Core showed a value of 0 °C and subsequently 8 °C.

I have tried Refresh sensors detection, but it didn't change anything; it's still stuck on the same value.

Is there a log.txt file next to FanControl.exe with recent date entries?
The error log shows no recent entry, though there is one from back in January. I suspect it's unrelated, but just in case I'll post it.
log.txt

Relevant hardware specs and setup
The graphics card is an ASUS Radeon RX 6650 XT DUAL OC - 8GB, and I am running Windows 10 with the newest graphics drivers, 24.3.1.

And here is a picture of the whole Fan Control Setup.
Screenshot fancontrol setup3

@BMKoot

BMKoot commented Apr 19, 2024

With my 7900XT, after update (187) for gpu sensors it only shows the VRSoC temp, where before it showed core and hot spot. Reverted (to 186) and it was fixed.

Windows 11, Powercolor Hellhound 7900XT, 24.3.1 drivers.

@seanbirkhead

I lost all my custom fan names and paired sensors with update 187.
I restored 186 from a backup and all was fine again.
I think update 187 needs to be looked at.

@iDumle
Author

iDumle commented Apr 19, 2024

After digging around a bit, I saw your repo for LibreHardwareMonitor had a recent commit 2 days ago and you added that update to v187. I suspect something went wrong in the d22cd1d commit.
So with that I tried swapping LibreHardwareMonitorLib.dll from v186 into my v187 install, and that restored the sensor functionality.

@Llamalland

Similar issue, RX 6800, gpu core and liquid showing temps >3000°, gpu hot spot showing a sensible value. Happy to do more to diagnose/whatever, but for now I've just changed the sensor used to the hot spot.
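The manual workaround described here (switching from GPU Core to GPU Hot Spot when the former reports an impossible value) can be sketched in a few lines. This is only an illustration, not how FanControl or LibreHardwareMonitor behave: the function names and the 0 to 150 °C plausibility bounds are assumptions made up for the example, and a frozen but plausible value (such as the 8 °C reported earlier) would not be caught by a range check like this.

```python
# Assumed plausibility bounds in degrees Celsius; not taken from FanControl or LHM.
PLAUSIBLE_RANGE_C = (0.0, 150.0)


def is_plausible(temp_c):
    """Return True when a temperature reading falls inside the assumed bounds."""
    low, high = PLAUSIBLE_RANGE_C
    return low < temp_c <= high


def pick_temperature(readings, preferred):
    """Return (sensor_name, value) for the first preferred sensor with a plausible value.

    readings:  dict mapping sensor names to the latest reading in degrees Celsius
    preferred: sensor names in order of preference
    Returns None when no listed sensor has a plausible reading.
    """
    for name in preferred:
        value = readings.get(name)
        if value is not None and is_plausible(value):
            return name, value
    return None
```

With readings such as `{"GPU Core": 3000.0, "GPU Hot Spot": 61.0}` and a preference order of Core then Hot Spot, the sketch falls through to the Hot Spot sensor, which mirrors what several people in this thread did by hand.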

@Rem0o
Owner

Rem0o commented Apr 19, 2024

Yeah there is a new method implemented to get sensor values, but there might be a bug in there.
pmlog.zip

I compiled this: https://github.com/GPUOpen-LibrariesAndSDKs/display-library/blob/master/Sample/PMLog/PMLog.cpp

If anyone here is willing to run it and show the output, might be of some help.

@epinter

epinter commented Apr 19, 2024

I've compiled a test build of LibreHardwareMonitor with a possible fix for the issue. If users with RX 6000 cards can test it, that would be useful (LibreHardwareMonitorLib.dll is included in the zip). The temps should be shown correctly with this build. A test copying the new LibreHardwareMonitorLib to FanControl would be useful too.

LibreHardwareMonitor-pmlogtest-b-20240419.zip

@iDumle
Author

iDumle commented Apr 19, 2024

Yeah there is a new method implemented to get sensor values, but there might be a bug in there. pmlog.zip

I compiled this: https://github.com/GPUOpen-LibrariesAndSDKs/display-library/blob/master/Sample/PMLog/PMLog.cpp

If anyone here is willing to run it and show the output, might be of some help.

@Rem0o
I ran the pmlog file and no output was given to me. A terminal appeared for maybe 0.2 seconds, but I was unable to catch whether anything was written in it, so I'm not sure how I'm supposed to get an output for you.

I've compiled a test build of LibreHardwareMonitor with a possible fix for the issue. If the users with RX 6000 cards can test, will be useful (LibreHardwareMonitorLib.dll included in zip). The temps should be shown correctly with this build. A test copying the new LibreHardwareMonitorLib to fancontrol would be useful too.

LibreHardwareMonitor-pmlogtest-b-20240419.zip

@epinter
I ran LibreHardwareMonitor and all the sensors seem to be working correctly. I opened a game and the temp in LHM corresponded to the temp shown by the in-game monitor.
LibHwMonOutput

I swapped the LibreHardwareMonitorLib.dll file from LHM to Fan Control v187 folder and ran the software again, and it seems to be working fine now.
FanControlSensorMenu

@epinter

epinter commented Apr 19, 2024

@iDumle
The pmlog sample @Rem0o sent is an official sample from AMD that uses that feature pmlog to collect sensors. To run it, you can use these parameters: pmlog.exe s 0 1000 1000.
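For anyone whose terminal window closes before the output can be read, one option is to start the tool from a script and save whatever it prints. The sketch below is a generic helper, not part of the AMD sample: `run_and_capture` and the log file name are made up for illustration, and the `s 0 1000 1000` arguments are simply the ones quoted above, whose meaning is not documented in this thread. The call only returns once the program exits.

```python
import subprocess
import sys
from pathlib import Path


def run_and_capture(command, log_path):
    """Run a console program, save everything it prints, and return the text."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge error output into the same log
        text=True,
        check=False,  # keep the output even if the tool exits with an error code
    )
    Path(log_path).write_text(result.stdout, encoding="utf-8")
    return result.stdout


# Example call (assumes pmlog.exe is in the current folder):
# print(run_and_capture(["pmlog.exe", "s", "0", "1000", "1000"], "pmlog_output.txt"))
```

Opening a Command Prompt first and running `pmlog.exe s 0 1000 1000` from there achieves the same thing without any script.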

Anyway, if the test build shows the values correctly, then some RX 6000 series boards don't work with the pmlog feature. I will send a PR to LibreHardwareMonitor that excludes these boards so they use the old method. It's weird that pmlog worked for me with an RX 6750 XT.

Thanks for the test and feedback!

@iDumle
Author

iDumle commented Apr 19, 2024

I ran pmlog again with the parameters you gave me and was able to see the output.
PMlogOutput

I hope it all helps, and you're welcome. I am glad I can be helpful. It's fantastic software you guys are building, so I'm just happy to be of aid.

@iDumle
Author

iDumle commented Apr 19, 2024

I noticed that pmlog is detecting an edge temp sensor so I ran it in the background while running overwatch and pmlog is updating the temp correctly and it matches with the temp sensor in overwatch.
PmlogEdgeTempWorking

This should be the same sensor that the v187 LibreHardwareMonitorLib.dll was trying to monitor with the new method, right?
Not sure what's going on, but I thought I'd mention it.

@epinter

epinter commented Apr 19, 2024

I noticed that pmlog is detecting an edge temp sensor so I ran it in the background while running overwatch and pmlog is updating the temp correctly and it matches with the temp sensor in overwatch. PmlogEdgeTempWorking

This should be the same sensor that v187 LibreHardwareMonitorLib.dll was trying to monitor with the new method right? Not sure what's going on but I though i'd mention it.

That's weird. Something is breaking the pmlog inside the librehardwaremonitor and not in this sample. I think it's better to make RX 6000 to use old method to collect sensors.

@iDumle
Author

iDumle commented Apr 19, 2024

That's weird. Something is breaking the pmlog inside the librehardwaremonitor and not in this sample. I think it's better to make RX 6000 to use old method to collect sensors.

The saying "If it ain't broken, don't fix it" comes to mind :) that said if you at some point decide to give the new method a once over again I'd be happy to test it out.

@axel-lebourhis

I'm having the same kind of issue with my 7800 XT: sensor values are corrupted, so FanControl loses its mind and ramps the fans all the way up. Went back to 186 and no more issue.

@epinter

epinter commented Apr 21, 2024

I'm having the same kind of issue with my 7800 XT, sensors value are corrupted, so fancontrol looses its mind and ramps the fans all the way up. Went back to 186 and no more issue.

The 7800 XT doesn't have problems with the LibreHardwareMonitor build that FanControl 187 is based on; actually, the 7800 sensors were fixed by that build. Try the build I posted above.

@epinter

epinter commented Apr 21, 2024

@iDumle
Can you test this LHM build ? It's unchanged, using the new method to get the sensors. Just to be sure, check if all the sensors are stable and match amd adrenalin, or if you get something weird.

LibreHardwareMonitor-net472-nightly-4d6a755c.zip

@iDumle
Author

iDumle commented Apr 21, 2024

@epinter
So I just ran LHM in the background while running Overwatch, and as far as I can see the sensors are working fine and line up with the ones in Overwatch and AMD Adrenalin.
LibHwMonOutput187base

So something seems to get misaligned from LHM to Fan Control.

@epinter

epinter commented Apr 21, 2024

@epinter So I just ran LHM in the background while running overwatch, and as far as I can see the sensors are working fine and lines up with the one in overwatch and amd adrenalin. LibHwMonOutput187base

So something seems to gets missaligned from LHM to Fan Control.

Thanks for the test! So LHM is working.

@epinter

epinter commented Apr 21, 2024

@Rem0o LHM is working with 6650 too, like in my test with 6750. Any idea what can cause the problem ? Maybe an issue with this ?

@iDumle
Author

iDumle commented Apr 21, 2024

Okay, I probably should have done this from the start (feeling a little dumb I didn't do this to begin with). Given the new test I did, I tried a complete redownload of v187 and imported my old config, and it seems to be working as it should, so now I'm thinking something went wrong with auto-updating from v186 to v187 rather than v187 not working.
CleanInstallOf187

@Rem0o
Owner

Rem0o commented Apr 21, 2024

The downloadable LHM is not the most up to date version. I'm using the latest commit since I'm compiling it myself.
@iDumle

@epinter

epinter commented Apr 21, 2024

@Rem0o The build I told @iDumle to download (LibreHardwareMonitor-net472-nightly-4d6a755c.zip) is using the latest commit. For reference, can be downloaded here.

@iDumle
Author

iDumle commented Apr 21, 2024

I'll try and make a detailed timeline of my use of the software, maybe it can shine some light on something.

  • I first downloaded FanControl after seeing JayzTwoCents feature it; this was done on 14/02-2023 03:58 and was version v146.
  • I unpacked it to its own folder inside my download folder, and it has not been moved afterwards.
  • I have since not interacted with the GitHub repo up until now, and all subsequent updates have happened inside the software itself when prompted.
  • All updates have been without a problem until the update from v186 to v187. Immediately after the update I noted the odd behaviour, did the restarts, documented the behaviour and created this issue.
  • Everything else is laid out in this post.

What the actual root cause of the odd behaviour I experienced is, I can only speculate on. It could be a Windows thing, or maybe related to the fact that my first version was v146 and all updates were done in the software and never redownloaded from the repo, or it might be that I was just unlucky.

What I do know is that I have done a new install from a newly downloaded zip of v187 from the GitHub repo, unpacked it to its own folder on my desktop and loaded up my old config file, and it's working fine.

I don't know if this is helpful at all but now you have it.

@iDumle
Author

iDumle commented Apr 22, 2024

Something is still wonky on the v187 release for me. I just had a cold boot with the fresh install that I thought was working, and now the temps are frozen and incorrect.
SomethingIsStillWonkyOn187
Edit
@epinter
After spotting this behaviour in FanControl I ran the LHM(LibreHardwareMonitor-net472-nightly-4d6a755c.zip) you gave me, and It's broken.
SomethingIsStillWonkyOn187LHM

@axel-lebourhis

I'm having the same kind of issue with my 7800 XT, sensors value are corrupted, so fancontrol looses its mind and ramps the fans all the way up. Went back to 186 and no more issue.

7800 XT doesn't have problems with the build LibreHardwareMonitor fancontrol 187 is based, actually the sensors of 7800 was fixed by that build. Try the build I posted above.

I tested the build you posted, here is the result:
image

In FanControl:
image

@epinter

epinter commented Apr 22, 2024

I don't understand... I'm running on 7800 xt for 2 weeks, no problems. I will look the code again, but I don't know why there's invalid data with some cards. What I see in common on @iDumle and @axel-lebourhis screenshots is the "GPU Memory" temperature added. The only temperature sensors on these cards are "GPU Core" and "GPU Hot Spot", as far as I know.

rx6750xt-4d6a755c

rx7800xt-4d6a755c

@epinter

epinter commented Apr 23, 2024

@axel-lebourhis When you have time, can you try to run and tell me what happens ? I'm trying to isolate some code path.

LibreHardwareMonitor-net472-test1-nooverdrive-20240422.zip

@epinter

epinter commented Apr 23, 2024

@iDumle
Author

iDumle commented Apr 23, 2024

@epinter I tested the LHM and I have two results for you.

This one I did right after downloading your LHM, and it's clearly broken, the computer had been running for about 2 hours.
LHM-test2-getsensor_factor-20240423

I then did a reboot with LHM set to start with Windows, and I got a different result; I ran Overwatch and it responded fine.
LHM-test2-getsensor_factor-20240423-TestAfterBoot-StartWithWindows

I then did a third reboot with start with windows disabled and it mirrored the 2nd test and was working fine too (I didn't screenshot it).

I hope this is informative enough, please tell me if you want additional data.

@iDumle
Author

iDumle commented May 5, 2024

Here, this build should work too. https://github.com/LibreHardwareMonitor/LibreHardwareMonitor/actions/runs/8954631686

Can confirm this one works too.
LHMBuildFromGithub8954631686-05-05-2024

@epinter

epinter commented May 5, 2024

@iDumle
Thank you very much for all your help! Now I believe the problem was solved, I think soon the commit will be merged.

@iDumle
Author

iDumle commented May 5, 2024

@iDumle Thank you very much for all your help! Now I believe the problem was solved, I think soon the commit will be merged.

You're welcome, I'm happy to help. I do hope this will be it for this issue, and thank you for doing the work; your contribution enables us AMD users to keep enjoying the software.

@Rem0o
Owner

Rem0o commented May 8, 2024

Latest LHMLib is in V189

@GEKonPC

GEKonPC commented May 11, 2024

image
Hi sorry to disappoint but V189 is still broken :/

@epinter
Copy link

epinter commented May 12, 2024

Hi sorry to disappoint but V189 is still broken :/

@GEKonPC

I think this could still be something in LHM. If you want to help, I can send you a build of LHM so you can collect the report. I see you have an RX 5700; we haven't tested with the 5000 series yet, so maybe there's some adjustment to make.

@GEKonPC

GEKonPC commented May 12, 2024

Sure, send it but I will need instructions :D not as well versed in this as the other guy

@epinter

epinter commented May 12, 2024

Sure, send it but I will need instructions :D not as well versed in this as the other guy

:)
Don't worry, do what you can, it's nothing complex... Unzip, run LibreHardwareMonitor.exe, go to menu File, click Save Report. Then just drag and drop the txt file here.

Screenshot 2024-05-12 133107
net472-20240511-test5000.zip

@ouyangyiluo

ouyangyiluo commented May 12, 2024

Latest LHMLib is in V189

2024-05-13 034151

@GEKonPC

GEKonPC commented May 13, 2024

WhenItsNOTWorking.Report.txt
WhenItsWorking.Report.txt
so i got 2 reports.. one where it was working and one where fan control failed to find 2 sensors and is reporting bonkers numbers again but looking at the LHM there it is reporting fine
image

@epinter

epinter commented May 13, 2024

so i got 2 reports.. one where it was working and one where fan control failed to find 2 sensors and is reporting bonkers numbers again but looking at the LHM there it is reporting fine

Great, thanks for the reports. Everything looks OK in both reports: the sensors found are consistent, and so is the data in each sensor. I don't see a problem in LHM, so let's wait for @Rem0o to update FanControl. I was looking at two things in particular:
First, the line "Sensors Supported: 21" for each of the repeated adapters found, 7 in your case. This line is added to the report when a PMLog sensor (the recently added method for reading data) is found; all adapters have the same number of sensors, which is good. Second, the sensor data: in the build I sent you I added text to the report that identifies whether the sensor data comes from the old method or the new one, and in your case both methods fetch the same data.

@Rem0o
Owner

Rem0o commented May 13, 2024

@epinter

FanControl V189 uses the latest commit of LHM: LibreHardwareMonitor/LibreHardwareMonitor@e624437

@epinter

epinter commented May 13, 2024

@epinter

FanControl V189 uses the latest commit of LHM : LibreHardwareMonitor/LibreHardwareMonitor@e624437

md5 of V188 and V189 are the same:

 md5sum *net_4_8/LibreHardwareMonitorLib.dll *net_8_0/LibreHardwareMonitorLib.dll
0e9ac2177593b17589ec567509d422a3  FanControl_188_net_4_8/LibreHardwareMonitorLib.dll
0e9ac2177593b17589ec567509d422a3  FanControl_189_net_4_8/LibreHardwareMonitorLib.dll
e4ec0737857d4c66935f9033890ae907  FanControl_188_net_8_0/LibreHardwareMonitorLib.dll
e4ec0737857d4c66935f9033890ae907  FanControl_189_net_8_0/LibreHardwareMonitorLib.dll

EDIT: By the file properties, the commit is the LibreHardwareMonitor/LibreHardwareMonitor@331d334. After this commit, I've sent 3 updates.
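On Windows machines without `md5sum`, the same comparison can be done with a short script. This is a minimal sketch using Python's standard `hashlib`; the helper names `file_md5` and `identical_files` are made up for illustration, and the paths in the comment are the folder names from the listing above.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identical_files(first, second):
    """Return True when both files have the same MD5 digest."""
    return file_md5(first) == file_md5(second)


# Example call (paths as in the listing above):
# identical_files("FanControl_188_net_4_8/LibreHardwareMonitorLib.dll",
#                 "FanControl_189_net_4_8/LibreHardwareMonitorLib.dll")
```

Identical digests for the two releases indicate that the DLL shipped in both is byte-for-byte the same file, which is the point being made here.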

@iDumle
Author

iDumle commented May 13, 2024

@Rem0o @epinter
Can confirm that LHM in FanControl is broken. I was just about to make a post about it because my FanControl was at 0 temp when I started my computer just now. I then tried swapping the DLL into LHM to see if it would show the same, and it showed it was broken, whereas the last version that epinter sent me was working fine.

Here is Fan Control menu.
FanControlMenu13-05-2024

Here is the LHM that epinter sent me last.
LHMLastFixedVersionFromGit13-05-2024

And here is the LHM running the DLL from fancontrol
LHMFanControlDLL13-05-2024

@Rem0o
Owner

Rem0o commented May 13, 2024

Checked the hashes and I'm confused as to what's going on.
image

Will figure out what's going on.

EDIT:

Fun times, had 2 local copies of the repo. Build pipeline used one, but the one I was referring to was the other 🤦‍♂️

@Rem0o
Owner

Rem0o commented May 14, 2024

@User999999991

V190 lost my GPU sensors. Swapping in LHM files from V189 fixed it.

@epinter

epinter commented May 14, 2024

V190 lost my GPU sensors. Swapping in LHM files from V189 fixed it.

What GPU do you have? Users tested the AMD RX 5000, 6000 and 7000 series, and all of them worked as far as I know.

Test the latest nightly build of LHM: https://github.com/LibreHardwareMonitor/LibreHardwareMonitor/actions/runs/8958653644/artifacts/1474376834

@Rem0o
Owner

Rem0o commented May 14, 2024

@epinter

Seems like RX 400 and 500 series.

#2509

@epinter

epinter commented May 14, 2024

@epinter

Seems like RX 400 and 500 series.

#2509

@Rem0o
I think these cards might not work with PMLog. If that's the case, it's easy to exclude them and make them use the old sensors. This would be a good test for ADLX too.
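Excluding a card family from the new method could look roughly like the sketch below. It is purely illustrative and is not LibreHardwareMonitor's code: the function name is hypothetical, and matching on the marketing name is an assumption chosen to keep the example short. It treats three-digit RX 4xx and RX 5xx model numbers as the older series, so that four-digit names such as RX 5700 or RX 6650 XT are not caught by mistake.

```python
import re


def uses_legacy_method(gpu_name):
    """Return True for RX 400/500 series names such as 'Radeon RX 480' or 'RX 580'.

    Hypothetical helper: the name-based rule is an assumption for illustration.
    """
    match = re.search(r"\bRX\s*(\d{3,4})\b", gpu_name, flags=re.IGNORECASE)
    if match is None:
        return False
    model = match.group(1)
    # Three digits starting with 4 or 5 means the RX 400 or RX 500 series.
    return len(model) == 3 and model[0] in "45"
```

A caller would then read sensors with the old method whenever `uses_legacy_method(name)` is true and with PMLog otherwise.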

@User999999991

My card is an AMD R9 280x. Adding that latest LHM to Fan Control had the same problem as v190. Got the same could not find expected sensors error again. The ADLX plugin didn't help either.

@epinter

epinter commented May 14, 2024

My card is an AMD R9 280x. Adding that latest LHM to Fan Control had the same problem as v190. Got the same could not find expected sensors error again. The ADLX plugin didn't help either.

Must be the same problem of RX 400 and RX 500 series. Try this build of LHM, tell me if it works. Download links at the end of the page, LibreHardwareMonitor-net472 is full LHM, others are the libraries for other .net versions.

https://github.com/LibreHardwareMonitor/LibreHardwareMonitor/actions/runs/9085676231

@User999999991

Must be the same problem of RX 400 and RX 500 series. Try this build of LHM, tell me if it works. Download links at the end of the page, LibreHardwareMonitor-net472 is full LHM, others are the libraries for other .net versions.

https://github.com/LibreHardwareMonitor/LibreHardwareMonitor/actions/runs/9085676231

That LHM works again!

@epinter

epinter commented May 14, 2024

Must be the same problem of RX 400 and RX 500 series. Try this build of LHM, tell me if it works. Download links at the end of the page, LibreHardwareMonitor-net472 is full LHM, others are the libraries for other .net versions.
https://github.com/LibreHardwareMonitor/LibreHardwareMonitor/actions/runs/9085676231

That LHM works again!

Nice , good to hear that!

@Rem0o
Owner

Rem0o commented May 18, 2024

V191 includes the changes made by @epinter to LHM.

@iDumle if that fixes it, please close this issue.

@iDumle
Author

iDumle commented May 18, 2024

As far as I'm concerned, my issue has been resolved with v190, and provided the newer reporters in this thread have opened their own issues, I will close this.
Thanks for all the work. :)

@iDumle iDumle closed this as completed May 18, 2024