Dual RX6600s - Only one GPU detected #313

Open
2 tasks done
GIJack opened this issue Apr 26, 2024 · 8 comments

Comments

@GIJack

GIJack commented Apr 26, 2024

Checklist

Bug description

I have a pair of GPUs, but only one is detected.

Output from the amdcovc tool, which sees both GPUs:

$ amdcovc 
Adapter 0: PCI 11:0:0: Device 0000
  Core: 0 MHz, Mem: 96 MHz, CoreOD: 0, MemOD: 0, Vddc: 6 mV
  SOC: 640 MHz, DCEF: 480 MHz, FClock: 942 MHz
  PerfCtrl: auto, Load: 0%, MemLoad: 0%
  Temp: 32°C, T2: 32°C, T3: 32°C, Fan: 34.1176%
  Power: 3 W (cap: 120 W)
  Core Clocks: 0 0
  Memory Clocks: 96 541 675 875
  SOC Clocks: 417 640 1200
  DCEF Clocks: 417 480 1200
  F Clocks: 500 942 1801
Adapter 1: PCI 68:0:0: Device 0000
  Core: 800 MHz, Mem: 875 MHz, CoreOD: 0, MemOD: 0, Vddc: 718 mV
  SOC: 800 MHz, DCEF: 685 MHz, FClock: 1221 MHz
  PerfCtrl: auto, Load: 3%, MemLoad: 0%
  Temp: 39°C, T2: 39°C, T3: 44°C, Fan: 42.3529%
  Power: 18 W (cap: 100 W)
  Core Clocks: 500 800 2750
  Memory Clocks: 96 541 675 875
  SOC Clocks: 417 800 1200
  DCEF Clocks: 417 685 1200
  F Clocks: 500 1221 1801

Your tool only sees one:

$ lact cli list-gpus
1002:73FF-1EAE:6505-0000:0b:00.0 (Navi 23 [Radeon RX 6600/6600 XT/6600M

System info

- LACT version:
$ pacman -Q lact
lact 0.5.4-2

- GPU model:
RX6600

- Kernel version:
$ uname -a
Linux iron 6.8.7-hardened1-2-hardened #1 SMP PREEMPT_DYNAMIC Wed, 17 Apr 2024 22:21:16 +0000 x86_64 GNU/Lin

- Distribution:
Arch Linux

GIJack changed the title from "RX6600 - Only one GPU detected" to "Dual RX6600s - Only one GPU detected" on Apr 26, 2024

@ilya-zlobintsev
Owner

Could you include a debug snapshot?

@GIJack
Author

GIJack commented Apr 29, 2024

LACT-sysfs-snapshot-20240428-184224.tar.gz

Debug snapshot

@ilya-zlobintsev
Owner

The snapshot only contains a single GPU in /sys, which is weird. Could you show the output of

ls -la /sys/class/drm/

And also: does restarting the service (sudo systemctl restart lactd) change anything?
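
As a quick cross-check of that directory listing, the card nodes can be counted directly. This is only a sketch: the function name `count_drm_cards` is made up for illustration, and it assumes each GPU shows up as a `cardN` entry while connector entries (for example `card0-DP-1`) and render nodes (`renderD128`) should be ignored.

```shell
#!/bin/sh
# Hypothetical helper: count DRM card nodes (card0, card1, ...) in a directory.
# Defaults to /sys/class/drm; a different directory can be passed for testing.
count_drm_cards() {
    drm_dir="${1:-/sys/class/drm}"
    count=0
    for entry in "$drm_dir"/card*; do
        # If the glob matched nothing, the literal pattern is skipped here.
        [ -e "$entry" ] || continue
        name="${entry##*/}"
        case "$name" in
            card*[!0-9]*) ;;                      # connector such as card0-DP-1: skip
            card[0-9]*) count=$((count + 1)) ;;   # plain cardN node: count it
        esac
    done
    echo "$count"
}
```

Running `count_drm_cards` right after boot and again after restarting the service would show whether the second card appears late in sysfs.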

@GIJack
Author

GIJack commented May 6, 2024

> The snapshot only contains a single GPU in /sys, which is weird. Could you show the output of
>
> ls -la /sys/class/drm/
>
> And also: does restarting the service (sudo systemctl restart lactd) change anything?

It does, weird. But it doesn't see both of them as enabled.

@ilya-zlobintsev
Owner

By "it does", do you mean that both GPUs are detected in LACT? And what do you mean by "doesn't see both of them as enabled"?

@GIJack
Author

GIJack commented May 12, 2024

Yes, when LACT is restarted while the system is running, both GPUs are found. When it starts on boot, only one is.

@ilya-zlobintsev
Owner

This seems to be another manifestation of the issue with LACT starting too early in the boot process, before all the sysfs entries are initialized. The current logic waits until 10 seconds have passed since system startup and at least 1 GPU is available, which I guess isn't sufficient in your case: the daemon can proceed once the first GPU appears, before the second one is initialized.

I'll try to see if there's a way to make it more reliable for multi-GPU systems.
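
One way such a wait could be made less sensitive to late-appearing devices is to keep polling until the device count stops changing. The sketch below illustrates that idea only; it is not LACT's actual logic, and the function name `wait_for_stable_count` and its parameters are hypothetical. It assumes the counting command prints a single integer.

```shell
#!/bin/sh
# Hypothetical helper: run a counting command repeatedly until it has printed
# the same non-zero value for $2 consecutive polls, giving up after $3 polls.
# $1 = command that prints an integer, $4 = seconds between polls (default 1).
wait_for_stable_count() {
    count_cmd="$1"
    stable_polls="$2"
    max_polls="$3"
    interval="${4:-1}"
    last=""
    streak=0
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        current=$($count_cmd)
        if [ "$current" = "$last" ]; then
            streak=$((streak + 1))
        else
            # The count changed (or this is the first poll): restart the streak.
            streak=1
            last="$current"
        fi
        if [ "$current" -gt 0 ] && [ "$streak" -ge "$stable_polls" ]; then
            echo "$current"
            return 0
        fi
        polls=$((polls + 1))
        sleep "$interval"
    done
    # Timed out: report the last value seen and signal failure.
    echo "$last"
    return 1
}
```

For example, `wait_for_stable_count my_counter 3 30 1` would wait up to roughly 30 seconds for three identical consecutive readings, where `my_counter` stands for any command or function that prints the number of GPUs currently visible. The trade-off is a slightly longer startup in exchange for not settling on the first GPU alone.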

@ilya-zlobintsev
Owner

ea63322 should help with this. Please update to the latest commit, set log_level to debug in /etc/lact/config.yaml and tell me if this solves the problem. If it doesn't, please post the LACT startup log from journalctl -u lactd -e
