logind CanGraphical state change only after DRM driver init
#32509
Comments
My idea, as currently proposed, also requires kernel changes. Since writing the original proposal, though, I've had a different idea for how it could be done: logind could watch for `/dev/dri/card0` being removed. This happens when amdgpu takes over the framebuffer.
I assume there would have to be a timeout for that, so that systems that need simpledrm to keep working, like qemu VMs using the cirrus driver or other obscure hardware, are still handled, right? Otherwise (unless I am missing something) simpledrm-only systems would never become CanGraphical?
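A minimal user-space sketch of this idea, with the timeout discussed above. The path and the timeout policy are my assumptions for illustration, not anything logind does today:

```python
import os
import time

def wait_for_takeover(card="/dev/dri/card0", timeout=5.0, interval=0.05):
    """Poll until `card` disappears (a native driver removed simpledrm's
    node) or the timeout expires (simpledrm-only hardware, e.g. cirrus)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not os.path.exists(card):
            return "replaced"   # node gone: the real driver took over
        time.sleep(interval)
    return "timeout"            # nothing replaced it: stay on simpledrm
```

On a cirrus VM the call would simply run out the timeout and the caller would treat the simpledrm node as the final device.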
That's a good point; you've identified a real gap. Perhaps within logind the equivalent of …
Yeah, simpledrm is a fallback driver; it's for hardware that doesn't have its own mode-setting driver. It seems it can actually be compiled as a module again (most distros compile it into the kernel), so in theory it could be loaded late if `/dev/dri/card*` fails to get created. But the question still remains when to decide to load it: is there no `/dev/dri/card0` because the real driver hasn't finished starting yet, or because the only video device is a cirrus card? I'm not sure it's possible to determine that from user space.

I don't know how video driver loading works in the kernel. Would it be possible to make simpledrm load and hold the BIOS/sysfb memory without creating `/dev/dri/card0`, and only create its `/dev/dri/card0` if all the other drivers' probes fail (or whatever happens)? That might make more sense than simpledrm loading first and then getting replaced, but maybe there is a reason why it happens in that order.
So part of the problem is going to be "new" hardware: for example, hardware where amdgpu loads but isn't yet supported in that kernel version. You want simpledrm to load in that case, so any logic built around an assumption about vendor IDs or whether a module is loaded falls apart. I think the best thing to do is run a settle sequence to decide when to flip CanGraphical. Then you can be sure that whatever should load has loaded, and, most importantly, it keeps things simpler in logind.
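A rough sketch of what such a settle heuristic could look like in user space. The quiet-window policy, the pattern, and the timeouts are all my assumptions; logind has no such logic today:

```python
import glob
import time

def settled_cards(pattern="/dev/dri/card*", quiet=0.5, timeout=10.0):
    """Declare the DRM card set 'settled' once it has stayed unchanged
    for `quiet` seconds; give up and use whatever exists at `timeout`."""
    deadline = time.monotonic() + timeout
    seen = set(glob.glob(pattern))
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        now = set(glob.glob(pattern))
        if now != seen:
            # Something appeared or vanished: restart the quiet window.
            seen, stable_since = now, time.monotonic()
        elif now and time.monotonic() - stable_since >= quiet:
            return sorted(seen)   # quiet window elapsed: safe to flip CanGraphical
        time.sleep(0.05)
    return sorted(seen)           # timed out: go with what we have
```

This handles the "amdgpu replaces simpledrm" churn (the set changes, the window restarts) without needing to know which driver is responsible.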
That makes sense. I wonder, though, whether this is worth a thread on the LKML first to see if the simpledrm developer has any insight? Or do you think it's probably not feasible to fix in the kernel?
I personally don't see any way to do it in the kernel, but if you want to ask, go for it. FWIW, I am the one who did the delicate dance in amdgpu a year or so ago to make sure it smoothly handles the case of an unsupported GPU in a given kernel. Specifically, it doesn't give up the framebuffer that simpledrm is using until it is sure it has all the driver code to support all IP blocks, and all the firmware that matches them.
OK, so maybe it does have to be done in userspace then. Thinking about it more, there is also the possibility of initrds that only contain simpledrm, with the actual drivers on the rootfs. In that case all the guessing based on available drivers would be wrong, I guess.
Not only possible: that's exactly how Ubuntu works when you don't have disk encryption turned on.
So I have been doing some testing, since I saw the SDDM issue about this. Last week I was kind of confused by this, I will admit: I thought the issue was that `/dev/dri/card1` was being created too soon, and that because of all the firmware loading it wasn't usable until a certain point.

I can see this strategy fixing computers without simpledrm. It could remove the possibility of a login manager, or the greeter display server, saying "Oh hey, seat0 has no `/dev/dri/card*` devices, let's fail!", and it could also keep greeter display servers from starting on the simpledrm device and then having it pulled out from under them. In my mind, though, if it's possible for them to support it, the various display servers should better handle the case where the simpledrm device they are using gets replaced. I see the original issue https://gitlab.gnome.org/GNOME/mutter/-/issues/2909 was filed against Mutter; in the end it looks like it was addressed in GDM instead, even though the gnome-shell based greeter died.

Testing on a VM with … Results: …

I am wondering if bug reports should be made to the various display servers to support the simpledrm device getting replaced? What do you think?
You know what, that's a pretty similar contrived test to the one I was doing, where I would let the display server start up and then load the driver later. The problem is that this is viewed as a "double hotplug" event, which isn't supported.
Maybe it's at least worth asking kwin/wlroots? Their hotplug handling could be different, maybe? Or no? Also, this theory is kind of wacky, but how much of simpledrm is dependent on that lower memory? Just the display part, right?
It wouldn't hurt to ask, but I would be surprised if they handle hotplug for the primary display. That's tough to support!
Even if this were possible, the problem you'd have is a phantom screen where the cursor isn't visible. Although it wouldn't crash the display server, it's not the best experience.
Yeah, I didn't think so, but I meant that it would just start reporting itself as a device with no attached screens...
The problem is you have no idea whether a fully functional driver is going to load later. If it doesn't, you want simpledrm to render.
Well, my thought was that the simpledrm device would act as normal when booting, but when the usual GPU driver loads and replaces it (whether that's amdgpu, i915, nouveau, or virtio_gpu), the `/dev/dri/card0` device would stay alive and start reporting that it has no screens/CRTCs attached. It would be useless for displaying anything, since the real driver now handles them, but the display servers' handles wouldn't close...
I guess you could raise this idea on dri-devel with the simpledrm maintainer for their thoughts.
Done, hopefully I worded it correctly.
Component
systemd-logind
Is your feature request related to a problem? Please describe
Related to https://bugs.launchpad.net/linux/+bug/2063143 and all the bugs referenced within.
The problem is that, as things stand, DRM GPU drivers are not guaranteed to have finished initializing by the time login managers are started. This can result in unexpected behavior (e.g. a permanent black screen) if the DRM drivers have not finished initializing when the login manager starts up.
This issue, where DRM GPU drivers have not finished initializing, is causing a black screen on boot on Kubuntu 24.04 on two separate test systems: an AMD Framework 13 (amdgpu) and an HP Spectre x360 (i915).
Describe the solution you'd like
As proposed by @jadahl and @superm1
If logind could hook up any async module loading heuristics to `CanGraphical`, no login manager would need to worry about DRM drivers not having finished initialization. Something along the lines of:

- `simpledrm` exports a new sysfs file `probed` that holds `0` or `1`. It defaults to `0` and is later set to `1`.
- `logind` introduces logic to look for `nomodeset`, and if it's set then `CanGraphical()` returns `TRUE` every time.
- `logind` introduces logic to look for the `probed` file in the DRI directory. If it's not found, the existing logic applies. If it's found, calls to `CanGraphical()` return `TRUE` only when the value is `1`.
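The proposed decision could be sketched roughly like this. Note that the `probed` sysfs attribute is hypothetical (no kernel exports it today), and the fallback branch stands in for logind's existing logic:

```python
import os

def can_graphical(card_sysfs="/sys/class/drm/card0", nomodeset=False):
    """Sketch of the proposed CanGraphical() decision. `probed` is the
    hypothetical sysfs file from the proposal above."""
    if nomodeset:
        return True    # no KMS drivers will ever bind: nothing to wait for
    probed = os.path.join(card_sysfs, "probed")
    if not os.path.exists(probed):
        return True    # no probed file: placeholder for today's behavior
    with open(probed) as f:
        return f.read().strip() == "1"   # graphical only once probing is done
```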
Describe alternatives you've considered
Implement in every login manager a way to wait for DRM drivers to finish initializing (as currently done in GDM because of this issue: https://gitlab.gnome.org/GNOME/gdm/-/commit/895f765aa8cc5a9dd2901be65bcd638b8aa7c577)
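For comparison, the login-manager-side workaround boils down to something like the sketch below. This is only an illustration of the idea; GDM's actual implementation differs, and the driver-name check against `simpledrm` is my assumption:

```python
import os
import time

def wait_for_native_drm(drm_dir="/sys/class/drm", timeout=10.0):
    """Wait until some cardN is bound to a driver other than simpledrm;
    return that card's name, or None if the timeout expires first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        for entry in os.listdir(drm_dir):
            if not entry.startswith("card") or "-" in entry:
                continue          # skip connector entries like card0-eDP-1
            drv = os.path.join(drm_dir, entry, "device", "driver")
            if os.path.islink(drv) and \
                    os.path.basename(os.readlink(drv)) != "simpledrm":
                return entry      # a native driver has bound
        time.sleep(0.1)
    return None                   # simpledrm (or nothing) is all we got
```

Duplicating this loop in every login manager is exactly the burden the `CanGraphical` proposal above is meant to remove.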
The systemd version you checked that didn't have the feature you are asking for
255