logind: lxc payload no longer ran in scope under root user session #32929
Labels
login
lxc/lxd
regression ⚠️
A bug in something that used to work correctly and broke through some recent commit
bluca added the login, regression ⚠️ and lxc/lxd labels on May 19, 2024
Found the issue: 5099a50 switched from |
bluca added a commit to bluca/systemd that referenced this issue on May 20, 2024
When running inside an LXC container the 'su' process will not be part of any unit or slice. manager_get_user_by_pid() which was used until v255 (included) does not fail if it cannot find a unit/slice, but simply returns 'not found'. Do the same in manager_get_session_by_pidref().
This was not detected as Semaphore CI does not reboot the testbed before the logind test, so the session is started by the old logind from the base distro, instead of the one being tested.
Follow-up for 8494f56
Follow-up for 5099a50
Fixes systemd#32929
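The behavioural difference described in this commit message can be illustrated with a toy model. This is only a sketch in Python, not systemd's C code: unit_of(), session_strict() and session_lenient() are hypothetical names, and the regular expression is an assumed stand-in for the real cgroup-to-unit lookup.

```python
import re
from typing import Dict, Optional

# Assumption: a unit is the last path component ending in .scope or .service.
_UNIT = re.compile(r"/([^/]+\.(?:scope|service))$")


def unit_of(cgroup_path: str) -> str:
    """Return the unit name for a cgroup path, or raise if there is none."""
    match = _UNIT.search(cgroup_path)
    if match is None:
        raise LookupError(f"no unit found in {cgroup_path!r}")
    return match.group(1)


def session_strict(sessions: Dict[str, str], cgroup_path: str) -> Optional[str]:
    """Models the regressed behaviour: a failed unit lookup is a hard error."""
    return sessions.get(unit_of(cgroup_path))


def session_lenient(sessions: Dict[str, str], cgroup_path: str) -> Optional[str]:
    """Models the fixed behaviour: a failed unit lookup means 'not found'."""
    try:
        unit = unit_of(cgroup_path)
    except LookupError:
        return None
    return sessions.get(unit)


SESSIONS = {"session-9.scope": "9"}
```

With this model, session_lenient(SESSIONS, "/.lxc") returns None so the caller can go on to create a new session, while session_strict(SESSIONS, "/.lxc") raises and aborts the request.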
bluca added a commit to bluca/systemd-stable that referenced this issue on May 26, 2024
When running inside an LXC container the 'su' process will not be part of any unit or slice. manager_get_user_by_pid() which was used until v255 (included) does not fail if it cannot find a unit/slice, but simply returns 'not found'. Do the same in manager_get_session_by_pidref().
This was not detected as Semaphore CI does not reboot the testbed before the logind test, so the session is started by the old logind from the base distro, instead of the one being tested.
Follow-up for 8494f56
Follow-up for 5099a50
Fixes systemd/systemd#32929
(cherry picked from commit eb56b56)
keszybz pushed a commit to systemd/systemd-stable that referenced this issue on May 27, 2024
5099a50 (the first commit of PR #30884) introduced a regression that went unnoticed due to a test bug.
Previously, a payload executed by lxc, such as an autopkgtest test runner, would run in a scope under the root user session: $XDG_SESSION_ID was set, and /proc/self/cgroup would show:
0::/user.slice/user-0.slice/session-9.scope
Since that commit, there is no root user session (or any session) at all: $XDG_SESSION_ID is not set, and /proc/self/cgroup now shows:
0::/.lxc
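As a quick way to tell the two states apart, the following Python sketch extracts the session ID from /proc/self/cgroup content. It is an illustration only: session_from_cgroup() is a hypothetical helper, and it assumes the session is identifiable from a "session-<id>.scope" path component, as in the two values quoted above.

```python
import re
from typing import Optional

# Matches a "session-<id>.scope" component anywhere in a cgroup path.
_SESSION_SCOPE = re.compile(r"/session-([^/]+)\.scope(?:/|$)")


def session_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the session ID implied by /proc/<pid>/cgroup content, or None."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = _SESSION_SCOPE.search(parts[2])
        if match:
            return match.group(1)
    return None
```

Fed the contents of /proc/self/cgroup inside the container, this returns "9" for the first value above and None for "0::/.lxc".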
In the logs we can see:
From the full debug log, it's CreateSessionWithPIDFD that is returning ENXIO. Adding a fallback to CreateSession doesn't help; that also fails with ENXIO. CreateSession gets called with:
create_session(uid=0, leader_pid=0, leader_pidfd=10, service=su, type=unspecified, class=background, desktop=, cseat=, vtnr=0, tty=, display=, remote=0, remote_user=root, remote_host=, flags=0)
In the successful case, the session is created as expected.
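When comparing a failing run against a successful one, it can help to split the logged create_session() arguments into fields. A minimal Python sketch, assuming the log line keeps the "key=value, key=value" layout shown above; parse_call_args() is a hypothetical helper, not part of systemd:

```python
from typing import Dict

# The create_session() line quoted in this report.
LINE = (
    "create_session(uid=0, leader_pid=0, leader_pidfd=10, service=su, "
    "type=unspecified, class=background, desktop=, cseat=, vtnr=0, tty=, "
    "display=, remote=0, remote_user=root, remote_host=, flags=0)"
)


def parse_call_args(log_line: str) -> Dict[str, str]:
    """Split 'name(key=value, key=value, ...)' into a dict of strings."""
    start = log_line.index("(") + 1
    end = log_line.rindex(")")
    fields: Dict[str, str] = {}
    for item in log_line[start:end].split(", "):
        # Empty values such as "desktop=" become empty strings.
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


FIELDS = parse_call_args(LINE)
```

Here FIELDS["leader_pid"] is "0" and FIELDS["leader_pidfd"] is "10", so the leader is identified by pidfd rather than by PID in this call.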
The test bug was that the testbed booted with the old systemd and logind, whichever version shipped in the base distribution, and only installed the new packages at runtime. The lxc payload was therefore already running, assigned to a session by the code from the distribution. Fixing the test by rebooting after installing the code built from the branch is enough to make the issue show up.
For some reason this is not a problem under qemu; I have no idea why. I can reproduce this on the Semaphore CI very easily.
Successful run, at the commit before the one mentioned above:
https://the-real-systemd.semaphoreci.com/jobs/3a4a365c-4851-4a12-b9a0-85d81412a594
Failing run, at the commit mentioned above:
https://the-real-systemd.semaphoreci.com/jobs/70dd30b9-fe83-4adc-b179-b37d06df6ad7
Failing run on latest main, with full logind debug level logs:
https://the-real-systemd.semaphoreci.com/jobs/53096557-7f4c-4100-8720-cdd031f93536
To reproduce, enable Semaphore CI on a GitHub fork, then cherry-pick this commit, which ensures the right build and test options are used (a shortened build that only runs the logind test, with the reboot fix):
bluca@3c371ec