kernel BUG: io_uring openat triggers audit reference count underflow #3962

Open
daclash opened this issue Nov 29, 2023 · 1 comment

daclash commented Nov 29, 2023

Description

I encountered a kernel bug that occurs during io_uring openat audit processing. I have a kernel patch that was accepted into the upstream kernel as well as the v6.6, v6.5.9, and v6.1.60 releases. The bug was first introduced in the upstream v5.16 kernel.

Will the linuxkit kernel be upgraded to upstream v6.5 or v6.6 soon? If not, is it possible to cherry-pick the fix?

Docker Desktop 4.24.2 (124339) uses 6.4.16-linuxkit which has the defect.
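
As a rough aid for checking other kernels, the version boundaries stated above (introduced in v5.16, fixed in v6.6, v6.5.9, and v6.1.60) can be encoded as a small check. This is only a sketch: the function names `kernel_has_fix` and `kernel_is_affected` are hypothetical, and it assumes an unmodified upstream or stable kernel, so it cannot see a vendor backport of the fix.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true if an upstream/stable kernel version carries
 * the fix, using only the releases named in this issue
 * (v6.6, v6.5.9, v6.1.60). Vendor backports are NOT detected.
 */
bool kernel_has_fix(int major, int minor, int patch)
{
    if (major > 6 || (major == 6 && minor >= 6))
        return true;                 /* v6.6 and later */
    if (major == 6 && minor == 5)
        return patch >= 9;           /* v6.5.9 and later in the 6.5 series */
    if (major == 6 && minor == 1)
        return patch >= 60;          /* v6.1.60 and later in the 6.1 series */
    return false;
}

/*
 * Hypothetical helper: true if the version is at or after v5.16 (where the
 * bug was introduced) and does not carry the fix.
 */
bool kernel_is_affected(int major, int minor, int patch)
{
    bool has_bug = major > 5 || (major == 5 && minor >= 16);
    return has_bug && !kernel_has_fix(major, minor, patch);
}
```

For example, `kernel_is_affected(6, 4, 16)` reports the 6.4.16-linuxkit kernel as affected, assuming it has no backport of the fix.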

The upstream commit is:

03adc61edad49e1bbecfb53f7ea5d78f398fe368

The upstream patch thread is:

https://lore.kernel.org/audit/20231012215518.GA4048@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/T/#u

The maintainer pull request thread is:

https://lore.kernel.org/lkml/20231019-kampfsport-metapher-e5211d7be247@brauner

The commit log message is:

commit 03adc61edad49e1bbecfb53f7ea5d78f398fe368
Author: Dan Clash <daclash@linux.microsoft.com>
Date: Thu Oct 12 14:55:18 2023 -0700

audit,io_uring: io_uring openat triggers audit reference count underflow

An io_uring openat operation can update an audit reference count
from multiple threads resulting in the call trace below.

A call to io_uring_submit() with a single openat op with a flag of
IOSQE_ASYNC results in the following reference count updates.

The first part of the system call performs two increments that do not race.

do_syscall_64()
  __do_sys_io_uring_enter()
    io_submit_sqes()
      io_openat_prep()
        __io_openat_prep()
          getname()
            getname_flags() /* update 1 (increment) */
              __audit_getname() /* update 2 (increment) */

The openat op is queued to an io_uring worker thread which starts the
opportunity for a race. The system call exit performs one decrement.

do_syscall_64()
  syscall_exit_to_user_mode()
    syscall_exit_to_user_mode_prepare()
      __audit_syscall_exit()
        audit_reset_context()
          putname() /* update 3 (decrement) */

The io_uring worker thread performs one increment and two decrements.
These updates can race with the system call decrement.

io_wqe_worker()
  io_worker_handle_work()
    io_wq_submit_work()
      io_issue_sqe()
        io_openat()
          io_openat2()
            do_filp_open()
              path_openat()
                __audit_inode() /* update 4 (increment) */
            putname() /* update 5 (decrement) */
        __audit_uring_exit()
          audit_reset_context()
            putname() /* update 6 (decrement) */

The fix is to change the refcnt member of struct audit_names
from int to atomic_t.

kernel BUG at fs/namei.c:262!
Call Trace:
...
 ? putname+0x68/0x70
 audit_reset_context.part.0.constprop.0+0xe1/0x300
 __audit_uring_exit+0xda/0x1c0
 io_issue_sqe+0x1f3/0x450
 ? lock_timer_base+0x3b/0xd0
 io_wq_submit_work+0x8d/0x2b0
 ? __try_to_del_timer_sync+0x67/0xa0
 io_worker_handle_work+0x17c/0x2b0
 io_wqe_worker+0x10a/0x350

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/lkml/MW2PR2101MB1033FFF044A258F84AEAA584F1C9A@MW2PR2101MB1033.namprd21.prod.outlook.com/
Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring")
Signed-off-by: Dan Clash <daclash@linux.microsoft.com>
Link: https://lore.kernel.org/r/20231012215518.GA4048@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
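
The six updates described in the commit log can be modeled in user space to show why an atomic counter is safe here. The following is a minimal sketch, not the kernel code: `struct name_ref`, `name_get()`, `name_put()`, and `run_once()` are hypothetical names, C11 `atomic_int` stands in for the kernel's `atomic_t`, and two pthreads stand in for the system call exit path and the io_uring worker thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the reference-counted name object. */
struct name_ref {
    atomic_int refcnt;    /* a plain int here is what allowed the race */
    atomic_int released;  /* how many times the count dropped to zero */
};

static void name_get(struct name_ref *n)
{
    atomic_fetch_add(&n->refcnt, 1);
}

static void name_put(struct name_ref *n)
{
    /* atomic_fetch_sub returns the value held before the decrement. */
    int old = atomic_fetch_sub(&n->refcnt, 1);

    assert(old > 0);          /* an underflow would trip here */
    if (old == 1)             /* exactly one caller sees the drop to zero */
        atomic_fetch_add(&n->released, 1);
}

/* Models the system call exit path: update 3 (decrement). */
static void *syscall_exit_path(void *arg)
{
    name_put(arg);
    return NULL;
}

/* Models the worker thread: updates 4, 5, and 6. */
static void *worker_path(void *arg)
{
    name_get(arg);  /* update 4 (increment) */
    name_put(arg);  /* update 5 (decrement) */
    name_put(arg);  /* update 6 (decrement) */
    return NULL;
}

/* Runs the six updates once; returns 1 if the count ends balanced. */
int run_once(void)
{
    struct name_ref n;
    pthread_t exit_thread, worker_thread;

    atomic_init(&n.refcnt, 0);
    atomic_init(&n.released, 0);
    name_get(&n);   /* update 1 (increment), no race yet */
    name_get(&n);   /* update 2 (increment), no race yet */

    pthread_create(&exit_thread, NULL, syscall_exit_path, &n);
    pthread_create(&worker_thread, NULL, worker_path, &n);
    pthread_join(exit_thread, NULL);
    pthread_join(worker_thread, NULL);

    return atomic_load(&n.refcnt) == 0 && atomic_load(&n.released) == 1;
}
```

Calling `run_once()` repeatedly from a `main()` should always return 1, because each read-modify-write is indivisible and only one decrement can observe the transition to zero. With a plain `int` counter the concurrent decrements can lose an update, which is the kind of imbalance the `putname()` check at fs/namei.c:262 reports.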

Steps to reproduce the issue:

The pre-patch discussion thread contains a user mode test program that reproduces the issue:

https://lore.kernel.org/io-uring/MW2PR2101MB1033FFF044A258F84AEAA584F1C9A@MW2PR2101MB1033.namprd21.prod.outlook.com/T/#u

Describe the results you received:

The user mode test program hangs and the following dmesg log is present:

    kernel BUG at fs/namei.c:262!
    Call Trace:
    ...
     ? putname+0x68/0x70
     audit_reset_context.part.0.constprop.0+0xe1/0x300
     __audit_uring_exit+0xda/0x1c0
     io_issue_sqe+0x1f3/0x450
     ? lock_timer_base+0x3b/0xd0
     io_wq_submit_work+0x8d/0x2b0
     ? __try_to_del_timer_sync+0x67/0xa0
     io_worker_handle_work+0x17c/0x2b0
     io_wqe_worker+0x10a/0x350

Describe the results you expected:

The test program should complete.

Additional information you deem important (e.g. issue happens only occasionally):

The test program reliably reproduces the problem.

deitch (Collaborator) commented Nov 30, 2023

I do not do the kernel maintenance here; that is more @rn, who understands the ins and outs. Also @djs55, who I think knows the relationship between Docker Desktop and the kernels here.
