apparmor: Allow confined runc to kill containers #47749

Merged (1 commit) on May 2, 2024

Commits on Apr 24, 2024

  1. apparmor: Allow confined runc to kill containers

    /usr/sbin/runc is confined with the "runc" profile[1] introduced in
    AppArmor v4.0.0. That confinement breaks stopping of containers,
    because the profile assigned to containers doesn't accept signals from
    the "runc" peer. AppArmor >= v4.0.0 is currently part of Ubuntu Mantic
    (23.10) and later.
    
    In the case of Docker, this regression is hidden by the fact that
    dockerd itself sends SIGKILL to the running container after runc fails
    to stop it. It is still a regression, because graceful shutdowns of
    containers via "docker stop" are no longer possible, as SIGTERM from
    runc is not delivered to them. This can be seen in the dockerd logs
    when debug logging is enabled, and by tracing signals with the
    killsnoop utility from bcc[2] (in the bpfcc-tools package in
    Debian/Ubuntu):
    
      Test commands:
    
        root@cloudimg:~# docker run -d --name test redis
        ba04c137827df8468358c274bc719bf7fc291b1ed9acf4aaa128ccc52816fe46
        root@cloudimg:~# docker stop test
    
      Relevant syslog messages (with wrapped long lines):
    
        Apr 23 20:45:26 cloudimg kernel: audit:
          type=1400 audit(1713905126.444:253): apparmor="DENIED"
          operation="signal" class="signal" profile="docker-default" pid=9289
          comm="runc" requested_mask="receive" denied_mask="receive"
          signal=kill peer="runc"
        Apr 23 20:45:36 cloudimg dockerd[9030]:
          time="2024-04-23T20:45:36.447016467Z"
          level=warning msg="Container failed to exit within 10s of kill - trying direct SIGKILL"
          container=ba04c137827df8468358c274bc719bf7fc291b1ed9acf4aaa128ccc52816fe46
          error="context deadline exceeded"
    
      Killsnoop output after "docker stop ...":
    
        root@cloudimg:~# killsnoop-bpfcc
        TIME      PID      COMM             SIG  TPID     RESULT
        20:51:00  9631     runc             3    9581     -13
        20:51:02  9637     runc             9    9581     -13
        20:51:12  9030     dockerd          9    9581     0
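
    As an illustration that is not part of the commit, the two excerpts
    above can be decoded with a short Python sketch. It splits the audit
    record into its key=value fields and translates the numeric SIG and
    RESULT columns of killsnoop into names. The function names are
    hypothetical, the signal numbering is assumed to be that of Linux, and
    RESULT is assumed to be the negative errno returned by kill().

```python
import errno
import shlex
import signal

# Audit record copied from the syslog excerpt above (joined onto one line).
AUDIT_RECORD = (
    'type=1400 audit(1713905126.444:253): apparmor="DENIED" '
    'operation="signal" class="signal" profile="docker-default" pid=9289 '
    'comm="runc" requested_mask="receive" denied_mask="receive" '
    'signal=kill peer="runc"'
)

# Rows copied from the killsnoop output above: TIME PID COMM SIG TPID RESULT.
KILLSNOOP_ROWS = [
    "20:51:00  9631     runc             3    9581     -13",
    "20:51:02  9637     runc             9    9581     -13",
    "20:51:12  9030     dockerd          9    9581     0",
]


def parse_audit(record):
    """Split an audit record into a dict of its key=value fields."""
    fields = {}
    for token in shlex.split(record):  # shlex removes the double quotes
        key, sep, value = token.partition("=")
        if sep:  # skip tokens without '=', such as the audit(...) stamp
            fields[key] = value
    return fields


def decode_row(row):
    """Translate the numeric SIG and RESULT columns of one killsnoop row."""
    _time, _pid, comm, sig, _tpid, result = row.split()
    result = int(result)
    return {
        "comm": comm,
        "signal": signal.Signals(int(sig)).name,
        # Assumption: a failed kill() is reported as a negative errno value.
        "result": "ok" if result == 0 else errno.errorcode[-result],
    }


if __name__ == "__main__":
    audit = parse_audit(AUDIT_RECORD)
    print(audit["profile"], audit["denied_mask"], audit["peer"])
    for row in KILLSNOOP_ROWS:
        print(decode_row(row))
```

    Read this way, both signals sent by runc fail with EACCES, matching
    the AppArmor denial on the "receive" side of docker-default, and only
    the later SIGKILL from dockerd succeeds.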
    
    This commit extends the docker-default profile with rules that allow
    receiving signals from processes that run confined with either the runc
    or the crun profile (crun[4] is an alternative OCI runtime that's also
    confined in AppArmor >= v4.0.0, see [1]). It is backward compatible
    because the peer value is a regular expression (AARE), so the
    referenced profile doesn't have to exist for this profile to compile
    and load successfully.
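
    The diff is not reproduced in this message. The following Python
    sketch is a toy model of the effect, not of how AppArmor mediates
    signals in the kernel. It assumes the added rules have the form
    "signal (receive) peer=<profile>," and that the profile already
    accepted signals from the "unconfined" peer. Both are assumptions, not
    quotes from the commit, and all names in the code are hypothetical.

```python
# Toy model only: AppArmor evaluates signal rules in the kernel, not like this.
# The rule text below is an assumed form, not copied from the commit's diff.


def signal_rules(peers):
    """Render one assumed 'signal (receive)' rule per peer profile name."""
    return [f"signal (receive) peer={peer}," for peer in peers]


def may_receive(allowed_peers, sender_profile):
    """Return True if a signal sent by sender_profile would be accepted."""
    return sender_profile in allowed_peers


# Hypothetical sets of accepted peers before and after the change.
BEFORE = {"unconfined"}
AFTER = BEFORE | {"runc", "crun"}

if __name__ == "__main__":
    for rule in signal_rules(sorted(AFTER - BEFORE)):
        print(rule)
    for sender in ("unconfined", "runc", "crun"):
        print(sender, may_receive(BEFORE, sender), may_receive(AFTER, sender))
```

    In this model a confined runc or crun is rejected before the change
    and accepted after it, while an unconfined sender is accepted in both
    cases.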
    
    Note that the runc profile is attached to /usr/sbin/runc, which is
    where the runc package in Debian/Ubuntu puts the binary. When the
    docker-ce package is installed from the upstream repository[3], runc
    is installed as part of the containerd.io package at /usr/bin/runc.
    In that setup runc still runs unconfined and has no issues sending
    signals to containers.
    
    [1] https://gitlab.com/apparmor/apparmor/-/commit/2594d936
    [2] https://github.com/iovisor/bcc/blob/master/tools/killsnoop.py
    [3] https://download.docker.com/linux/ubuntu
    [4] https://github.com/containers/crun
    
    Signed-off-by: Tomáš Virtus <nechtom@gmail.com>
    woky committed Apr 24, 2024