/usr/sbin/runc is confined with the "runc" profile[1] introduced in
AppArmor v4.0.0. That confinement breaks stopping of containers, because
the profile assigned to containers doesn't accept signals from the "runc"
peer.
AppArmor >= v4.0.0 is currently part of Ubuntu Mantic (23.10) and later.

In the case of Docker, this regression is hidden by the fact that
dockerd itself sends SIGKILL to the running container after runc fails
to stop it. It is still a regression, because graceful shutdowns of
containers via "docker stop" are no longer possible, as SIGTERM from
runc is not delivered to them. This can be seen in the dockerd logs when
dockerd runs with debug logging enabled, and by tracing signals with the
killsnoop utility from bcc[2] (in the bpfcc-tools package in
Debian/Ubuntu):

Test commands:
    root@cloudimg:~# docker run -d --name test redis
    ba04c137827df8468358c274bc719bf7fc291b1ed9acf4aaa128ccc52816fe46
    root@cloudimg:~# docker stop test

Relevant syslog messages (with wrapped long lines):
    Apr 23 20:45:26 cloudimg kernel: audit:
      type=1400 audit(1713905126.444:253): apparmor="DENIED"
      operation="signal" class="signal" profile="docker-default" pid=9289
      comm="runc" requested_mask="receive" denied_mask="receive"
      signal=kill peer="runc"
    Apr 23 20:45:36 cloudimg dockerd[9030]:
      time="2024-04-23T20:45:36.447016467Z"
      level=warning msg="Container failed to exit within 10s of kill - trying direct SIGKILL"
      container=ba04c137827df8468358c274bc719bf7fc291b1ed9acf4aaa128ccc52816fe46
      error="context deadline exceeded"

Killsnoop output after "docker stop ...":
    root@cloudimg:~# killsnoop-bpfcc
    TIME      PID      COMM             SIG  TPID     RESULT
    20:51:00  9631     runc             3    9581     -13
    20:51:02  9637     runc             9    9581     -13
    20:51:12  9030     dockerd          9    9581     0

This change extends the docker-default profile with rules that allow
receiving signals from processes that run confined with either the runc
or the crun profile (crun[4] is an alternative OCI runtime that's also
confined in AppArmor >= v4.0.0, see [1]). It is backward compatible
because the peer value is a regular expression (AARE), so the referenced
profile doesn't have to exist for this profile to successfully compile
and load.
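
As an illustration, rules of the following form would grant the access
described above. This is a minimal sketch in AppArmor policy syntax, not
necessarily the exact text of the patch; the comments are illustrative
only:

```apparmor
# Allow processes confined by the "runc" profile to signal container
# processes (needed for "docker stop").
signal (receive) peer=runc,
# Same for crun, when it is used as the OCI runtime.
signal (receive) peer=crun,
```

On hosts where no profile named runc or crun is loaded, such rules
should simply never match.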

Note that the runc profile attaches to /usr/sbin/runc, which is the path
where the runc package in Debian/Ubuntu installs the binary. When the
docker-ce package is installed from the upstream repository[3], runc is
installed as part of the containerd.io package at /usr/bin/runc. In that
case runc still runs unconfined and has no issues sending signals to
containers.

[1] https://gitlab.com/apparmor/apparmor/-/commit/2594d936
[2] https://github.com/iovisor/bcc/blob/master/tools/killsnoop.py
[3] https://download.docker.com/linux/ubuntu
[4] https://github.com/containers/crun

Signed-off-by: Tomáš Virtus <nechtom@gmail.com>