Nesting apptainer instance start fails inside of container on cgroups-v2 capable host #2164

Closed
DrDaveD opened this issue Apr 19, 2024 · 3 comments · Fixed by #2220

@DrDaveD
Contributor

DrDaveD commented Apr 19, 2024

Version of Apptainer

apptainer-1.3.0

Expected behavior

I expect to be able to run an Apptainer instance nested inside another container that is itself run by Apptainer.

Actual behavior

I get these messages and a failure:

INFO:    Cleanup error: while stopping driver for /scratch/tmp/app/x86_64/var/lib/apptainer/mnt/session/rootfs: squashfuse_ll exited
ERROR:   container cleanup failed: no instance found with name tst
FATAL:   container creation failed: while applying cgroups config: while creating cgroup manager: systemd not running on this host, cannot use systemd cgroups manager

FATAL:   while executing starter: failed to start instance: while running /scratch/tmp/app/x86_64/libexec/apptainer/bin/starter: exit status 255

Steps to reproduce this behavior

On an EL9 host where cgroups v2 is available, use install-unprivileged.sh to install apptainer-1.3.0 into an app subdirectory, then run:

$ app/bin/apptainer shell docker://rockylinux:9
INFO:    Using cached SIF image
Apptainer> app/bin/apptainer instance start docker://rockylinux:9 tst

What OS/distro are you running

EL9

How did you install Apptainer

install-unprivileged.sh

@DrDaveD
Contributor Author

DrDaveD commented Apr 19, 2024

A workaround is to run unset DBUS_SESSION_BUS_ADDRESS inside the outer container before running instance start.
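
A minimal sketch of that workaround, scoped to a single command so the variable stays set in the rest of the shell session. The helper name run_without_dbus is hypothetical and not part of Apptainer; it only wraps the unset in a subshell.

```shell
# Hypothetical helper (not part of Apptainer): run one command with
# DBUS_SESSION_BUS_ADDRESS removed from its environment. The subshell keeps
# the variable intact in the calling shell.
run_without_dbus() {
    (
        unset DBUS_SESSION_BUS_ADDRESS
        "$@"
    )
}

# Usage inside the outer container, reusing the command from the report:
#   run_without_dbus app/bin/apptainer instance start docker://rockylinux:9 tst
```

Running a plain unset DBUS_SESSION_BUS_ADDRESS first has the same effect, but changes the environment for every later command in that shell.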

@DrDaveD
Contributor Author

DrDaveD commented May 7, 2024

Perhaps if systemd is not available but DBUS_SESSION_BUS_ADDRESS is set, it should just be a warning and not a fatal error.
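
Until something like that exists, the same idea can be approximated on the user side. The sketch below is an assumption-laden illustration, not Apptainer's code: the function name drop_stale_dbus_address is hypothetical, and it treats the presence of the /run/systemd/system directory (the convention documented for sd_booted(3)) as the test for a running systemd. Whether Apptainer uses the same test is not confirmed in this thread.

```shell
# Hypothetical user-side check (not Apptainer's implementation): if a D-Bus
# session address is set but systemd does not appear to be running, print a
# warning and drop the variable instead of failing later.
# $1 optionally overrides the directory used as the "systemd is running"
# marker; the default follows the sd_booted(3) convention.
drop_stale_dbus_address() {
    marker="${1:-/run/systemd/system}"
    if [ -n "${DBUS_SESSION_BUS_ADDRESS-}" ] && [ ! -d "$marker" ]; then
        echo "WARNING: DBUS_SESSION_BUS_ADDRESS is set but systemd does not appear to be running; unsetting it" >&2
        unset DBUS_SESSION_BUS_ADDRESS
    fi
}

# Usage inside the outer container, before the command from the report:
#   drop_stale_dbus_address
#   app/bin/apptainer instance start docker://rockylinux:9 tst
```

The function must be called in the current shell, not in a subshell or pipeline, or the unset will not carry over to the following command.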

@JasonYangShadow
Member

I can reproduce the same issue on Rocky Linux 9 with apptainer 1.3.1:

[vagrant@localhost ~]$ curl -s https://raw.githubusercontent.com/apptainer/apptainer/main/tools/install-unprivileged.sh | bash -s - app
Extracting https://linux-mirrors.fnal.gov/linux/fedora/epel/7/x86_64/Packages/a/apptainer-1.3.1-1.el7.x86_64.rpm
242542 blocks
Extracting https://linux-mirrors.fnal.gov/linux/centos/7/os/x86_64/Packages/lzo-2.06-8.el7.x86_64.rpm
326 blocks
Extracting https://linux-mirrors.fnal.gov/linux/centos/7/os/x86_64/Packages/squashfs-tools-4.3-0.21.gitaae0aff4.el7.x86_64.rpm
430 blocks
Extracting https://linux-mirrors.fnal.gov/linux/centos/7/os/x86_64/Packages/libseccomp-2.3.1-4.el7.x86_64.rpm
597 blocks
Extracting https://linux-mirrors.fnal.gov/linux/fedora/epel/7/x86_64/Packages/f/fakeroot-libs-1.26-4.el7.x86_64.rpm
218 blocks
Extracting https://linux-mirrors.fnal.gov/linux/fedora/epel/7/x86_64/Packages/f/fakeroot-1.26-4.el7.x86_64.rpm
296 blocks
Extracting https://linux-mirrors.fnal.gov/linux/fedora/epel/7/x86_64/Packages/l/libzstd-1.5.5-1.el7.x86_64.rpm
1552 blocks
Extracting https://linux-mirrors.fnal.gov/linux/fedora/epel/7/x86_64/Packages/f/fuse3-libs-3.6.1-2.el7.x86_64.rpm
543 blocks
Patching fakeroot-sysv to make it relocatable
Creating bin/apptainer and bin/singularity
Installation complete in app
[vagrant@localhost ~]$ app/bin/apptainer shell docker://rockylinux:9
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
Copying blob 489e1be6ce56 skipped: already exists
Copying config b6259d707a done   |
Writing manifest to image destination
2024/05/10 03:28:08  info unpack layer: sha256:489e1be6ce56f590a5a31bdf814671cac006421930c1175cb62e1763bf51a3f9
INFO:    Creating SIF file...
Apptainer> app/bin/apptainer instance start docker://rockylinux:9 tst
INFO:    Using cached SIF image
INFO:    Terminating squashfuse_ll after timeout
INFO:    Timeouts can be caused by a running background process
ERROR:   container cleanup failed: no instance found with name tst
FATAL:   container creation failed: while applying cgroups config: while creating cgroup manager: systemd not running on this host, cannot use systemd cgroups manager

FATAL:   while executing starter: failed to start instance: while running /home/vagrant/app/x86_64/libexec/apptainer/bin/starter: exit status 255
Apptainer>
[vagrant@localhost ~]$ cat /etc/os-release
NAME="Rocky Linux"
VERSION="9.4 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.4 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2032-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.4"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.4"
[vagrant@localhost ~]$ app/bin/apptainer version
1.3.1-1.el7
[vagrant@localhost ~]$
