using this on debian 11 aka. bullseye is resulting in a non-systemd #4

Open
zerwes opened this issue Jan 21, 2022 · 49 comments

@zerwes

zerwes commented Jan 21, 2022

System

Debian 11 aka. bullseye with the debian docker.io packages. (more details later)

Description

While trying to use the image directly in docker or via molecule, the image starts, but it seems it is not systemd enabled, resulting in failed test runs.

A self-built Docker image based on the official debian:bullseye, by contrast, works as expected.
But to be honest, docker is really not my area of expertise...

The issue does not seem to be limited to the debian11 image; others such as geerlingguy/docker-centos8-ansible, geerlingguy/docker-ubuntu2004-ansible, etc. seem to be affected too.

Steps to reproduce

$ docker run --detach --privileged --volume=/sys/fs/cgroup:/sys/fs/cgroup:ro geerlingguy/docker-debian11-ansible:latest
0c103204a41a3dd1487ab70813ac5fd4480f3f9e904f70cbe0c8a2b02443d986
$ docker ps
CONTAINER ID   IMAGE                                        COMMAND                  CREATED          STATUS          PORTS     NAMES
0c103204a41a   geerlingguy/docker-debian11-ansible:latest   "/lib/systemd/systemd"   39 seconds ago   Up 38 seconds             jovial_wilson
$ docker exec --tty 0c103204a41a /bin/systemctl status
Failed to connect to bus: No such file or directory

Test with own dilettantic build

$ cat Dockerfile 

FROM debian:bullseye

ENV container docker
ENV LC_ALL C
ENV DEBIAN_FRONTEND noninteractive

RUN apt-get update \
    && apt-get install -y python3 sudo bash ca-certificates iproute2 python3-apt aptitude systemd systemd-sysv \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

RUN rm -f /lib/systemd/system/multi-user.target.wants/* \
    /etc/systemd/system/*.wants/* \
    /lib/systemd/system/local-fs.target.wants/* \
    /lib/systemd/system/sockets.target.wants/*udev* \
    /lib/systemd/system/sockets.target.wants/*initctl* \
    /lib/systemd/system/sysinit.target.wants/systemd-tmpfiles-setup* \
    /lib/systemd/system/systemd-update-utmp*

RUN systemctl set-default multi-user.target

#VOLUME [ "/sys/fs/cgroup" ]

CMD [ "/lib/systemd/systemd", "log-level=info", "unit=sysinit.target" ]


$ docker build .
Sending build context to Docker daemon  3.072kB
Step 1/8 : FROM debian:bullseye
...
Successfully built eb8ff56c63ab

$ docker tag eb8ff56c63ab test-deb11-systemd

$ docker  run --detach --privileged  --name test-deb11-systemd test-deb11-systemd
7b0afaa24585c10a5ddcab18c0b1d06aef23501282dc0e8918e505784862a2a8

$ docker exec --tty 7b0afaa24585c10a5ddcab18c0b1d06aef23501282dc0e8918e505784862a2a8 /bin/systemctl status
* 7b0afaa24585
    State: running
     Jobs: 0 queued
   Failed: 0 units
    Since: Fri 2022-01-21 21:50:32 UTC; 6s ago
   CGroup: /
           |-init.scope 
           | |- 1 /lib/systemd/systemd log-level=info unit=sysinit.target
           | |-35 /bin/systemctl status
           | `-42 (pager)
           `-system.slice 
             `-systemd-journald.service 
               `-26 /lib/systemd/systemd-journald

Distro and Packages:

Distributor ID:	Debian
Description:	Debian GNU/Linux 11 (bullseye)
Release:	11
Codename:	bullseye


Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                      Version                 Architecture Description
+++-=========================-=======================-============-=====================================================
ii  docker                    1.5-2                   all          transitional package
ii  docker.io                 20.10.5+dfsg1-1+deb11u1 amd64        Linux container runtime
ii  python3-docker            4.1.0-1.2               all          Python 3 wrapper to access docker.io's control socket

check-config

$ /usr/share/docker.io/contrib/check-config.sh
warning: /proc/config.gz does not exist, searching other paths for kernel config ...
info: reading kernel config from /boot/config-5.10.0-10-amd64 ...

Generally Necessary:
- cgroup hierarchy: cgroupv2
- apparmor: enabled, but apparmor_parser missing
    (use "apt-get install apparmor" to fix this)
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: enabled
- CONFIG_KEYS: enabled
- CONFIG_VETH: enabled (as module)
- CONFIG_BRIDGE: enabled (as module)
- CONFIG_BRIDGE_NETFILTER: enabled (as module)
- CONFIG_IP_NF_FILTER: enabled (as module)
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_IPVS: enabled (as module)
- CONFIG_NETFILTER_XT_MARK: enabled (as module)
- CONFIG_IP_NF_NAT: enabled (as module)
- CONFIG_NF_NAT: enabled (as module)
- CONFIG_POSIX_MQUEUE: enabled

Optional Features:
- CONFIG_USER_NS: enabled
- CONFIG_SECCOMP: enabled
- CONFIG_CGROUP_PIDS: enabled
- CONFIG_MEMCG_SWAP: enabled
    (cgroup swap accounting is currently enabled)
- CONFIG_LEGACY_VSYSCALL_NONE: enabled
    (containers using eglibc <= 2.13 will not work. Switch to
     "CONFIG_VSYSCALL_[NATIVE|EMULATE]" or use "vsyscall=[native|emulate]"
     on kernel command line. Note that this will disable ASLR for the,
     VDSO which may assist in exploiting security vulnerabilities.)
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: enabled
- CONFIG_NET_CLS_CGROUP: enabled (as module)
- CONFIG_CGROUP_NET_PRIO: enabled
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: missing
- CONFIG_IP_NF_TARGET_REDIRECT: enabled (as module)
- CONFIG_IP_VS: enabled (as module)
- CONFIG_IP_VS_NFCT: enabled
- CONFIG_IP_VS_PROTO_TCP: enabled
- CONFIG_IP_VS_PROTO_UDP: enabled
- CONFIG_IP_VS_RR: enabled (as module)
- CONFIG_EXT4_FS: enabled (as module)
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
  - "overlay":
    - CONFIG_VXLAN: enabled (as module)
    - CONFIG_BRIDGE_VLAN_FILTERING: enabled
      Optional (for encrypted networks):
      - CONFIG_CRYPTO: enabled
      - CONFIG_CRYPTO_AEAD: enabled (as module)
      - CONFIG_CRYPTO_GCM: enabled (as module)
      - CONFIG_CRYPTO_SEQIV: enabled (as module)
      - CONFIG_CRYPTO_GHASH: enabled (as module)
      - CONFIG_XFRM: enabled
      - CONFIG_XFRM_USER: enabled (as module)
      - CONFIG_XFRM_ALGO: enabled (as module)
      - CONFIG_INET_ESP: enabled (as module)
  - "ipvlan":
    - CONFIG_IPVLAN: enabled (as module)
  - "macvlan":
    - CONFIG_MACVLAN: enabled (as module)
    - CONFIG_DUMMY: enabled (as module)
  - "ftp,tftp client in container":
    - CONFIG_NF_NAT_FTP: enabled (as module)
    - CONFIG_NF_CONNTRACK_FTP: enabled (as module)
    - CONFIG_NF_NAT_TFTP: enabled (as module)
    - CONFIG_NF_CONNTRACK_TFTP: enabled (as module)
- Storage Drivers:
  - "aufs":
    - CONFIG_AUFS_FS: missing
  - "btrfs":
    - CONFIG_BTRFS_FS: enabled (as module)
    - CONFIG_BTRFS_FS_POSIX_ACL: enabled
  - "devicemapper":
    - CONFIG_BLK_DEV_DM: enabled (as module)
    - CONFIG_DM_THIN_PROVISIONING: enabled (as module)
  - "overlay":
    - CONFIG_OVERLAY_FS: enabled (as module)
  - "zfs":
    - /dev/zfs: missing
    - zfs command: missing
    - zpool command: missing

Limits:
- /proc/sys/kernel/keys/root_maxkeys: 1000000

@geerlingguy
Owner

Can you confirm your molecule config looks something like the following? https://github.com/geerlingguy/ansible-role-apache/blob/master/molecule/default/molecule.yml#L7-L12

@zerwes
Author

zerwes commented Jan 22, 2022

Yes. Here the relevant part from a failing example:

  - name: keepalived-bionic
    pre_build_image: yes
    image: geerlingguy/docker-ubuntu1804-ansible:latest
    privileged: true
    command: /lib/systemd/systemd
    volumes:
    - /sys/fs/cgroup:/sys/fs/cgroup:ro

and a task that enables a service via systemd fails with:

"stderr_lines": ["Failed to connect to bus: No such file or directory"]

@geerlingguy
Owner

@zerwes - Can you try changing the command to match what I have set up in mine?

@zerwes
Author

zerwes commented Jan 22, 2022

Hello @geerlingguy
Unfortunately, that makes no difference:

@@ -71,6 +71,6 @@ platforms:
     pre_build_image: yes
     image: geerlingguy/docker-ubuntu2004-ansible:latest
     privileged: true
-    command: /lib/systemd/systemd
+    command: ${MOLECULE_DOCKER_COMMAND:-""}
     volumes:
     - /sys/fs/cgroup:/sys/fs/cgroup:ro

but on the invocation of systemctl: "rc": 1, "stderr": "Failed to connect to bus: No such file or directory"

@evrardjp

/me watches this :)

@stefanDeveloper

I ran into the same problem as @zerwes.

@zerwes
Author

zerwes commented Mar 9, 2022

@stefanDeveloper is something like the docker file mentioned in the description of the issue or like https://github.com/Rosa-Luxemburgstiftung-Berlin/ansible-role-unbound/blob/main/molecule/default/Dockerfile-debian-bullseye.j2 working for you?

@stefanDeveloper

@zerwes you saved my week, thanks that works like a charm!

@zerwes
Author

zerwes commented Mar 9, 2022

@stefanDeveloper glad to hear it helped.
And maybe it helps @geerlingguy drill down into the problem ...

@widhalmt

widhalmt commented Mar 24, 2022

I have a problem similar to the one @zerwes describes (Hi, by the way :-) ) in NETWAYS/ansible-role-elasticsearch#53.

As another change that might have an influence I had to remove the following lines because it made starting Elasticsearch in the containers impossible on CentOS:

     volumes:
     - /sys/fs/cgroup:/sys/fs/cgroup:ro

Since I removed that, CentOS tests succeed but Debian ones fail. I put some debugging code into my roles to put out what's wrong. What I'm seeing is:

  fatal: [elasticsearch-cluster2]: FAILED! => {"changed": false, "cmd": "/bin/systemctl", "msg": "Failed to connect to bus: No such file or directory", "rc": 1, "stderr": "Failed to connect to bus: No such file or directory\n", "stderr_lines": ["Failed to connect to bus: No such file or directory"], "stdout": "", "stdout_lines": []}

I suspect the two containers are built differently, so that what fixes the problem for one breaks the other?

@zerwes
Author

zerwes commented Mar 24, 2022

Hello @widhalmt, is something like the docker file mentioned in the description of the issue or like https://github.com/Rosa-Luxemburgstiftung-Berlin/ansible-role-unbound/blob/main/molecule/default/Dockerfile-debian-bullseye.j2 working for you?

@widhalmt

@zerwes So you mean, disabling mounting cgroups? As far as I understood the information from https://discuss.elastic.co/t/error-when-running-7-12-1-on-centos-7-in-docker/271508 the problem was cgroups being mounted in two parts of the test. Looks like you disabled it in the Docker file, I disabled it in molecule.yml. My approach did work with CentOS but not with Debian.

I use several containers by @geerlingguy so I can't easily exchange the container I'm using. I could give it a try to exchange it temporarily, though.

@widhalmt

widhalmt commented Mar 24, 2022

I'm seeing the same effect with Rocky Linux 8 now, too. After removing the mount for cgroups in molecule.yml CentOS 7 works again but Debian 10, Debian 11 and Rocky Linux 8 fail.

@zerwes
Author

zerwes commented Mar 25, 2022

> I use several containers by @geerlingguy so I can't easily exchange the container I'm using. I could give it a try to exchange it temporarily, though.

My intention is certainly not to replace the widely used Docker images (my Docker-fu is much too weak for that, as I consider myself just an average user in this area); I just wanted to give @geerlingguy a hint and some help on what works and what does not ...

@geerlingguy
Owner

geerlingguy commented Mar 25, 2022

What's weird is I'm using the same containers on a ton of my projects and not (seemingly) running into the same issues that are mentioned here.

(Edit: Though I'm running them either from macOS or from Ubuntu...)

@widhalmt

>> I use several containers by @geerlingguy so I can't easily exchange the container I'm using. I could give it a try to exchange it temporarily, though.

> My intention is certainly not to replace the widely used Docker images (my Docker-fu is much too weak for that, as I consider myself just an average user in this area); I just wanted to give @geerlingguy a hint and some help on what works and what does not ...

Sorry, that was just me being unclear in my reply. I understood that you only proposed that for testing, not as a complete replacement. What I forgot to mention is that I'm using them in a matrix check with different OSes and I can't easily replace a single one, because it wouldn't even start. I need time to change the whole CI configuration to use the container in a test.

@widhalmt

widhalmt commented Mar 25, 2022

@geerlingguy I really don't get it either. I see the problems mostly when running them and starting Elasticsearch in GitHub Actions. For now it works flawlessly with CentOS 7 (when I remove the cgroup mount in molecule.yml), but it breaks on Rocky Linux 8, Debian 10 and Debian 11.

@tbumke

tbumke commented Mar 25, 2022

I get a very similar error, "failure 1 during daemon-reload: Failed to get D-Bus connection: No such file or directory", but only when running Molecule tests locally on macOS. In GitHub Actions the same configuration works with CentOS 7 and Rocky Linux 8. I first thought this had something to do with the Docker implementation on macOS (Docker Desktop vs. native Docker runtime), but I'm not so sure anymore.

@geerlingguy
Owner

@tbumke - On macOS, that has to do with the implementation of cgroups v2 in Docker for Mac. I believe there's a way to work around it...

@Paul-Weisser

Paul-Weisser commented Mar 25, 2022

@widhalmt @zerwes apologies if I have overlooked this but which host system are you using? I ran into the same issues and decided to give up on this matter, just watching this issue.

I am trying to run this in WSL2 on either Windows 10 or 11, which results in Debian-based containers not starting with systemd or not starting at all. All I have found online about this is that, for some reason, WSL2 seems unable to handle this kind of virtualization.

If it's a Windows virtualization issue, that would explain why it works fine on (most) Macs and on Jeff's Ubuntu.

@zerwes
Author

zerwes commented Mar 25, 2022

@Paul-Weisser my first touch with this was running a debian 11 container on debian 11 ...

@tbumke

tbumke commented Mar 26, 2022

> @tbumke - On macOS, that has to do with the implementation of cgroups v2 in Docker for Mac. I believe there's a way to work around it...

Thanks @geerlingguy , this pointed me in the right direction. Searching for cgroups v2 and Docker for Mac, I found this issue docker/for-mac#6073 which also describes a workaround.

Configuring "deprecatedCgroupv1": true (note the missing "s") in ~/Library/Group\ Containers/group.com.docker/settings.json tells Docker for Mac to use legacy cgroups v1. This of course is only a temporary fix until Ansible Molecule supports the cgroupns Docker parameter.

Running the container as follows and with cgroups v2 now also works in my setup:

docker run -it --privileged --cgroupns=host -v /sys/fs/cgroup:/sys/fs/cgroup:rw \
  --name instance -d geerlingguy/docker-debian11-ansible

Note also, that the sysfs volume permissions need to be changed to rw as well. Then I can successfully run systemd services and commands from the container.
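
As a rough sketch of the workaround above, the flag choice can be wrapped in a small helper. The function name `systemd_run_flags` is hypothetical, and the `v1` branch assumes that the original read-only mount is sufficient on hosts using cgroups v1, which this thread suggests but does not prove.

```shell
# Print `docker run` flags for a systemd container, following the
# workarounds reported in this thread (not an official recommendation).
# Argument: "v1" or "v2", the cgroup version used by the Docker host.
systemd_run_flags() {
  case "$1" in
    # cgroups v2: share the host cgroup namespace and mount read-write.
    v2) echo "--privileged --cgroupns=host -v /sys/fs/cgroup:/sys/fs/cgroup:rw" ;;
    # cgroups v1 (assumption): the original read-only mount.
    v1) echo "--privileged -v /sys/fs/cgroup:/sys/fs/cgroup:ro" ;;
    *)  echo "usage: systemd_run_flags v1|v2" >&2; return 1 ;;
  esac
}

# Example (not executed here):
#   docker run -d $(systemd_run_flags v2) --name instance geerlingguy/docker-debian11-ansible
```

The helper only prints flags; it deliberately does not start a container, so it can be inspected before use.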

@widhalmt

widhalmt commented Apr 1, 2022

Thanks @tbumke !!

Changing

    volumes:
    - /sys/fs/cgroup:/sys/fs/cgroup:ro

to

    volumes:
    - /sys/fs/cgroup:/sys/fs/cgroup:rw

did the trick!

Now I only have to find a way around an Elasticsearch bug (elastic/elasticsearch#74158) that keeps multiple instances from starting because the Java option parser prints to stdout instead of a file. But that only hits when I fire up several containers in a single test and won't keep me from proceeding with the other roles. Thank you everyone, that kept me in a constant state of rage for weeks now. :-)

@widhalmt

widhalmt commented Apr 1, 2022

Ok, guess now I'm completely lost. Now it works sometimes and sometimes it doesn't. I'll have to take a deeper look, sorry.

@staticdev

staticdev commented Apr 9, 2022

+1 have the same running from debian11. I believe since this image mounts cgroups into the image as a volume, it will have different results if you have different versions of cgroups in your host system. Should it work only on cgroupsv1?
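
To compare hosts as suggested here, it helps to know which cgroup version each one uses. A minimal sketch, assuming the usual mount point /sys/fs/cgroup: a cgroups v2 (unified) mount exposes a `cgroup.controllers` file at its top level, while a v1 mount does not. The function name `cgroup_version` and its optional path argument are illustrative only, and hybrid setups are reported as `v1`.

```shell
# Report which cgroup hierarchy is mounted under a given directory.
# Default directory: /sys/fs/cgroup.
cgroup_version() {
  root="${1:-/sys/fs/cgroup}"
  if [ -f "$root/cgroup.controllers" ]; then
    # The unified (v2) hierarchy lists its controllers in this file.
    echo "v2"
  elif [ -d "$root" ]; then
    # A mount point without that file is treated as v1 (or hybrid).
    echo "v1"
  else
    echo "unknown"
  fi
}

# Example (not executed here):
#   cgroup_version
```

On the Debian 11 host from the issue description, whose check-config output reports "cgroup hierarchy: cgroupv2", this would be expected to print `v2`.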

@barrelful

I also have this. Is there any information I can provide to help get this fixed, @geerlingguy?

@geerlingguy
Owner

geerlingguy commented Apr 10, 2022

As I've said before, I haven't had any issues running this with systemd (for example, see my Docker role: https://github.com/geerlingguy/ansible-role-docker/blob/master/.github/workflows/ci.yml#L48 / https://github.com/geerlingguy/ansible-role-docker/runs/5959693637?check_suite_focus=true)

If someone can get a reproducible fault that works with the base image and the same kind of setup I'm using, that would be helpful.

(Another note: it seems cgroups v2 might be the main culprit for some people...)

@mhdan

mhdan commented Jun 20, 2022

I faced a problem similar to this issue, but with a small difference. I have systemd running in the container and have no problem with it, but loginctl fails with this output:

root@test-debian:/# loginctl
Failed to create bus connection: No such file or directory

My host OS is Ubuntu 20.04.

@dataoscar

dataoscar commented Jul 2, 2022

I faced a similar issue on macOS with an M1 Mac. My workaround was to add the following setting in the Docker Engine configuration:

"default-cgroupns-mode": "host"

I then had to change the bind mount to be rw instead of ro:

    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:rw

Edit: Thanks @tbumke for the hints that led to getting this working.

@staticdev

@dataoscar but we can't configure the Docker Engine on GitHub Actions to run Molecule, or can we?

@tbumke

tbumke commented Jul 4, 2022

I've never had any issues with whatever docker engine GitHub Actions uses and Molecule for Ansible using @geerlingguy's Molecule config as a template. I.e., with the platforms block looking like this,

platforms:
  - name: instance
    image: "geerlingguy/docker-${MOLECULE_DISTRO:-centos7}-ansible:latest"
    command: ${MOLECULE_DOCKER_COMMAND:-""}
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: true

In my setup, cgroups only caused issues on Docker Desktop for Mac.

jkirk added a commit to jkirk/ansible-role-checkmkagent that referenced this issue Jul 7, 2022
Fixed:

* MD040/fenced-code-language Fenced code blocks should have a language specified [Context: "```"]
* MD034/no-bare-urls Bare URL used [Context: "https://synpro.solutions"]
* SC1091 (info): Not following: ./megacli.cfg: openBinaryFile: does not exist (No such file or directory)
* SC2086 (info): Double quote to prevent globbing and word splitting.
* SC2236 (style): Use -n instead of ! -z.

Removed ansible molecule because I was not able to install checkmkagent
in the docker environment. Mainly because the docker image has no systemd
enabled, see: geerlingguy/docker-debian11-ansible#4

Also, there used to be public Checkmk demo instance: https://demo.checkmk.com/demo
But it is not reachable anymore, so I have no way downloading the
checkmkagent. Decided to not put any more efforts into it.

@aussielunix

aussielunix commented Aug 21, 2022

I have a Fedora 36 host and an Ubuntu 22.04 host, and I get the same issue on both when testing with https://github.com/geerlingguy/molecule-playbook-testing

The container is not actually running systemd completely.

MOLECULE_DISTRO=debian11 molecule converge
...
...
MOLECULE_DISTRO=debian11 molecule login
root@instance:/# ps faxwww
    PID TTY      STAT   TIME COMMAND
   1421 pts/0    Ss     0:00 bash
   1429 pts/0    R+     0:00  \_ ps faxwww
      1 ?        Ss     0:00 /lib/systemd/systemd
root@instance:/# systemctl status 
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
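
The "has not been booted with systemd" message is tied to systemd's `sd_booted()` check, which looks for the directory /run/systemd/system. A small sketch of the same test, with a hypothetical function name and an optional root prefix added only so that it can be pointed at another filesystem tree:

```shell
# Succeed if the given root looks like a system booted with systemd,
# using the same criterion as sd_booted(): /run/systemd/system exists.
# Argument: optional root prefix (default: the running system).
booted_with_systemd() {
  root="${1:-}"
  [ -d "$root/run/systemd/system" ]
}

# Example (not executed here), checking inside a running container
# whose name "instance" is taken from the Molecule config in this thread:
#   docker exec instance test -d /run/systemd/system && echo "systemd is up"
```

If the directory is missing although /lib/systemd/systemd is PID 1, as in the output above, systemd most likely stopped very early in its startup.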

This change/diff, taken from what @zerwes posted at the top of this issue, is what fixes it for me.

# Dockerfile
...
...
-COPY initctl_faker .
-RUN chmod +x initctl_faker && rm -fr /sbin/initctl && ln -s /initctl_faker /sbin/initctl
 
 # Install Ansible inventory file.
 RUN mkdir -p /etc/ansible
 RUN echo "[local]\nlocalhost ansible_connection=local" > /etc/ansible/hosts
 
+RUN systemctl set-default multi-user.target
 
-VOLUME ["/sys/fs/cgroup"]
 CMD ["/lib/systemd/systemd"]

and

# molecule/default/molecule.yml
...
...
   ansible-lint
 platforms:
   - name: instance
     image: geerlingguy/docker-${MOLECULE_DISTRO:-centos8}-ansible:latest
     command: ""
-    volumes:
-      - /sys/fs/cgroup:/sys/fs/cgroup:ro
     privileged: true
     pre_build_image: true
 provisioner:
   name: ansible

Rerunning with these changes results in

MOLECULE_DISTRO=debian11 molecule converge
...
...
MOLECULE_DISTRO=debian11 molecule login
root@instance:/# ps -faxwww 
    PID TTY      STAT   TIME COMMAND
   1796 pts/0    Ss     0:00 bash
   1808 pts/0    R+     0:00  \_ ps -faxwww
      1 ?        Ss     0:00 /lib/systemd/systemd
     25 ?        Ss     0:00 /lib/systemd/systemd-journald
   1719 ?        Ss     0:00 /usr/sbin/apache2 -k start
   1720 ?        Sl     0:00  \_ /usr/sbin/apache2 -k start
   1721 ?        Sl     0:00  \_ /usr/sbin/apache2 -k start
root@instance:/# systemctl status
● instance
    State: running
     Jobs: 0 queued
   Failed: 0 units
    Since: Sun 2022-08-21 06:10:29 UTC; 9min ago
   CGroup: /
           ├─init.scope 
           │ ├─   1 /lib/systemd/systemd
           │ ├─1796 bash
           │ ├─1809 systemctl status
           │ └─1810 (pager)
           └─system.slice 
             ├─apache2.service 
             │ ├─1719 /usr/sbin/apache2 -k start
             │ ├─1720 /usr/sbin/apache2 -k start
             │ └─1721 /usr/sbin/apache2 -k start
             └─systemd-journald.service 
               └─25 /lib/systemd/systemd-journald

@lapin-b

lapin-b commented Aug 23, 2022

@aussielunix's solution works on my side. I cloned the repository, applied the edits described above, and rebuilt a Docker image that I was able to use in Molecule.

The service I intend to test is started correctly and molecule converge exits with status zero.

Installation information:

  • Fedora 35 with (probably) cgroup v2 (no kernel parameter related to that)
  • Vanilla Docker installation 20.10.17
  • Ansible 2.9.27 (from dnf)
  • Molecule 3.5.2 (from dnf)

@aussielunix

aussielunix commented Aug 24, 2022

I have learnt some more since I posted above.

This, from Lennart Poettering, says that bind mounting the /sys/fs/cgroup hierarchy is never going to work if cgroup namespaces are used.

Notes:

This is my new molecule.yml.

---

dependency:
  name: galaxy
driver:
  name: docker
lint: |
  set -e
  ansible-lint
platforms:
  - name: instance
    image: "registry.gitlab.com/aussielunix/ansible/molecule-containers/${MOLECULE_DISTRO:-debian:bullseye}"
    privileged: true
    pre_build_image: true
    override_command: false
    tmpfs:
      - /run
      - /tmp
provisioner:
  name: ansible
  log: ${MOLECULE_ANSIBLE_LOG:-true}
  env:
    ANSIBLE_VERBOSITY: ${MOLECULE_ANSIBLE_VERBOSITY:-0}
verifier:
  name: ansible

Examples of using these:

# test with default debian:bullseye
molecule test

# test with default debian:bullseye but silence Ansible logs
MOLECULE_ANSIBLE_LOG=false molecule test

# add -vv to ansible and test with default debian:bullseye
MOLECULE_ANSIBLE_VERBOSITY=2 molecule test

# test with ubuntu:jammy
MOLECULE_DISTRO="ubuntu:jammy" molecule test

# add -vvv to ansible and test with rockylinux:9
MOLECULE_ANSIBLE_VERBOSITY=3 MOLECULE_DISTRO="rockylinux:9" molecule test

@jkirk

jkirk commented Sep 25, 2022

@aussielunix Thx for the investigation! The Debian/buster container works fine, but it seems like Debian/bullseye is missing in the gitlab registry: https://gitlab.com/aussielunix/ansible/molecule-containers/container_registry/3343441

jkirk added a commit to jkirk/ansible-role-base that referenced this issue Sep 25, 2022
Using geerlingguy/docker-debian10-ansible +
geerlingguy/docker-debian11-ansible in Ansible molecule currently do not
work with systemd, see: geerlingguy/docker-debian11-ansible#4.

Instead, took Dockerfiles from @aussielunix (Thx!) found here (but removed
'Australia/Sydney' timezone):

* https://gitlab.com/aussielunix/ansible/molecule-containers/-/blob/main/debian/buster/Dockerfile
* https://gitlab.com/aussielunix/ansible/molecule-containers/-/blob/main/debian/bullseye/Dockerfile

Compare with @geerlingguy's current Dockerfiles:

* https://github.com/geerlingguy/docker-debian10-ansible/blob/6f6a1650421afc953eb11439db9e5dabcc4d3afe/Dockerfile
* https://github.com/geerlingguy/docker-debian11-ansible/blob/101602c7b9e7b3e100b7435eaa455b94189b2d47/Dockerfile

Note, that when using `dockerfile`, `image` seems to be needed too.
Used `docker.io/debian:$DISTRIBUTION-slim` for `image` as they are the
base images for @aussielunix's Dockerfiles.

I could have used @aussielunix's gitlab container registry as `image`, but
currently 'debian:bullseye' is missing:

* https://gitlab.com/aussielunix/ansible/molecule-containers/container_registry/3343441

See: geerlingguy/docker-debian11-ansible#4 (comment)

@aussielunix

@jkirk ahh, the auto-pruning was set too aggressively.
I have relaxed it and triggered new containers to be built.

@staticdev

staticdev commented Oct 26, 2022

I was finally able to verify that this issue goes away with the following changes. Ref: ansible/molecule#3632

  • Upgrade to molecule 4.0.3
  • Add cgroupns_mode: host
  • Change /sys/fs/cgroup:/sys/fs/cgroup:ro to /sys/fs/cgroup:/sys/fs/cgroup:rw

Example: ansible/molecule#3665 (comment)
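
For an existing molecule.yml, the mount change in the last step can be scripted. This is only a sketch under stated assumptions: the function names are hypothetical, the file is expected to contain the mount exactly as written in this thread, and adding `cgroupns_mode: host` is left as a manual edit that the second helper merely checks for.

```shell
# Rewrite the read-only cgroup bind mount in a Molecule config to read-write.
# A temp file is used instead of `sed -i`, whose syntax differs between
# GNU and BSD sed.
cgroup_mount_rw() {
  file="$1"
  tmp="$(mktemp)" || return 1
  sed 's#/sys/fs/cgroup:/sys/fs/cgroup:ro#/sys/fs/cgroup:/sys/fs/cgroup:rw#' "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return $status
}

# Succeed if the config already contains a `cgroupns_mode: host` line.
has_cgroupns_host() {
  grep -Eq '^[[:space:]]*cgroupns_mode:[[:space:]]*host[[:space:]]*$' "$1"
}

# Example (not executed here):
#   cgroup_mount_rw molecule/default/molecule.yml
#   has_cgroupns_host molecule/default/molecule.yml || echo "add cgroupns_mode: host"
```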

@geerlingguy
Owner

Indeed, I just noticed the update, tested it, and wrote this blog post: Docker and systemd, getting rid of dreaded 'Failed to connect to bus' error.

@alecunsolo

Hi. cgroupns_mode: host fixes the issue with systemctl, but other commands (like localectl and timedatectl) have the same problem. I get the same behavior with the Rocky Linux 9 container. Any suggestions are welcome.

@artis3n
Sponsor

artis3n commented Dec 4, 2022

I believe y'all are getting it working, but molecule v4.0.3 doesn't seem to be enough for me - I am getting an error that cgroupns_mode is not a supported option on community.docker.docker_container (bolding emphasis mine). What am I missing?

TASK [Wait for instance(s) creation to complete] *******************************
failed: [localhost] (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '933904515236.21247', 'results_file': '/home/artis3n/.ansible_async/933904515236.21247', 'changed': True, 'item': {'cgroupns_mode': 'host', 'command': '', 'image': 'geerlingguy/docker-debian11-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:rw']}, 'ansible_loop_var': 'item'}) => {"ansible_job_id": "933904515236.21247", "ansible_loop_var": "item", "attempts": 2, "changed": false, "finished": 1, "item": {"ansible_job_id": "933904515236.21247", "ansible_loop_var": "item", "changed": true, "failed": 0, "finished": 0, "item": {"cgroupns_mode": "host", "command": "", "image": "geerlingguy/docker-debian11-ansible:latest", "name": "instance", "pre_build_image": true, "privileged": true, "volumes": ["/sys/fs/cgroup:/sys/fs/cgroup:rw"]}, "results_file": "/home/artis3n/.ansible_async/933904515236.21247", "started": 1}, "msg": "Unsupported parameters for (community.docker.docker_container) module: cgroupns_mode. 
Supported parameters include: networks, privileged, read_only, security_opts, image, paused, env, publish_all_ports, cpuset_cpus, hostname, recreate, env_file, container_default_behavior, force_kill (forcekill), oom_killer, init, published_ports (ports), comparisons, cpu_quota, memory_swappiness, timeout, pull, entrypoint, ca_cert (cacert_path, tls_ca_cert), log_driver, kernel_memory, volume_driver, healthcheck, domainname, state, tls, use_ssh_client, labels, volumes, memory, stop_signal, ignore_image, auto_remove, uts, cpu_shares, debug, command, devices, restart_retries, cleanup, interactive, restart_policy, kill_signal, networks_cli_compatible, tty, restart, device_write_bps, output_logs, etc_hosts, docker_host (docker_url), memory_reservation, sysctls, memory_swap, dns_servers, detach, cpus, shm_size, keep_volumes, network_mode, volumes_from, client_cert (cert_path, tls_client_cert), cpu_period, client_key (key_path, tls_client_key), pids_limit, cgroup_parent, cap_drop, storage_opts, device_requests, removal_wait_timeout, command_handling, purge_networks, working_dir, runtime, ssl_version, api_version (docker_api_version), exposed_ports (expose, exposed), mac_address, groups, tls_hostname, validate_certs (tls_verify), links, oom_score_adj, dns_opts, default_host_ip, stop_timeout, device_write_iops, name, device_read_iops, ulimits, ipc_mode, pid_mode, mounts, userns_mode, log_options (log_opt), tmpfs, device_read_bps, capabilities, dns_search_domains, blkio_weight, cpuset_mems, user.", "results_file": "/home/artis3n/.ansible_async/933904515236.21247", "started": 1, "stderr": "/tmp/ansible_community.docker.docker_container_payload_36kjt4c1/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/modules/docker_container.py:1237: DeprecationWarning: distutils Version classes are deprecated. 
Use packaging.version instead.\n", "stderr_lines": ["/tmp/ansible_community.docker.docker_container_payload_36kjt4c1/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/modules/docker_container.py:1237: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead."], "stdout": "", "stdout_lines": []}

I have the following setup:

molecule.yml

dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: instance
    image: ${MOLECULE_DISTRO:-geerlingguy/docker-debian11-ansible:latest}
    command: ${MOLECULE_DOCKER_COMMAND:-"/lib/systemd/systemd"}
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
    cgroupns_mode: host
    privileged: true
    pre_build_image: true

molecule --version

molecule 4.0.3 using python 3.10 
    ansible:2.14.0
    delegated:4.0.3 from molecule
    docker:2.1.0 from molecule_docker requiring collections: community.docker>=3.0.2 ansible.posix>=1.4.0

Poetry with pyproject.toml file:

[tool.poetry.dependencies]
python = "^3.10"
ansible = "^7.0.0"

[tool.poetry.group.dev.dependencies]
pre-commit = "^2.20.0"
ansible-lint = "^6.8.0"
molecule = {extras = ["docker"], version = "^4.0.3"}

I see cgroupns_mode was added in community.docker 3.0.0 and Molecule is using v3.0.2... https://docs.ansible.com/ansible/latest/collections/community/docker/docker_container_module.html#parameter-cgroupns_mode

@Galaxy102

@artis3n in my case it was also necessary to install the current community.docker collection from Ansible Galaxy

@artis3n
Sponsor

artis3n commented Dec 7, 2022

Ahhhh yup. ansible-galaxy collection list shows that I'm up to date globally on my system -

# /home/artis3n/.ansible/collections/ansible_collections
Collection        Version
----------------- -------
...
community.general 6.1.0 

but inside my Poetry env, I'm using the older version. Gotta see how to appropriately update..

# /home/artis3n/.cache/pypoetry/virtualenvs/artis3n-tailscale-eXk1DDvX-py3.10/lib/python3.10/site-packages/ansible_collections
Collection                    Version
----------------------------- -------
...
community.general             6.0.1  
...

@artis3n
Sponsor

artis3n commented Dec 7, 2022

The dependency step wasn't in my scenario 🙃 Everything's working
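
Since the thread notes that cgroupns_mode was added in community.docker 3.0.0, a quick version comparison inside the environment that Molecule actually uses can catch this kind of mismatch. A minimal sketch: `version_ge` is a hypothetical helper, it relies on `sort -V` (version sort, present in GNU coreutils and recent BSD sort), and the commented usage assumes `ansible-galaxy collection list` prints the collection name and version in two columns, as in the output shown above.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Works by version-sorting both values and checking that $2 sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example (not executed here): run inside the virtualenv Molecule uses.
#   installed="$(ansible-galaxy collection list community.docker \
#     | awk '$1 == "community.docker" { print $2; exit }')"
#   version_ge "$installed" 3.0.0 || echo "community.docker too old for cgroupns_mode"
```

If several collection paths are configured, only the first match is read here, which may not be the copy that takes precedence at run time.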

@filviu

filviu commented Feb 10, 2023

OK, I'm officially confused and ready to throw in the towel in favor of testing directly inside GitHub runners.

The fix everybody likes only works for me for docker-debian11-ansible:latest containers:

  - name: instance
    image: "geerlingguy/docker-${MOLECULE_DISTRO:-centos7}-ansible:latest"
    command: ${MOLECULE_DOCKER_COMMAND:-""}
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
    cgroupns_mode: host
    privileged: true
    pre_build_image: true

It fails inside docker-centos7-ansible:latest with

fatal: [instance]: FAILED! => {"changed": false, "msg": "Service is in unknown state", "status": {}}

On the other hand, @aussielunix's fix, which looks like this:

platforms:
  - name: instance
    image: "geerlingguy/docker-${MOLECULE_DISTRO:-centos7}-ansible:latest"
    command: ${MOLECULE_DOCKER_COMMAND:-""}
    override_command: false
    tmpfs:
      - /run
      - /tmp
    cgroupns_mode: host
    privileged: true
    pre_build_image: true

works for docker-centos7-ansible:latest but fails with docker-debian11-ansible:latest with:

Failed to connect to bus: No such file or directory

Running locally on Ubuntu 22.04 with the latest versions of ansible, molecule, community.general and community.docker.

steinbrueckri added a commit to steinbrueckri/ansible-role-tenable-agent that referenced this issue Mar 8, 2023

@darsh12

darsh12 commented May 15, 2023

Thanks to @artis3n.
I am running molecule 5.0.1 with ansible 2.14.5. Adding the command: ${MOLECULE_DOCKER_COMMAND:-"/lib/systemd/systemd"} worked for me when using ubuntu2204, together with volumes, privileged mode and cgroupns_mode.

@artis3n
Sponsor

artis3n commented May 16, 2023

Yeah, if it is helpful to others here are all the distros I was testing and what command I had to use for everything to work smoothly:

https://github.com/artis3n/ansible-role-tailscale/blob/7e9907a606df08ce79fa675bbf45c59be97a1e9b/.github/workflows/pull_request_target.yml#L24-L44

and https://github.com/artis3n/ansible-role-tailscale/blob/7e9907a606df08ce79fa675bbf45c59be97a1e9b/.github/workflows/pull_request_target.yml#L59-L65

Default if not provided is /usr/sbin/init
