Bug: Ansible stuck #1317

Despire · 2024-04-02T11:23:52Z

Current Behaviour

During a run of the e2e pipeline, while installing WireGuard via the Ansible playbook, the spawned process got stuck for unknown reasons. This halted the workflow for the picked-up config, resulting in long build times and eventually a failure. The stuck process is left running indefinitely.

Expected Behaviour

There should be a mechanism for handling this, although I'm not sure what or how. A fixed timeout will not help here, because the larger the cluster is, the longer the playbook needs to execute.
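One option that does not depend on cluster size is an inactivity timeout: instead of limiting the total run time, stop the process only when it has printed nothing for a while. The sketch below is a minimal illustration of that idea in Python, not the ansibler implementation; the function name `run_with_idle_timeout` and its parameters are hypothetical, and it assumes a POSIX system. Ansible can be silent for the duration of a single long task, so the idle threshold would still have to exceed the slowest expected task.

```python
import os
import queue
import signal
import subprocess
import threading


def run_with_idle_timeout(cmd, idle_timeout):
    """Run cmd and kill it if it prints nothing for idle_timeout seconds.

    Hypothetical helper for illustration. Returns a tuple
    (return_code, output_lines, timed_out).
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        # Put the child in its own session/process group so that any
        # processes it forks can be killed together with it.
        start_new_session=True,
    )
    out = queue.Queue()

    def pump():
        # Forward each output line to the queue; None marks end of output.
        for line in proc.stdout:
            out.put(line)
        out.put(None)

    threading.Thread(target=pump, daemon=True).start()

    lines, timed_out = [], False
    while True:
        try:
            line = out.get(timeout=idle_timeout)
        except queue.Empty:
            # No output within the idle window: treat the run as stuck.
            timed_out = True
            os.killpg(proc.pid, signal.SIGKILL)
            break
        if line is None:
            break
        lines.append(line.rstrip("\n"))

    proc.wait()  # reap the child so it does not linger as a zombie
    return proc.returncode, lines, timed_out
```

A caller could wrap the playbook run, for example `run_with_idle_timeout(["ansible-playbook", "wireguard.yml"], idle_timeout=600)`, where the playbook file name and the 600-second window are placeholders, and retry when `timed_out` is true.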

Steps To Reproduce

I have encountered this at random.

JKBGIT1 · 2024-04-05T12:31:41Z

We are waiting for more occurrences to be able to debug deeper.

JKBGIT1 · 2024-04-24T12:00:51Z

I experienced something similar, or maybe it was the same issue.

In my case, the playbook to install the VPN got stuck on the task "Check if unattended-upgrades.service is present". I tried to kill all the ansible-playbook processes to trigger a retry of that playbook run. However, the processes probably weren't actually killed (I used SIGTERM and also SIGKILL); only their command changed, and I could still see them listed when running ps aux (see the image below).

The processes with the command [ansible-playbook] were "killed" by me and the rest were newly spawned by the main container process after I "killed" the old ones.
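One possible explanation, not confirmed in this thread: ps shows a command in square brackets when the process no longer has a command line, which is what a killed process looks like while it waits for its parent to collect its exit status (a zombie, state Z). Under that reading the signals were delivered, and the entries remain only because the parent has not reaped them yet. The sketch below reproduces the effect on Linux; the helper names are hypothetical and it reads `/proc`, so it is Linux-only.

```python
import os
import signal
import subprocess
import time


def proc_state(pid):
    """Return the one-letter Linux process state, or None if pid is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # Format is "pid (comm) state ..."; comm may contain spaces or
    # parentheses, so split on the last ")".
    return stat.rsplit(")", 1)[1].split()[0]


def kill_and_observe(cmd):
    """Kill a child with SIGKILL and report its state before and after reaping."""
    child = subprocess.Popen(cmd)
    os.kill(child.pid, signal.SIGKILL)

    # Give the kernel a moment to tear the process down.
    deadline = time.monotonic() + 5
    while proc_state(child.pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)

    before = proc_state(child.pid)  # still listed, as a zombie
    child.wait()                    # parent collects the exit status
    after = proc_state(child.pid)   # entry is gone once reaped
    return before, after
```

If this is what happened here, checking the STAT column of ps aux for Z on the bracketed entries would confirm it, and the fix would be on the parent side (waiting on the children it spawns) rather than in how the processes are killed.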

I kept killing the ansible-playbook processes to run out of retries and end up with a failed workflow. After the workflow failed, I manually ran the Install VPN playbook from ansibler with higher verbosity. The playbook finished successfully this time. The next time ansibler ran this playbook, it also went well, so I don't know what was going on before.

Despire added the "bug" label (Something isn't working) on Apr 2, 2024

JKBGIT1 added the "groomed" label (Task that everybody agrees to pass the gatekeeper) on Apr 5, 2024
