Current Behaviour
During a run of the e2e pipeline, the process spawned to install WireGuard via the Ansible playbook got stuck for unknown reasons. This halted the workflow for the picked-up config, resulting in long build times and eventually a failed run, while the stuck process was left behind indefinitely.
Expected Behaviour
There should be a mechanism to detect and recover from this, although I'm not sure what form it should take. A plain timeout won't help here, because the larger the cluster, the longer the playbook needs to execute.
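One possible mechanism that sidesteps the cluster-size problem is an idle-output watchdog: instead of capping total runtime, kill the spawned process only when it has produced no output for some window, since a playbook that is still making progress keeps itself alive however long the full run takes. Below is a minimal sketch of the idea in Go; the runWithWatchdog helper, the 5-minute window, and the playbook/inventory names are all illustrative assumptions, not the pipeline's actual code.

```go
// Sketch of an output-based watchdog for a spawned ansible-playbook
// process. Everything here is illustrative, not the real pipeline code.
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

func runWithWatchdog(idle time.Duration, name string, args ...string) error {
	cmd := exec.Command(name, args...)
	// Put the child in its own process group so we can later kill it
	// together with any SSH helpers it forks.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}

	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return err
	}
	if err := cmd.Start(); err != nil {
		return err
	}

	// If no output arrives within the idle window, kill the whole
	// process group (negative PID targets the group).
	timer := time.AfterFunc(idle, func() {
		syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
	})
	defer timer.Stop()

	// Reset the timer on every line the playbook prints, so the
	// watchdog scales with progress rather than with cluster size.
	scanner := bufio.NewScanner(stdout)
	for scanner.Scan() {
		fmt.Println(scanner.Text())
		timer.Reset(idle)
	}

	// Wait reaps the child, so it cannot linger in the process table.
	return cmd.Wait()
}

func main() {
	err := runWithWatchdog(5*time.Minute, "ansible-playbook", "wireguard.yml", "-i", "inventory")
	if err != nil {
		fmt.Println("playbook failed or was killed for inactivity:", err)
	}
}
```

Killing the negative PID takes down the whole process group, and the final Wait reaps the child so nothing stuck is left behind.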
Steps To Reproduce
I have encountered this at random.
I experienced something similar, or maybe it was the same issue.
In my case, the playbook that installs the VPN got stuck on the task Check if unattended-upgrades.service is present. I tried to kill all the ansible-playbook processes to trigger a retry of that playbook run. However, the processes probably weren't actually killed (I used SIGTERM and also SIGKILL); only their command changed, and I could still see them listed when running ps aux (see the image below).
The processes with the command [ansible-playbook] were the ones I "killed"; the rest were newly spawned by the main container process after I "killed" the old ones.
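For what it's worth, a process that "survives" SIGKILL only as a bracketed [ansible-playbook] entry in ps aux is most likely a zombie: the child is dead, but its parent has not yet collected its exit status, so the kernel keeps a stub in the process table. A tiny Go sketch of the mechanism (sleep stands in for ansible-playbook here; this is an illustration, not the actual ansibler code):

```go
// Demonstrates how a SIGKILLed child lingers as a zombie until its
// parent reaps it. Illustrative only; sleep stands in for the playbook.
package main

import (
	"fmt"
	"os/exec"
	"time"
)

func main() {
	cmd := exec.Command("sleep", "600") // stand-in for ansible-playbook
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	time.Sleep(time.Second)
	cmd.Process.Kill() // SIGKILL: the child dies...

	// ...but until Wait collects its exit status, it stays visible in
	// `ps aux` as a zombie. Skipping this call reproduces the symptom.
	err := cmd.Wait()
	fmt.Println("child reaped, exit state:", err)
}
```

If the parent never calls Wait on the children it spawned (or respawns replacements without reaping the old ones), those zombie entries accumulate exactly as described above.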
I kept killing the ansible-playbook processes to exhaust the retries and end up with a failed workflow. After the workflow failed, I manually ran the Install VPN playbook from ansibler with higher verbosity, and it finished successfully this time. The next time ansibler ran this playbook, it also went well, so I don't know what was going on there before.