Scan pods stuck in pending, detached from ComplianceScan #447

Open
montaguethomas opened this issue Oct 17, 2023 · 0 comments

montaguethomas commented Oct 17, 2023

OpenShift 4.12.33
compliance-operator.v0.1.61

A precisely timed removal of a Node from the cluster can leave an openscap-pod stuck in Pending forever.

If a Node removal is triggered while a new ComplianceScan is starting, it is highly likely that the scan pod will never be able to complete its scan. Even if the scan pod does get scheduled to the node, it is drained immediately as the node is deleted, so it produces no results. CO then seems to recreate the pod for the now-removed node, and that pod stays stuck in Pending forever.

On subsequent scheduled scans, the node list no longer contains the removed node, so CO does not clean up the scan pod stuck in Pending, because of how it deletes scan pods.
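One way to catch this case would be to check each existing scan pod against the current node list, instead of deriving the pods to delete from the node list alone. The following is a minimal sketch of that check, written in Python for brevity. It is not the operator's code (CO is written in Go); `ScanPod` and `find_orphaned_pending_pods` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class ScanPod:
    """Hypothetical, simplified view of a scan pod (not a Compliance Operator type)."""
    name: str
    target_node: str  # node the pod was created to scan
    phase: str        # Kubernetes pod phase, e.g. "Pending", "Running", "Succeeded"


def find_orphaned_pending_pods(
    pods: Iterable[ScanPod], current_nodes: Set[str]
) -> List[ScanPod]:
    """Return Pending pods whose target node is no longer part of the cluster."""
    return [
        pod
        for pod in pods
        if pod.phase == "Pending" and pod.target_node not in current_nodes
    ]
```

The key point is that the loop iterates over pods, not nodes, so a pod that targets a removed node is still visited.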

The scan pods do not set an OwnerRef to the ComplianceScan object, so deleting the ComplianceScan does not clean up the pending pod either.
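For reference, an owner reference is a small metadata entry on the dependent object. The sketch below shows what attaching one to a scan pod manifest could look like, using plain Python dicts in place of real API objects. The field names follow the standard Kubernetes `ownerReferences` schema; the helper names `owner_reference_for` and `with_owner` are hypothetical, and whether CO omits the reference deliberately is not known here. It also assumes the scan and the pod live in the same namespace, which Kubernetes requires for namespaced owners.

```python
import copy


def owner_reference_for(scan: dict) -> dict:
    """Build an ownerReferences entry that points at the given scan object."""
    meta = scan["metadata"]
    return {
        "apiVersion": scan["apiVersion"],
        "kind": scan["kind"],
        "name": meta["name"],
        "uid": meta["uid"],
        "controller": True,
        "blockOwnerDeletion": True,
    }


def with_owner(pod: dict, scan: dict) -> dict:
    """Return a copy of the pod manifest that is owned by the scan."""
    owned = copy.deepcopy(pod)  # leave the caller's manifest untouched
    refs = owned.setdefault("metadata", {}).setdefault("ownerReferences", [])
    refs.append(owner_reference_for(scan))
    return owned
```

With such a reference in place, the Kubernetes garbage collector would remove the pod once the owning ComplianceScan is deleted.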

Note that we have timeout retries enabled on scans and also have debug enabled. Looking at the ComplianceScan handler, it could be that having debug: true set causes it to skip the cleanup logic that uses the node list created at scan start time, thus missing the opportunity to clean up the scan pod stuck in Pending.
