This repository has been archived by the owner on Jul 23, 2020. It is now read-only.

Pods with "unknown" state appearing on cluster 2a #4728

Open
ljelinkova opened this issue Jan 23, 2019 · 6 comments

Comments

@ljelinkova
Collaborator

I've noticed that three of our accounts on cluster 2a contain Jenkins pods in the "Unknown" state. They have been there for several days and remain even after an environment reset.

pod/booster-mission-runtime-s2i-8-build   0/1       Completed     0          4h
pod/booster-mission-runtime-s2i-9-build   0/1       Completed     0          1h
pod/jenkins-1-84rbx                       0/1       Pending       0          45m
pod/jenkins-1-deploy                      1/1       Running       0          49m
pod/jenkins-1-hkf75                       0/1       Unknown       0          5d
pod/jenkins-1-lfl6c                       0/1       Terminating   0          1h
pod/jenkins-1-tt7xm                       0/1       Unknown       0          1d
pod/jenkins-1-wzq8n                       0/1       Unknown       0          1d
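
A minimal sketch for spotting such pods in a listing like the one above. It assumes the five whitespace-separated columns shown here (name, ready, status, restarts, age); the helper name `find_stuck_pods` and the choice of which states count as stuck are illustrative assumptions, not part of any OpenShift tooling.

```python
from typing import List, Tuple

# Assumption: these are the states worth flagging; adjust as needed.
STUCK_STATES = {"Unknown", "Terminating"}


def find_stuck_pods(listing: str) -> List[Tuple[str, str, str]]:
    """Return (name, status, age) for each pod whose status is in STUCK_STATES."""
    stuck = []
    for line in listing.splitlines():
        fields = line.split()
        # Expected columns: NAME READY STATUS RESTARTS AGE; skip headers and other lines.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, _ready, status, _restarts, age = fields
        if status in STUCK_STATES:
            stuck.append((name, status, age))
    return stuck


# Sample input copied from the listing in this issue.
SAMPLE = """\
pod/booster-mission-runtime-s2i-8-build   0/1       Completed     0          4h
pod/booster-mission-runtime-s2i-9-build   0/1       Completed     0          1h
pod/jenkins-1-84rbx                       0/1       Pending       0          45m
pod/jenkins-1-deploy                      1/1       Running       0          49m
pod/jenkins-1-hkf75                       0/1       Unknown       0          5d
pod/jenkins-1-lfl6c                       0/1       Terminating   0          1h
pod/jenkins-1-tt7xm                       0/1       Unknown       0          1d
pod/jenkins-1-wzq8n                       0/1       Unknown       0          1d
"""

if __name__ == "__main__":
    for name, status, age in find_stuck_pods(SAMPLE):
        print(f"{name}\t{status}\t{age}")
```

Run against the saved pod listing of each affected account, this would report only the pods in the Unknown or Terminating state, which makes it easier to check whether a cleanup actually removed them.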

Affected accounts: osio-ci-e2e-001-preview, osio-ci-e2e-002-preview, osio-ci-e2e-007

http://artifacts.ci.centos.org/devtools/e2e/devtools-test-e2e-prod-preview.openshift.io-smoketest-pr-us-east-2a-released/5415/oc-jenkins-logs-before-all.txt
http://artifacts.ci.centos.org/devtools/e2e/devtools-test-e2e-prod-preview.openshift.io-smoketest-pr-us-east-2a-beta/5417/oc-jenkins-logs-before-all.txt
http://artifacts.ci.centos.org/devtools/e2e/devtools-test-e2e-openshift.io-smoketest-us-east-2a-released/1467/oc-jenkins-logs-before-all.txt

@skryzhny

Does it block you?

@ljelinkova
Collaborator Author

Yes.

@skryzhny

Ops cleared the pods, and I can't see them anymore either.
@ljelinkova can you recheck?

@ljelinkova
Collaborator Author

The pods have been deleted; however, the new pod is stuck in the Terminating state.

http://artifacts.ci.centos.org/devtools/e2e/devtools-test-e2e-openshift.io-smoketest-us-east-2a-released/1474/oc-jenkins-logs-before-all.txt

@pbergene
Collaborator

@JohnStrunk has an understanding of what might be causing this. It seems related to stuck mounts, which can either be unmounted manually or go away when the node is rebooted. As for a fix, one would first have to be created, and then we'll have to work out the process to get it applied.

@ljelinkova
Collaborator Author

I haven't seen any new Terminating pods in the last week, so I will decrease the severity and priority of the issue. However, I'll leave it open since this should be investigated and prevented in the future.
