DeleteOperation polling is not reliable #337

Open
shvgn opened this issue Nov 30, 2021 · 0 comments

In the patch collector, the delete operation is followed by a poll that checks that the resource has been deleted. The poll can time out if a pod is recreated with the same name within the one-second polling interval, presumably because a lookup by name then keeps finding a pod.

err = wait.Poll(time.Second, 20*time.Second, func() (done bool, err error) {
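One possible way to make such a poll robust is to compare object UIDs instead of only checking for existence by name. The sketch below is not the shell-operator implementation: `lookupFunc`, `waitDeleted`, and `ErrPollTimeout` are hypothetical names, and the lookup callback stands in for a Kubernetes GET that would return the object's `metadata.uid`. Unlike `wait.Poll`, it performs the first check immediately.

```go
package main

import (
	"errors"
	"time"
)

// ErrPollTimeout is a hypothetical error mirroring the message seen in the logs.
var ErrPollTimeout = errors.New("timed out waiting for the condition")

// lookupFunc is a hypothetical stand-in for a Kubernetes GET by name.
// It reports the UID of the object currently stored under that name,
// and found=false when no such object exists.
type lookupFunc func() (uid string, found bool, err error)

// waitDeleted polls until the object with originalUID is gone.
// An object with the same name but a different UID counts as deleted,
// because it is a new object created after the deletion.
func waitDeleted(originalUID string, interval, timeout time.Duration, lookup lookupFunc) error {
	deadline := time.Now().Add(timeout)
	for {
		uid, found, err := lookup()
		if err != nil {
			return err
		}
		if !found || uid != originalUID {
			return nil // deleted, or already replaced by a new object
		}
		if time.Now().After(deadline) {
			return ErrPollTimeout
		}
		time.Sleep(interval)
	}
}
```

With this approach the caller would record the pod's UID before issuing the delete, so a StatefulSet pod that comes back under the same name is recognized as a new object rather than as a failed deletion.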

Expected behavior (what you expected to happen):

When pods are deleted and re-created with the same name via the patch collector, a hook should not return a timeout error (false positive).

Actual behavior (what actually happened):

Deleting a pod with an unchanging name (owned by a StatefulSet) results in a timeout error.

Steps to reproduce:

  1. Apply a delete operation to a StatefulSet pod that starts quickly enough to be rolled out again within 1 second
  2. Shell-operator reports a timeout error

Environment:

Deckhouse smoke-mini rescheduler hook

Logs
Pod "smoke-mini-d-0" marked for deletion
Module hook failed, requeue task to retry after delay. Failed count is 1. Error: 1 error occurred:
	* Delete object v1/Pod/d8-upmeter/smoke-mini-d-0: timed out waiting for the condition
@diafour diafour added this to the 1.1.0 milestone May 11, 2022
@diafour diafour modified the milestones: 1.1.0, future Dec 9, 2022