
Adding update to retry if 0 filtered found #332

Open
wants to merge 3 commits into master

Conversation

@paigerube14 (Contributor) commented Jun 30, 2021

Changes the values returned from the action_nodes_pods.execute function so that the loop continues until the timeout or retry count has been hit.
Everything executes the same as before, except when we want to run retries: if the filtered list is empty but we expect a non-zero number of pods, the action should keep attempting to execute (a rough sketch of the intended loop is below).

Fixes: #331

Signed-off-by: prubenda <prubenda@redhat.com>
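
For context, here is a minimal sketch of the loop this change is aiming for. It is illustrative only, assuming a hypothetical `match_pods` callable that re-evaluates the configured selector; it does not reproduce PowerfulSeal's actual `action_nodes_pods.execute` signature.

```python
import time

def retry_until_matched(match_pods, expected_count, timeout=60, max_retries=10, sleep=5):
    """Keep re-evaluating the pod filter until enough pods match, the timeout
    expires, or the retry budget is exhausted (illustrative sketch only)."""
    deadline = time.monotonic() + timeout
    for attempt in range(max_retries):
        pods = match_pods()  # re-list pods for the configured selector
        if len(pods) >= expected_count:
            return pods  # enough pods matched: stop retrying
        if time.monotonic() >= deadline:
            break  # time limit hit: give up
        # Previously an empty filtered set ended the loop right here;
        # with the change we wait and try again instead.
        time.sleep(sleep)
    return []
```

The key difference from the current behaviour is that an empty result no longer terminates the loop on its own; only the timeout or the retry count does.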
@seeker89 (Contributor)

Hi and thanks so much for the PR!

I was just reading through the code. Could you please give me a use case where this is helpful? Many thanks!

@paigerube14 (Contributor, Author)

Hi @seeker89,
Suppose a pod has only a single replica and we use PowerfulSeal to kill it. When the checkPodCount section of PowerfulSeal runs, it sees there are no pods and immediately stops the retry loop. I would want the loop to keep running until it hits the time limit or the number of retries configured in the YAML setup.
Specifically, on single-node OpenShift there is only one "etcd" pod, so once that pod is killed the retry logic finds 0 pods and exits. Below is example output from a run hitting the issue I'm seeing, with a small sketch of the early-exit behaviour after it. Note there is no retrying once 0 pods are found.

2021-06-25 19:16:09 INFO __main__ No cloud driver - some functionality disabled
2021-06-25 19:16:09 INFO __main__ Using stdout metrics collector
2021-06-25 19:16:09 INFO __main__ NOT starting the UI server
2021-06-25 19:16:09 INFO __main__ STARTING AUTONOMOUS MODE
2021-06-25 19:16:12 INFO scenario.delete etcd pod Starting scenario 'delete etcd pods' (2 steps)
2021-06-25 19:16:12 INFO action_nodes_pods.delete etcd pod Matching 'labels' {'labels': {'namespace': 'etcd', 'selector': 'k8s-app=etcd'}}
2021-06-25 19:16:12 INFO action_nodes_pods.delete etcd pod Matched 1 pods for selector k8s-app=etcd in namespace etcd
2021-06-25 19:16:12 INFO action_nodes_pods.delete etcd pod Initial set length: 1
2021-06-25 19:16:12 INFO action_nodes_pods.delete etcd pod Filtered set length: 1
2021-06-25 19:16:12 INFO action_nodes_pods.delete etcd pod Pod killed: [pod #0 name=etcd-master-00.qe-pr-sno2.qe.devcluster.openshift.com namespace=etcd containers=4state=Running labels:app=etcd,etcd=true,k8s-app=etcd,revision=2 annotations:kubernetes.io/config.hash=*,kubernetes.io/config.seen=2021-06-25T14:30:12.819685290Z,kubernetes.io/config.source=file,target.workload.openshift.io/management={"effect": "PreferredDuringScheduling"}]
2021-06-25 19:16:12 INFO action_nodes_pods.delete etcd pod Matching 'labels' {'labels': {'namespace': 'etcd', 'selector': 'k8s-app=etcd'}}
2021-06-25 19:16:12 INFO action_nodes_pods.delete etcd pod Matched 0 pods for selector k8s-app=etcd in namespace etcd
2021-06-25 19:16:12 INFO action_nodes_pods.delete etcd pod Initial set length: 0
2021-06-25 19:16:12 INFO scenario.delete etcd pod Scenario finished
2021-06-25 19:16:12 INFO policy_runner All done here!
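
For comparison, a small sketch of the early-exit behaviour visible in that log. Again, this is illustrative rather than PowerfulSeal's real checkPodCount code; `match_pods` is the same hypothetical stand-in as in the sketch above.

```python
def check_pod_count_early_exit(match_pods, expected_count, max_retries=10):
    """Sketch of the current behaviour: the loop gives up as soon as the
    filtered set is empty, even if a replacement pod is about to appear."""
    for _ in range(max_retries):
        pods = match_pods()
        if not pods:
            return False  # 0 pods matched -> stops retrying immediately
        if len(pods) == expected_count:
            return True
    return False
```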

@seeker89 (Contributor)

@paigerube14 thanks, now it makes so much more sense!

@paigerube14 (Contributor, Author)

Thanks so much @seeker89

Successfully merging this pull request may close these issues.

checkPodCount ends preemptively when 0 pods remain after pod killing