Uninstall/cleanup procedures #1491
@gageorsburn IIRC, to deploy a new version of KubeVirt you need to delete all old KubeVirt components and install the new ones afterwards.
I think this flow will exist until we write an operator.
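That delete-then-reinstall flow, as a sketch (the manifest filenames are hypothetical; substitute your actual release manifests):

```shell
# Hypothetical manifest names; substitute the versions you actually use.
OLD=kubevirt-old.yaml
NEW=kubevirt-new.yaml

# Tear down every component from the old release manifest...
kubectl delete -f "$OLD" --ignore-not-found

# ...then install the new release from scratch.
kubectl create -f "$NEW"
```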
Good point, we need to document the upgrade procedure.
/cc @rthallisey something for the operator.
ack. Thanks for the mention. Adding to Trello for tracking.
@cynepco3hahue The main thing I wanted to point out is that even after …
@gageorsburn I see, thanks for the information, I forgot that we create a bunch of k8s entities via the code.
Yes, the API server registration is the worst bit.
@davidvossel I actually wonder if we could add a stub API server registration to the release yaml, which is then "reused" by the real registration. If it's in the release yaml, then removal with the yaml would work.
We are reusing it if it already exists, so in theory it might work, but we are creating more objects (secrets for certificates, for instance). Also, moving kubevirt to a separate namespace than kube-system would not solve the whole issue, since not all things we need to create/use are namespaced. I fear that right now, for a real …
@fabiand but if we move kubevirt to a separate namespace, like you suggested, we could at least shorten the needed commands in https://github.com/kubevirt/kubevirt/blob/master/cluster/clean.sh.
Yes, a dedicated namespace should at least improve things.
Right, that cert generation is ultimately what prevents us from doing this. If we created those objects using manifests with predictable cert values that the api-server knew needed to be replaced, then I suppose it's technically possible to add these objects to the manifests. I'm nervous about how fragile that might be, though.
I actually ran into an interesting issue around this just a day ago. When I upgraded from kubernetes 1.11.2 to 1.11.3, suddenly namespaces wouldn't delete; they'd get stuck in 'terminating'. It looks to have been caused by a lingering 'v1alpha1.subresources.kubevirt.io' and kubernetes validating the namespace deletion against all the apiservices.
@mlsorensen Not sure if you've come across the workaround for that yet, but you can output the namespace to json, delete the finalizer, and apply the namespace json.
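A sketch of that workaround using jq (the namespace name is an example; this force-clears the finalizers via the /finalize subresource, so only use it when normal deletion is truly stuck):

```shell
# Hypothetical stuck namespace; substitute yours.
ns=kubevirt

# Dump the namespace, drop its finalizers, and push the result back
# through the /finalize subresource so deletion can complete.
kubectl get namespace "$ns" -o json \
  | jq 'del(.spec.finalizers)' \
  | kubectl replace --raw "/api/v1/namespaces/$ns/finalize" -f -
```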
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /lifecycle stale
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /lifecycle rotten
With the operator work of @slintes this should finally be addressed.
@gageorsburn did you try the new operator install and uninstall way?
@fabiand Yeah, we've had good results with the operator so far. Honestly, overall KubeVirt has been a lot more stable than it was last year.
Just as an FYI to whoever is looking to fully uninstall KubeVirt and has a namespace stuck in "Terminating" because of dangling KubeVirt API resources: just run the following commands:

# the namespace of the kubevirt installation
export namespace=kubevirt
labels=("operator.kubevirt.io" "operator.cdi.kubevirt.io" "kubevirt.io" "cdi.kubevirt.io")
namespaces=(default ${namespace} "<other namespaces that have kubevirt resources>")
kubectl -n ${namespace} delete kv kubevirt
kubectl -n ${namespace} patch kv kubevirt --type=json -p '[{ "op": "remove", "path": "/metadata/finalizers" }]'
kubectl get vmis --all-namespaces -o=custom-columns=NAME:.metadata.name,NAMESPACE:.metadata.namespace,FINALIZERS:.metadata.finalizers --no-headers | grep foregroundDeleteVirtualMachine | while read p; do
arr=($p)
name="${arr[0]}"
namespace="${arr[1]}"
kubectl patch vmi $name -n $namespace --type=json -p '[{ "op": "remove", "path": "/metadata/finalizers" }]'
done
for i in ${namespaces[@]}; do
for label in ${labels[@]}; do
kubectl -n ${i} delete deployment -l ${label}
kubectl -n ${i} delete ds -l ${label}
kubectl -n ${i} delete rs -l ${label}
kubectl -n ${i} delete pods -l ${label}
kubectl -n ${i} delete services -l ${label}
kubectl -n ${i} delete pvc -l ${label}
kubectl -n ${i} delete rolebinding -l ${label}
kubectl -n ${i} delete roles -l ${label}
kubectl -n ${i} delete serviceaccounts -l ${label}
kubectl -n ${i} delete configmaps -l ${label}
kubectl -n ${i} delete secrets -l ${label}
kubectl -n ${i} delete jobs -l ${label}
done
done
for label in ${labels[@]}; do
kubectl delete validatingwebhookconfiguration -l ${label}
kubectl delete pv -l ${label}
kubectl delete clusterrolebinding -l ${label}
kubectl delete clusterroles -l ${label}
kubectl delete customresourcedefinitions -l ${label}
kubectl delete scc -l ${label}
kubectl delete apiservices -l ${label}
kubectl get apiservices -l ${label} -o=custom-columns=NAME:.metadata.name,FINALIZERS:.metadata.finalizers --no-headers | grep foregroundDeletion | while read p; do
arr=($p)
name="${arr[0]}"
kubectl patch apiservices $name --type=json -p '[{ "op": "remove", "path": "/metadata/finalizers" }]'
done
done
@azaiter actually a good hint on how to clean up the mess. All in all, it's recommended to delete the KubeVirt CR, which will do the cleanup properly, then delete the namespace. I'm however wondering if we could do some trick to enable the nice cleanup based on the namespace somehow ...
The issue is still there; I'm not able to uninstall KubeVirt. I tried both the script described in the comment above and the steps given in the official docs.
Yes, it's still an issue. Patches would be welcome to address this properly.
After all, it seems to be mostly an ordering problem during uninstallation.
A few additions:
- It is known that deletion/uninstall does not work correctly when a user tries to do this procedure by removing the namespace.
- It is also known that uninstall via the KubeVirt CR works reliably.
Our recommendation is to use the operator for deployment and undeployment, and after undeployment to remove the operator.
Let me try again:
Note that this time, I didn't get the empty response from … Continuing:
In another window:
And I suppose the last line of your snippet appears because the webhook registration is still there, but the API pod is gone (thus the webhook points into the void).
Check the webhook registrations in the other window.
Should I do …
Thanks.
In my understanding we'd at best (as a workaround):
- delete the CR: kubectl delete -n kubevirt kubevirt kubevirt
- delete the two webhooks
- delete the namespace: kubectl delete namespace kubevirt
In the future, as Roman noted, we can leverage finalizers to erase the webhooks.
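The "delete the two webhooks" step has no command in the thread; a sketch, assuming the webhook configurations carry KubeVirt's usual names (verify with `kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations` first):

```shell
# Hypothetical webhook configuration names; list them first with:
#   kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations
kubectl delete validatingwebhookconfiguration virt-api-validator
kubectl delete mutatingwebhookconfiguration virt-api-mutator
```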
So this gets confusing now. Let me clarify: You can do …
or …
Then wait until … returns …
Then only …
@fabiand Webhooks and APIServices are not namespaced (so they should not block a namespace delete) and we also don't have a finalizer on them (so nothing should block their deletion, even if they were in the namespace).
Good point.
As noted on the other thread: now I suppose that we should delete the webhook and apiserver registrations before deleting the CR, to avoid both of them outliving the API server, as this would lead to both failing to operate. And the webhook especially is, to me, the reason for the hanging deletion.
Thus:
- delete the registrations (webhook + apiserver)
- delete the CR
- delete the namespace
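Those three steps as a shell sketch (the kubevirt.io label selector and the namespace name are assumptions; adjust them to your install):

```shell
# 1. Remove the registrations first, so nothing points at a dead virt-api.
#    The kubevirt.io label selector is an assumption; inspect without -l if unsure.
kubectl delete apiservices -l kubevirt.io
kubectl delete validatingwebhookconfiguration,mutatingwebhookconfiguration -l kubevirt.io
# 2. Delete the KubeVirt CR so the remaining components get torn down.
kubectl delete -n kubevirt kubevirt kubevirt
# 3. Finally remove the namespace.
kubectl delete namespace kubevirt
```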
Ok, just checked the …
@candlerb that is probably what you meant. Running … works. It seems like all apiservices need to be online and available, or deleting namespaces is blocked, independent of whether resources are present or not.
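A quick way to spot the offending registrations (a sketch; `kubectl get apiservices` prints an AVAILABLE column, so anything not reporting True is a candidate):

```shell
# List APIServices that are not available; any of these will block
# namespace deletion until they are fixed or removed.
kubectl get apiservices | grep -v True
```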
That is wrong. It is the APIService, see above.
Testing this in this new commit at kubevirt/demo#131
Here is the right uninstall procedure until we have a fix: …
Sorry for the inconvenience.
I confirm this uninstall procedure works, thank you!
I'm reopening this bug because the apiserver removal still needs to be done manually. /reopen
@fabiand: Reopened this issue. In response to this: …
👍
The uninstall procedure is now documented. Hope to have the code fix which makes this obsolete merged soon. |
I am having the same problem: I installed CDI, and deleting a random namespace now gets stuck.
The working sequence to uninstall kubevirt is now documented here:
https://kubevirt.io/user-guide/#/installation/updating-and-deleting-installs?id=deleting-kubevirt

Following this document, I managed to remove the kubevirt namespace, the validatingwebhookconfigurations, and some apiservices. However, when I delete these 4 apiservices, they just keep coming back. Anyone have any idea why?

$ kubectl delete apiservices v1alpha1.cdi.kubevirt.io v1alpha1.snapshot.kubevirt.io v1alpha3.kubevirt.io v1beta1.cdi.kubevirt.io

Update: …
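A likely explanation for reappearing apiservices is that an operator is still running and recreating them. A sketch, assuming the usual deployment names (`cdi-operator` in the `cdi` namespace, `virt-operator` in `kubevirt`); verify the actual names and namespaces first:

```shell
# If an operator is still running, it will recreate the APIServices.
# Deployment names/namespaces below are assumptions; check with
#   kubectl get deployments --all-namespaces | grep -E 'cdi|virt'
kubectl -n cdi scale deployment cdi-operator --replicas=0
kubectl -n kubevirt scale deployment virt-operator --replicas=0

# Now the deletion should stick.
kubectl delete apiservices v1alpha1.cdi.kubevirt.io v1beta1.cdi.kubevirt.io
```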
Not really answering your question, but are you using the operator? Because then the uninstall flow is much simpler.
In the Get Started document, we always start by creating the operator and the CR:

kubectl create -f _out/manifests/release/kubevirt-operator.yaml
kubectl create -f _out/manifests/release/kubevirt-cr.yaml

But let's say that I've done the above in my k8s cluster, and am now ready to redeploy a newly updated operator and CR. My questions are: …
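For what it's worth, one plausible update flow (same manifest paths as in the Get Started snippet, regenerated for the new version) is to re-apply the manifests and let the operator reconcile:

```shell
# Re-apply the regenerated manifests; the operator then rolls out the
# new version. Paths are the ones from the Get Started document.
kubectl apply -f _out/manifests/release/kubevirt-operator.yaml
kubectl apply -f _out/manifests/release/kubevirt-cr.yaml
```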
I was playing with kubevirt again, and I managed to break a k3s cluster by applying the kubevirt resources, creating a couple of VMs, and then removing kubevirt with … In this state, even …
Manually deleting resources where I can:
Still broken, and
Now I can list api-resources, which lets me use another clue here:
At least I now know which resource is preventing the namespace from being deleted. However, I still get this error trying to delete it:
I was able to forcibly delete namespace this way:
... but actually I shouldn't have done that, as there's still that kubevirt resource hanging around:
(Digs further) maybe it was the test VMs I had created? The pods are long gone though.
There's definitely no VM pod (
However, the final clue was here:
Then:
I think that's clean enough now. Phew :-)
Still not able to delete the namespace; I need to run this: …
It will reinstall again.
I don't have a lot of logs backing this up because my lab cluster has been in a lot of flux lately, but I was running into an issue where the kube-controller-manager pods were crashing whenever a newer version of KubeVirt was installed, which caused complete instability within the cluster.
I had no idea where those were coming from until today, when I looked at the apiservice objects.
I think older versions are causing issues. It's kind of a weird issue and hard to word. It might be good to start some documentation keeping track of all of the resources that need to be cleaned up across version upgrades, uninstalls, etc.
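A sketch for auditing what an old install left behind (the grep pattern is an assumption; widen it to cdi.kubevirt.io etc. as needed):

```shell
# List APIService registrations that still reference kubevirt; stale
# entries pointing at deleted services can destabilize controllers.
kubectl get apiservices -o name | grep kubevirt
```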