[Bug] eksctl delete cluster leakes network interface, subnet and vpc #7589
Comments
Hello mauriciovasquezbernal 👋 Thank you for opening an issue in |
Hi @mauriciovasquezbernal I believe the error isn't surfaced because you set the |
I am also having this issue when attempting to delete clusters. We use the |
Hi @yuxiang-zhang! Can you please share more context on eksctl/pkg/ctl/delete/cluster.go Line 45 in 025550a
Best regards. |
Hi @eiffel-fl, we'll need more details on how you configure the cluster and what your tests do to the VPC/subnets. We also create clusters to run tests and tear them down afterwards, but we haven't seen this issue occur. If you set
|
Hi!
Sure! In particular, we are not creating the VPCs ourselves; they are created by the
I would like to avoid using If you need any other information, please let me know. Best regards. |
Hi @eiffel-fl - thanks for explaining your workflow!
Setting We're trying to determine the underlying problem, hence why we are looking for some details as in - what actually happens inside those integration tests? I understand there may be a lot of things going on, but we should try to make some guesses as to what can influence
Simply running
|
Hi! I appreciate your reply 😄!
OK, this makes sense, thank you for shedding some light.
Basically, we are deploying Inspektor Gadget to the cluster, and then run our integration tests.
I did not look into which subnet fails to be deleted and which one is deleted successfully.
We only call
So, unless some If you have ideas of what I can check, please share. Best regards. |
We don't create any subnets or anything else related to the networking stack of the clusters during the integration tests. We only deploy Inspektor Gadget (there is nothing special about it that could affect the cluster networking) and some workloads to generate events (network traffic, DNS requests, opening files, executing processes, etc.). I'll try to create a reproducer without Inspektor Gadget |
In our logs we see the following (X's added by me):
This is the error we see in the events of the CloudFormation stack. Since it can't delete this subnet (but deleted the other subnets successfully), it leaves the VPC alive and the CloudFormation stack stays in the |
I haven't had the chance to test it further. We implemented a workaround (inspektor-gadget/inspektor-gadget#2686) to clean up the leaked resources. |
What were you trying to accomplish?
We're using eksctl as part of the CI system of Inspektor Gadget. For each CI run, we need to create a cluster and destroy it after running our tests.
What happened?
After some days, it's not possible to create new clusters:
This is happening because the deletion of the cluster sometimes fails, leaking some resources. The eksctl delete cluster logs don't have any relevant information:
But the logs from CloudFormation indicate that a subnet couldn't be deleted:
The subnet can't be deleted because it has a network interface attached:
I can manually remove the network interface and then the CloudFormation stack.
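The manual cleanup described above could be automated roughly as follows. This is only a sketch under assumptions, not the workaround that was actually merged: it assumes `ec2` and `cfn` behave like the boto3 EC2 and CloudFormation clients, that the ID of the blocking subnet and the stack name are already known (for example, from the CloudFormation events), and that every interface left in that subnet is safe to delete. The function names are hypothetical, and pagination of the describe call is ignored.

```python
def delete_leaked_interfaces(ec2, subnet_id):
    """Detach (if needed) and delete every network interface left in subnet_id.

    `ec2` is assumed to behave like a boto3 EC2 client.
    Returns the IDs of the interfaces that were deleted.
    """
    response = ec2.describe_network_interfaces(
        Filters=[{"Name": "subnet-id", "Values": [subnet_id]}]
    )
    deleted = []
    for eni in response["NetworkInterfaces"]:
        eni_id = eni["NetworkInterfaceId"]
        attachment_id = eni.get("Attachment", {}).get("AttachmentId")
        if attachment_id:
            # Detaching is asynchronous, so wait until the interface is free.
            ec2.detach_network_interface(AttachmentId=attachment_id, Force=True)
            waiter = ec2.get_waiter("network_interface_available")
            waiter.wait(NetworkInterfaceIds=[eni_id])
        ec2.delete_network_interface(NetworkInterfaceId=eni_id)
        deleted.append(eni_id)
    return deleted


def retry_stack_deletion(ec2, cfn, subnet_id, stack_name):
    """Remove the interfaces blocking the subnet, then ask CloudFormation to
    delete the stack again. `cfn` is assumed to behave like a boto3
    CloudFormation client."""
    deleted = delete_leaked_interfaces(ec2, subnet_id)
    cfn.delete_stack(StackName=stack_name)
    return deleted
```

With boto3 installed, the clients would typically come from boto3.client("ec2") and boto3.client("cloudformation"). Passing them in as parameters keeps the sketch testable without AWS credentials.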
This happens often: after a week or so, our limit of 20 VPCs is reached:
How to reproduce it?
I suppose trying to create and remove a cluster multiple times will reproduce this behavior.
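A reproduction attempt along those lines might look like the sketch below. It is an untested assumption about how to trigger the leak, not a confirmed reproducer: the cluster name prefix, the number of cycles, and the helper names are hypothetical, and the clusters use eksctl defaults rather than the CI configuration. Since the delete logs reportedly show no error, the exit codes alone may not reveal a leak, so the CloudFormation stacks would still need to be inspected afterwards.

```python
import subprocess


def cycle_commands(cluster_name):
    """Return the eksctl commands for one create/delete cycle."""
    return [
        ["eksctl", "create", "cluster", "--name", cluster_name],
        # --wait makes eksctl block until the deletion has finished.
        ["eksctl", "delete", "cluster", "--name", cluster_name, "--wait"],
    ]


def run_cycles(count, prefix="leak-repro", runner=subprocess.run):
    """Run `count` create/delete cycles and collect each command's exit code.

    `runner` defaults to subprocess.run and can be replaced for a dry run.
    """
    results = []
    for index in range(count):
        for command in cycle_commands(f"{prefix}-{index}"):
            completed = runner(command)
            results.append((command, completed.returncode))
    return results
```

Calling run_cycles(5) would create and delete five clusters in sequence, which takes a long time and incurs AWS costs.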
Logs
Anything else we need to know?
eksctl was downloaded from the latest release in this repository.
Versions
I don't have access to this eksctl instance as it was running on GitHub Actions, but the version reported was 0.171.0.