CNI Genie can fail to call CNI release during a SandboxChanged/FailedCreatePodSandBox restart #214

Open
sleerssen opened this issue Oct 16, 2020 · 2 comments

Comments

@sleerssen

AWS EKS: 1.15/1.16
Cilium: 1.8.4
CNI Genie: genie-plugin@sha256:fbd3ad6db001035f270f9a7dc460de5145fc773cca3875ade505fa233a04ea08
genie-policy-controller@sha256:849551bc3ad1d8a74a49f264aad21191e97c0e5fdad08c20d7f7d07d9ea1e4e7

We have a cron job whose pods frequently go through a sandbox re-creation (SandboxChanged/FailedCreatePodSandBox). This appears to cause CNI Genie to leak IPs, because it never calls the underlying CNI to release them. Genie allocates the IP, but while the pod is being configured it finds that the container is no longer available (because the sandbox was recreated). It then appears to attempt to release the address, but the call to the underlying CNI is never made, so the IP stays marked as used in IPAM and the address space is eventually exhausted.

The kubelet log shows the IP allocation and the failed attempt to set up the network:

Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie Add IP address
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie workloadID= auth.opa-bundler-1602848580-6zdkj
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie orchestratorID= k8s
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie annot= [map[kubernetes.io/psp:eks.privileged]]
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie Found configuration files in /etc/cni/net.d: [/etc/cni/net.d/00-genie.conf /etc/cni/net.d/05-cilium.conf /etc/cni/net.d/10-aws.conflist]
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie no annotations is given! Using default plugins: [cilium],  annot is map[kubernetes.io/psp:eks.privileged]
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie plugion map: map[cilium:map[false:[1]]]
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie found configuration file (/etc/cni/net.d/05-cilium.conf) for plugin cilium
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie length of finalPluginInfos= 1
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie adding network for plugin element: {PluginName:cilium IfName:eth0 Subnet: Refer_nic: Config:0xc00009cd80 OptionalArgs:map[] ValidationParams:<nil> ValidateRes:<nil>}
Oct 16 11:43:22 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie runtime conf for plugin (cilium): {cf213c61546b92be1b4f767e2ed2cff2ca37c02a4139aa0b87c9fc030924162b /proc/19986/ns/net eth0 [[IgnoreUnknown 1] [K8S_POD_NAMESPACE auth] [K8S_POD_NAME opa-bundler-1602848580-6zdkj] [K8S_POD_INFRA_CONTAINER_ID cf213c61546b92be1b4f767e2ed2cff2ca37c02a4139aa0b87c9fc030924162b]] map[] }
Oct 16 11:43:23 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: W1016 11:43:23.695204    5828 pod_container_deletor.go:75] Container "cf213c61546b92be1b4f767e2ed2cff2ca37c02a4139aa0b87c9fc030924162b" not found in pod's containers
Oct 16 11:43:26 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie addNetwork (cilium) err: <nil>; result: Interfaces:[{Name:eth0 Mac:92:04:78:eb:b4:d8 Sandbox:/proc/19986/ns/net}], IP:[{Version:4 Interface:<nil> Address:{IP:10.136.99.17 Mask:ffffffff} Gateway:10.136.72.69}], Routes:[{Dst:{IP:10.136.72.69 Mask:ffffffff} GW:<nil>} {Dst:{IP:0.0.0.0 Mask:00000000} GW:10.136.72.69}], DNS:{Nameservers:[]Domain: Search:[] Options:[]}
Oct 16 11:43:26 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie End result= Interfaces:[{Name:eth0 Mac:92:04:78:eb:b4:d8 Sandbox:/proc/19986/ns/net}], IP:[{Version:4 Interface:<nil> Address:{IP:10.136.99.17 Mask:ffffffff} Gateway:10.136.72.69}], Routes:[{Dst:{IP:10.136.72.69 Mask:ffffffff} GW:10.136.72.69} {Dst:{IP:0.0.0.0 Mask:00000000} GW:10.136.72.69}], DNS:{Nameservers:[] Domain: Search:[] Options:[]}
Oct 16 11:43:26 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: W1016 11:43:26.444326    5828 docker_sandbox.go:384] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "opa-bundler-1602848580-6zdkj_auth": CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "cf213c61546b92be1b4f767e2ed2cff2ca37c02a4139aa0b87c9fc030924162b"

A few minutes later, the log shows an attempt to release the IP, but the CNI never receives the request:

Oct 16 11:47:19 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie releasing IP address
Oct 16 11:47:19 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie workloadID= auth.opa-bundler-1602848580-6zdkj
Oct 16 11:47:19 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie orchestratorID= k8s
Oct 16 11:47:19 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: CNI Genie no env var and no pod
Oct 16 11:47:20 ip-10-50-83-136.us-west-2.compute.internal kubelet[5828]: Pod annotations not found during pod delete, proceeding to deletepodW1016 11:47:20.090982    5828 cni.go:309] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "22de795bd8cb4ffcac7305f232f31932ffb6a5722eb580bb007f6fad149b759f"
@sleerssen
Author

I think I found the source of the leak. It looks like a call to deleteNetwork() is needed here:

fmt.Fprintf(os.Stderr, "CNI Genie error while setting pod status(%v): %v:\n", string(bytes), err)

I've been trying to get this to build locally, but the build fails with:

cannot find package "github.com/containernetworking/cni/pkg/types/current"

I guess maybe I need to be on a k8s node for this to build?

@antoniotamer

Bump.
