Wrong Cilium L2Announce Holder #32148
Comments
I don't think there is a requirement for the L2 announcement lease to be held on the same node as the target pod. Routing between the lease holder and the target pod should be handled by kube-proxy (or its replacement). In our setup, this works correctly for the ingress L2 announcement. However, we're seeing something similar: for a syslog DaemonSet, the L2 announcement only works with the pod on the same node as the lease (though Hubble shows traffic to pods on other nodes flowing fine). That is strange, because the behavior differs between two DaemonSet deployments on the same cluster. We think it might be related to #27151, although in our case it fails not only with externalIPs but with LoadBalancer services as well.
While researching and experimenting with this, I noticed something interesting. I'm setting up an Ingress using the Kubernetes Ingress cloud YAML below. When I delete and re-expose the ingress-nginx-controller LoadBalancer service created from that YAML for testing, it works even though, as you mentioned, the lease and the pods are on different nodes. I'm sharing the YAML files for both services.
---original---
---exposed---
I don't understand exactly what's happening, but it works.
I think I finally understood the problem.
This is my observation too. The real client IP is visible only to application pods colocated on the same node as the lease. The other pods see only the IP of the cilium_host interface of the lease-holding node. That is logical, since the traffic has to be moved onto the overlay network (which we're using). The solution for the client IP might be DSR mode. However, this is only available in native-routing mode, not in encapsulation mode.
We're experiencing this in both DSR-opt and Geneve mode. L2 announcement seems to be fine with both Local and Cluster externalTrafficPolicy, but traffic is only routed to the correct pod if the L2 announcement lease and the pod are on the same node. We are running 1.14.0.
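To make the symptom reported in these comments easier to check, here is a minimal Python sketch that splits a service's backend pods into those on the lease holder's node and those elsewhere. The function name `split_backends_by_lease_node` and the pod and node names are hypothetical; the inputs are assumed to have been collected by hand (for example from the lease's holder and the pods' node assignments), and nothing here talks to a cluster.

```python
def split_backends_by_lease_node(lease_holder, pod_nodes):
    """Split backend pods by whether they run on the lease holder's node.

    lease_holder: node name currently holding the L2 announcement lease.
    pod_nodes:    mapping of pod name -> node name (hypothetical sample data).
    Returns (local_pods, remote_pods), each sorted by pod name.
    """
    local = sorted(p for p, node in pod_nodes.items() if node == lease_holder)
    remote = sorted(p for p, node in pod_nodes.items() if node != lease_holder)
    return local, remote


# Hypothetical DaemonSet with one pod per worker node.
pods = {"syslog-a": "worker1", "syslog-b": "worker2"}
local, remote = split_backends_by_lease_node("worker2", pods)
print("on lease node:", local)
print("on other nodes:", remote)
```

If only the pods in the first list receive traffic (or see the real client IP), that matches the behavior described above.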
Is there an existing issue for this?
What happened?
Hello,
I have a Kubernetes cluster with 1 master and 2 worker nodes.
I have Kubernetes version 1.29.4 and Cilium version 1.15.4 installed. I want to use L2 announcement.
I was previously using MetalLB, but as part of the migration to Cilium I reinstalled with the following Helm command and YAML files. I'm experiencing issues with leases. For example, if the ingress controller is running on worker2, the lease cilium-l2announce-ingress-nginx-ingress-nginx-controller shows worker2 as its holder, and I can access my application from the browser. So far, everything is fine. However, when I shut down or drain worker2, the ingress controller is rescheduled onto worker1, as expected, but the lease holder still appears as worker2. The problem persists until I manually delete the lease or delete all Cilium pods, and in the meantime I cannot access the application. Can you help me find a solution to this problem?
Best regards.
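The stale-holder condition described above can be sketched as a small check. This is only an illustration, assuming the leases are available as dictionaries shaped like Kubernetes Lease objects (with `metadata.name` and `spec.holderIdentity`) and that the set of currently ready nodes is known. The function name `stale_l2_leases` is hypothetical, and the `cilium-l2announce-` prefix is taken from the lease name quoted in this report.

```python
from typing import Iterable


def stale_l2_leases(leases: Iterable[dict], ready_nodes: set) -> list:
    """Return (lease name, holder) pairs whose holder is not a ready node.

    Only leases whose name starts with "cilium-l2announce-" are considered.
    """
    stale = []
    for lease in leases:
        name = lease.get("metadata", {}).get("name", "")
        if not name.startswith("cilium-l2announce-"):
            continue  # not an L2 announcement lease
        holder = lease.get("spec", {}).get("holderIdentity")
        if holder and holder not in ready_nodes:
            stale.append((name, holder))
    return stale


# Sample data mirroring the scenario above: worker2 was drained,
# but the lease still names it as holder.
leases = [
    {
        "metadata": {"name": "cilium-l2announce-ingress-nginx-ingress-nginx-controller"},
        "spec": {"holderIdentity": "worker2"},
    }
]
result = stale_l2_leases(leases, ready_nodes={"worker1"})
print(result)
```

A non-empty result corresponds to the situation in this report, where the lease has to be deleted manually before the service becomes reachable again.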
Cilium Version
v1.15.4
Kernel Version
5.15.0-105.125.6.2.2.el9uek.x86_64
Kubernetes Version
v1.29.4
Regression
No response
Sysdump
No response
Relevant log output
No response
Anything else?
No response