k8s operator: tailscale ingress sometimes tries to connect to 127.0.0.1 instead of ClusterIP, fails with "netstack: could not connect to local server at ..." #12079

Open
garymm opened this issue May 9, 2024 · 2 comments

Comments

garymm commented May 9, 2024

What is the issue?

I'm really not sure how to reproduce this, but I've seen it a couple of times.

I set up two services (docker-registry and headlamp) with type ClusterIP, both listening on port 80.
Both services have a tailscale ingress.
This is a test cluster with only one node, so everything is on the same node.
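
For reference, the Ingresses look roughly like this (reconstructed from memory, so names and the tls hostname are approximate; the docker-registry one is analogous, pointing at the docker-registry Service on port 80):

kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: headlamp-ingress
  namespace: headlamp
spec:
  ingressClassName: tailscale
  defaultBackend:
    service:
      name: headlamp        # plain ClusterIP Service listening on port 80
      port:
        number: 80
  tls:
    - hosts:
        - berkeley-staging-headlamp
EOF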

When trying to connect, I see errors like this in the tailscale pod:

2024/05/09 22:55:03 Accept: TCP{100.115.199.49:52551 > 100.77.184.113:80} 64 tcp ok
2024/05/09 22:55:03 netstack: could not connect to local server at 127.0.0.1:80: dial tcp 127.0.0.1:80: connect: connection refused

Restarting the tailscale ingress pod seems to fix the issue.
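
("Restarting" here just means deleting the operator-created proxy pods and letting their StatefulSets recreate them, roughly:)

# pod names from "kubectl get pods -n tailscale"; the -0 suffix assumes the operator's usual StatefulSet naming
kubectl delete pod -n tailscale ts-docker-registry-ingress-4w69x-0 ts-headlamp-ingress-6mzrv-0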

I'm not a kubernetes expert, but it seems suspicious that tailscale is trying to connect to the service on 127.0.0.1:80 rather than using the ClusterIP.
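
One thing I can try next time it happens, to rule out a plain connectivity problem from the proxy pod to the backend (ClusterIP taken from the kubectl output below; this assumes the proxy image ships busybox wget):

# dial the docker-registry Service by ClusterIP from inside its proxy pod
kubectl exec -n tailscale ts-docker-registry-ingress-4w69x-0 -- \
  wget -qO- -T 2 http://10.233.22.50:80/v2/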

# kubectl get svc -A
NAMESPACE         NAME                               TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                  AGE
default           kubernetes                         ClusterIP   10.233.0.1     <none>        443/TCP                  30m
docker-registry   docker-registry                    ClusterIP   10.233.22.50   <none>        80/TCP                   17m
headlamp          headlamp                           ClusterIP   10.233.49.34   <none>        80/TCP                   27m
kube-system       coredns                            ClusterIP   10.233.0.3     <none>        53/UDP,53/TCP,9153/TCP   29m
tailscale         ts-docker-registry-ingress-4w69x   ClusterIP   None           <none>        <none>                   16m
tailscale         ts-headlamp-ingress-6mzrv          ClusterIP   None           <none>        <none>                   17m
# kubectl get ingress -A
NAMESPACE         NAME                      CLASS       HOSTS   ADDRESS                                      PORTS     AGE
docker-registry   docker-registry-ingress   tailscale   *       berkeley-staging-docker.taila1eba.ts.net     80, 443   36m
headlamp          headlamp-ingress          tailscale   *       berkeley-staging-headlamp.taila1eba.ts.net   80, 443   36m

I am able to connect to both services simultaneously using kubectl port-forward, so I'm pretty sure this is not an inherent limitation of my kubernetes setup.
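
Roughly what I ran for that check (local ports arbitrary, each port-forward in its own terminal):

kubectl port-forward -n docker-registry svc/docker-registry 8080:80
kubectl port-forward -n headlamp svc/headlamp 8081:80
# both respond at the same time
curl -s http://localhost:8080/v2/
curl -s http://localhost:8081/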

Steps to reproduce

No response

Are there any recent changes that introduced the issue?

No response

OS

Linux

OS version

kubernetes

Tailscale version

1.62.1

Other software

calico CNI

Bug report

BUG-893cd6c8f00a44fda54bd05672e58e601316cc21d2892024305ff3afa3f4c675-20240509231821Z-5c7d259ff6d31521

garymm changed the title from 'k8s operator: tailscale ingress uses 127.0.0.1 rather than ClusterIP, causing failure when two services use the same port' to 'k8s operator: tailscale ingress sometimes fails with "netstack: could not connect to local server at ..."' on May 9, 2024

garymm commented May 13, 2024

I'm seeing this again. I tried upgrading to 1.64.2 (latest helm chart) and deleting all the pods, and this time I can't figure out a way to fix it.
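
Roughly the upgrade steps (release and namespace names as in my cluster; chart repo per the Tailscale docs):

helm repo add tailscale https://pkgs.tailscale.com/helmcharts
helm repo update
helm upgrade tailscale-operator tailscale/tailscale-operator -n tailscale
# then delete everything in the tailscale namespace so the pods come back on the new version
kubectl delete pod -n tailscale --all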

garymm changed the title from 'k8s operator: tailscale ingress sometimes fails with "netstack: could not connect to local server at ..."' to 'k8s operator: tailscale ingress sometimes tries to connect to 127.0.0.1 instead of ClusterIP, fails with "netstack: could not connect to local server at ..."' on May 13, 2024

garymm commented May 13, 2024

Restarting the kubernetes host seems to have fixed it, at least for now.
