
nodes with multiple network interfaces can fail to talk to services #102

Closed
pires opened this issue Jan 4, 2017 · 64 comments · Fixed by kubernetes/kubernetes#39440
Labels: area/ecosystem, kind/bug, priority/backlog

Comments

@pires (Contributor) commented Jan 4, 2017

UPDATE as of Feb 7th, 2018: at the request of @bboreham, I've edited the title so it does not mislead people looking for an unrelated issue.

As reported by @damaspi:

When I deploy a demo application, I get the same message as above (Error syncing pod, skipping: failed to "SetupNetwork").

When I check the logs of the proxy pod (kubectl logs kube-proxy-g7qh1 --namespace=kube-system), I get the following: proxier.go:254] clusterCIDR not specified, unable to distinguish between internal and external traffic

@pires changed the title to "kube-proxy can't distinguish internal and external traffic" on Jan 4, 2017
@pires (Contributor, Author) commented Jan 4, 2017

@damaspi I have opened this issue and provided a fix. Waiting on feedback!

Also, moving to userspace mode brings quite a performance penalty.

@damaspi commented Jan 4, 2017

Sorry, I commented in the wrong issue.
Thanks for the fix. I won't be able to test it soon, though (I was working on this during the holidays and am back at work now), and I was using only the official stable version (so I don't have the environment to build it).

@damaspi commented Jan 4, 2017

I copied it here now and deleted it in the other issue...

I worked around it temporarily by configuring proxy-mode to userspace, but any advice is welcome...

(inspired by this issue)

kubectl -n kube-system get ds -l "component=kube-proxy" -o json | jq ".items[0].spec.template.spec.containers[0].command |= .+ [\"--proxy-mode=userspace\"]" | kubectl apply -f - && kubectl -n kube-system delete pods -l "component=kube-proxy"

@pires (Contributor, Author) commented Jan 4, 2017

Again, @damaspi

Also, moving to userspace mode brings quite a performance penalty.

@bvandewalle commented:

I had the same issue.
My kube-proxy would not install the Service-related rules, making any Service unreachable from the pods.

My fix was to modify the kubeadm DaemonSet for kube-proxy and explicitly add the --cluster-cidr= option.
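
For anyone wanting to reproduce that fix, here is a rough sketch; the DaemonSet name kube-proxy and the pod CIDR 10.244.0.0/16 are assumptions, so substitute your own pod network range:

    # Append --cluster-cidr to the kube-proxy container command and re-apply the DaemonSet,
    # then delete the kube-proxy pods so they restart with the new flag.
    kubectl -n kube-system get ds kube-proxy -o json \
      | jq '.spec.template.spec.containers[0].command += ["--cluster-cidr=10.244.0.0/16"]' \
      | kubectl apply -f - \
      && kubectl -n kube-system delete pods -l "component=kube-proxy"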

@pires (Contributor, Author) commented Jan 19, 2017

/cc @luxas

@mikedanese (Member) commented:

@spxtr you are closing a bunch of issues in this repo

@pires (Contributor, Author) commented Feb 2, 2017

@mikedanese PRs are being merged, and one merged PR fixed the missing --cluster-cidr flag in the controller-manager.

@mikedanese (Member) commented:

@pires, the merge of the PR in the main repo is not what closed this issue; it was the merge into @spxtr's branch. That's what concerns me.

@pires (Contributor, Author) commented Feb 2, 2017

Ah I've seen it before indeed.

@ronaldpetty commented:

I have seen this on 1.5.2. I am manually building a cluster (to learn). I am unclear what the fix is, as there is mention of the controller-manager and a DaemonSet; that implies to me that people are launching kube-proxy via a DaemonSet. Just to clarify, the actual fix is to add the flag (--cluster-cidr) to kube-proxy, correct? Just trying to make sure I am not missing something. Also, to refresh my memory, didn't kube-proxy use to get this from the kube-apiserver? Was it always needed? I can't remember. If it doesn't, can someone clarify the difference between --service-cluster-ip-range=10.0.0.0/16 (apiserver) and --cluster-cidr (proxy)? Thanks. (Sorry to ask here; not sure where else to ask about this issue.)

@pires (Contributor, Author) commented Feb 10, 2017

Where did the API server expose the cluster pod CIDR? This was a misconception on my side as well.

@ronaldpetty commented:

Hi @pires, I thought --service-cluster-ip-range=10.0.0.0/16 on the apiserver set it all up, since the proxies would talk to the Kubernetes API server to get that information. Maybe --cluster-cidr was meant to be a subset of --service-cluster-ip-range; otherwise it seems redundant, or there is a use case I am unclear about (or I just don't know what I am talking about, which could be true!).

@pires (Contributor, Author) commented Feb 10, 2017

Service CIDR is the subnet used for virtual IPs (used by kube-proxy). The problem is that kube-proxy doesn't know about the pod network CIDR, which is different from the service CIDR.
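
To make the distinction concrete, here is a minimal sketch of where each range is configured; the CIDR values below are illustrative assumptions, not taken from this thread:

    # apiserver: the service (virtual IP) range, from which ClusterIPs like 10.96.0.1 are allocated
    kube-apiserver --service-cluster-ip-range=10.96.0.0/12 ...

    # kube-proxy: the pod network range, used to tell pod traffic apart from external traffic
    # when deciding what to masquerade
    kube-proxy --cluster-cidr=10.244.0.0/16 ...

    # controller-manager: the same pod network range, if it allocates per-node pod CIDRs
    kube-controller-manager --allocate-node-cidrs=true --cluster-cidr=10.244.0.0/16 ...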

@ronaldpetty commented:

Ah, so would that be the overlay?

@bamb00 commented Feb 22, 2017

Would this issue affect communication between a pod and the apiserver? For example, if I run curl https://10.96.0.1:443/api from a pod, the result is: curl: (7) Failed to connect to 10.96.0.1 port 443: Connection timed out...

@thockin (Member) commented May 31, 2017

I just had a look at the clusterCIDR logic in kube-proxy, and I agree that is a weird corner case.

I agree the static route is appropriate for the 2nd interface, but it's unfortunate. It feels like the kernel should be smarter than that.

@bamb00 commented Jun 6, 2017

I'm running v1.6.1 and thought the error "clusterCIDR not specified, unable to distinguish between internal and external traffic" would be addressed.

2017-06-06T17:49:17.113224501Z I0606 17:49:17.112870 1 server.go:225] Using iptables Proxier.
2017-06-06T17:49:17.139584294Z W0606 17:49:17.139190 1 proxier.go:309] clusterCIDR not specified, unable to distinguish between internal and external traffic
2017-06-06T17:49:17.139607413Z I0606 17:49:17.139223 1 server.go:249] Tearing down userspace rules.
2017-06-06T17:49:17.251412491Z I0606 17:49:17.251115 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_max' to 524288
2017-06-06T17:49:17.252499164Z I0606 17:49:17.252359 1 conntrack.go:66] Setting conntrack hashsize to 131072
2017-06-06T17:49:17.253220249Z I0606 17:49:17.253057 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
2017-06-06T17:49:17.253246216Z I0606 17:49:17.253124 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600

@timchenxiaoyu commented:

How do you define internal vs. external traffic?

@thockin (Member) commented Jun 9, 2017 via email

@kfox1111 commented Jul 5, 2017

I've seen this problem too. A route to the pod network via the second NIC resolved the issue for me. Feels a little fragile, though...
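
For reference, a sketch of what such a route can look like, assuming the pod network is 10.244.0.0/16 and the cluster-facing NIC is eth1 with next hop 10.80.4.1 (all example values, adjust to your environment):

    # Route pod-network traffic out the cluster-facing interface instead of the default route
    sudo ip route add 10.244.0.0/16 via 10.80.4.1 dev eth1
    # Confirm which interface and source address the kernel will now pick
    ip route get 10.244.0.5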

berryjam pushed a commit to berryjam/kubernetes that referenced this issue Aug 18, 2017
@bamb00 commented Sep 14, 2017

Hi,

I'm running Kubernetes v1.6.6 & v1.7.0 kube-proxy and am getting the same error:

kube-proxy:

   W0914 00:15:41.627710       1 proxier.go:298] clusterCIDR not specified, unable to distinguish between internal and external traffic

Kubernetes version:

   Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:34:20Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}
   Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:21:54Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}

I tried the workaround from @damaspi, but it failed in v1.6.6 and v1.7.0; it used to work in v1.5.4.

  # kubectl -n kube-system get ds -l "component=kube-proxy" -o json | jq '.items[0].spec.template.spec.containers[0].command |= .+ ["--cluster-cidr=10.96.0.0/12"]' | kubectl apply -f - && kubectl -n kube-system delete pods -l "component=kube-proxy"
  
    error: error validating "STDIN": error validating data: items[0].apiVersion not set; if you choose to ignore these errors, turn validation off with --validate=false

Need guidance to resolve in v1.6.6 & v1.7.0. Thanks.
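
One possible way around that validation error (a sketch, assuming the kubeadm DaemonSet is named kube-proxy; this is not an official fix from this thread) is to skip the jq round-trip and edit the DaemonSet directly. Note also that --cluster-cidr expects the pod network CIDR, whereas 10.96.0.0/12 is typically the service CIDR:

    # Add --cluster-cidr=<pod network CIDR> to the kube-proxy container command, save and exit
    kubectl -n kube-system edit daemonset kube-proxy
    # Recreate the kube-proxy pods so they pick up the new flag
    kubectl -n kube-system delete pods -l "component=kube-proxy"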

@jamiehannaford (Contributor) commented:

@bboreham

I don't really know what kubeadm could do, since the solution seems to relate to the underlying network. Maybe add options to inform your desired "public interface" and "private interface" and have kubeadm recommend network config changes?

I don't think kubeadm should be spitting out OS or distro-specific configuration instructions for host networking. I think it's the responsibility of the operator to configure their host appropriately because otherwise it becomes a rabbit hole. We can certainly make it a requirement, though.

What should kubeadm expect for things to work? That if the user wants to use a non-default NIC, they need to add a static route in Linux? Is this a general enough use-case for us to add it as a system requirement?

@jamiehannaford (Contributor) commented:

@bboreham Any ideas on how we can improve our documentation here? Otherwise I'm in favour of closing this because:

  1. it seems to relate to a user's network environment, not kubeadm
  2. there's no single way to clarify those expectations

@bboreham commented Nov 2, 2017

[Aside: it bugs me I have to read up and down and through other issues to page the context back in. The problem people wanted resolved is absolutely nothing to do with the title of this issue]

In the setup docs you could say "if you have more than one network adapter, and your Kubernetes components are not reachable on the default route, we recommend you add IP route(s) so Kubernetes cluster addresses go via the appropriate adapter".

@jamiehannaford (Contributor) commented:

> [Aside: it bugs me I have to read up and down and through other issues to page the context back in. The problem people wanted resolved is absolutely nothing to do with the title of this issue]

You are not the only one! 😅

> In the setup docs you could say "if you have more than one network adapter, and your Kubernetes components are not reachable on the default route, we recommend you add IP route(s) so Kubernetes cluster addresses go via the appropriate adapter".

Cool, I'll try to submit a docs PR for this tomorrow and close this out.

@jamiehannaford (Contributor) commented:

This is now documented in kubernetes/website#6265, so I'm going to close.

This issue seems to track a few different problems at once, so if you're still running into a potential bug, please open a new issue so we can better target the root cause.

@mindscratch commented:

FWIW, if you use kubeadm to start the cluster and specify --pod-network-cidr, that value gets passed to kube-proxy as --cluster-cidr when it starts. For example, Weave defaults to 10.32.0.0/12, so I used kubeadm init --kubernetes-version=v1.8.4 --pod-network-cidr=10.32.0.0/12, which started kube-proxy with --cluster-cidr=10.32.0.0/12.
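
To verify what kube-proxy actually ended up with, one option (assuming a kubeadm-created DaemonSet named kube-proxy; in newer kubeadm versions the setting may live in the kube-proxy ConfigMap instead) is to inspect the container command or the startup log line:

    # Show the kube-proxy container command, including any --cluster-cidr flag
    kubectl -n kube-system get ds kube-proxy -o jsonpath='{.spec.template.spec.containers[0].command}'
    # Or grep the kube-proxy logs for the clusterCIDR line
    kubectl -n kube-system logs -l "component=kube-proxy" | grep -i clustercidr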

@bamb00 commented Nov 29, 2017

@bboreham I'm new to this... Would there be an example of how to implement your suggestion to "add IP route(s) so Kubernetes cluster addresses go via the appropriate adapter"?

@bboreham commented:

@bamb00 scroll up; there is an example at #102 (comment)

Caution: a wrong step may result in your machine becoming inaccessible. Generally it will come back after a reboot, unless you configured the bad route to be applied on startup.

I do not know an easy way to learn Linux network configuration.

@bboreham commented:

@mindscratch do note this issue has nothing to do with "cluster-cidr"; that was a red herring eliminated around seven months ago. Please open a new issue if you are having new problems.

@pires changed the title from "kube-proxy can't distinguish internal and external traffic" to "nodes with multiple network interfaces can fail to talk to services" on Feb 7, 2018
@SpComb commented Mar 6, 2018

Semi-serious suggestion for fixing this specific case without requiring the kube-proxy to use ! -s $podCIDR to distinguish host source address:

$ sudo ip ro add local 10.96.0.0/12 table local dev lo
$ sudo iptables -t nat -I KUBE-SERVICES -s 10.96.0.0/12 -d 10.96.0.0/12 -j KUBE-MARK-MASQ

(or possibly some variation with an explicit ... src 10.96.0.0 on the local route... the table local is probably also unnecessary and a bad idea)

$ ip ro get 10.96.0.1
local 10.96.0.1 dev lo  src 10.96.0.1 
    cache <local> 
$ curl -vk https://10.96.0.1
...
* Connected to 10.96.0.1 (10.96.0.1) port 443 (#0)
11:32:20.671085 0c:c4:7a:54:0a:e6 > 44:aa:50:04:3d:00, ethertype IPv4 (0x0800), length 74: 10.80.4.149.59334 > 10.80.4.147.6443: Flags [S], seq 2286812584, win 43690, options [mss 65495,sackOK,TS val 209450 ecr 0,nop,wscale 8], length 0
11:32:20.671239 44:aa:50:04:3d:00 > 0c:c4:7a:54:0a:e6, ethertype IPv4 (0x0800), length 74: 10.80.4.147.6443 > 10.80.4.149.59334: Flags [S.], seq 1684666695, ack 2286812585, win 28960, options [mss 1460,sackOK,TS val 208877 ecr 209450,nop,wscale 8], length 0
11:32:20.671315 0c:c4:7a:54:0a:e6 > 44:aa:50:04:3d:00, ethertype IPv4 (0x0800), length 66: 10.80.4.149.59334 > 10.80.4.147.6443: Flags [.], ack 1, win 171, options [nop,nop,TS val 209450 ecr 208877], length 0

However, I have no idea if that covers all of the expected behaviors of those source-specific kube-proxy MASQ rules...

EDIT: this also has all kinds of side-effects for connections to unconfigured service VIPs... they will end up connecting to any matching host network namespace services.

EDIT2: However, even that is probably better than the current behavior of leaking connections to unconfigured 10.96.X.Y service VIPs out via the default route... which is vaguely unsettling
