network between kubernetes PODs is down after one flanneld is stopped and datastore can't be reached #636

Closed
Dieken opened this issue Mar 12, 2017 · 6 comments

Dieken commented Mar 12, 2017

I'm experimenting with various kinds of failure in a Kubernetes cluster, and I found a strange problem.

My steps:

  1. Start a 4-node Kubernetes cluster with kubeadm and Vagrant, using flannel vxlan for the network among pods.
  2. Stop all etcd containers so that the Kubernetes apiserver stops working (expected).
  3. Suppose node01 (192.168.200.201) has pod CIDR 172.16.0.0/24 and node04 (192.168.200.204) has pod CIDR 172.16.3.0/24. After etcd and the apiserver stop working, flanneld on node01 and node04 keeps running, which is excellent. I can ping a container (172.16.3.10) on node04 both from the node01 host and from a container on node01. Both work, very good!
  4. Now I stop flanneld on node04 but leave flanneld on node01 running. Ping from the node01 host to the container (172.16.3.10) on node04 still works, but ping from a container on node01 to 172.16.3.10 on node04 no longer works. Why?

Ping from node01 to containers on node02 and node03 still works.

I assume flanneld belongs to the network control plane, so its exit shouldn't interrupt the data plane: the flannel.1 vxlan interface and the cni0 bridge on node04 still exist after flanneld on node04 exits.

I tried setting rp_filter to 0 on all interfaces of node01 and node04, but it didn't help.

The host OS is the latest Ubuntu 16.04 (the ubuntu/xenial box shipped by Vagrant).
Kubernetes v1.5.4 and Flannel v0.7.0.

@Dieken changed the title from "network between kubernetes PODs is down after one flanned is stopped" to "network between kubernetes PODs is down after one flanneld is stopped" on Mar 12, 2017

Dieken commented Mar 12, 2017

Bad case: ping from a container on node01 to a container on node04. iptables trace in /var/log/syslog on node04:

Mar 12 17:21:14 node04 kernel: [ 3452.909716] TRACE: raw:PREROUTING:policy:2 IN=flannel.1 OUT= MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909724] TRACE: nat:PREROUTING:rule:1 IN=flannel.1 OUT= MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909732] TRACE: nat:KUBE-SERVICES:return:6 IN=flannel.1 OUT= MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909736] TRACE: nat:PREROUTING:policy:3 IN=flannel.1 OUT= MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909742] TRACE: filter:FORWARD:rule:1 IN=flannel.1 OUT=cni0 MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909746] TRACE: filter:DOCKER-ISOLATION:return:1 IN=flannel.1 OUT=cni0 MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909750] TRACE: filter:FORWARD:policy:6 IN=flannel.1 OUT=cni0 MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909753] TRACE: nat:POSTROUTING:rule:1 IN= OUT=cni0 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909756] TRACE: nat:KUBE-POSTROUTING:return:2 IN= OUT=cni0 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909759] TRACE: nat:POSTROUTING:rule:3 IN= OUT=cni0 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909761] TRACE: nat:POSTROUTING:policy:6 IN= OUT=cni0 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909799] TRACE: raw:PREROUTING:policy:2 IN=cni0 OUT= PHYSIN=veth63e31496 MAC=0a:58:ac:10:03:01:0a:58:ac:10:03:0c:08:00 SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909808] TRACE: filter:FORWARD:rule:1 IN=cni0 OUT=flannel.1 PHYSIN=veth63e31496 MAC=0a:58:ac:10:03:01:0a:58:ac:10:03:0c:08:00 SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909812] TRACE: filter:DOCKER-ISOLATION:return:1 IN=cni0 OUT=flannel.1 PHYSIN=veth63e31496 MAC=0a:58:ac:10:03:01:0a:58:ac:10:03:0c:08:00 SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node04 kernel: [ 3452.909816] TRACE: filter:FORWARD:policy:6 IN=cni0 OUT=flannel.1 PHYSIN=veth63e31496 MAC=0a:58:ac:10:03:01:0a:58:ac:10:03:0c:08:00 SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1

Mar 12 17:21:20 node04 kernel: [ 3458.914902] TRACE: raw:OUTPUT:policy:2 IN= OUT=cni0 SRC=172.16.3.1 DST=172.16.3.12 LEN=112 TOS=0x00 PREC=0xC0 TTL=64 ID=11995 PROTO=ICMP TYPE=3 CODE=1 [SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1 ]
Mar 12 17:21:20 node04 kernel: [ 3458.914925] TRACE: filter:OUTPUT:rule:1 IN= OUT=cni0 SRC=172.16.3.1 DST=172.16.3.12 LEN=112 TOS=0x00 PREC=0xC0 TTL=64 ID=11995 PROTO=ICMP TYPE=3 CODE=1 [SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1 ]
Mar 12 17:21:20 node04 kernel: [ 3458.914934] TRACE: filter:KUBE-SERVICES:return:1 IN= OUT=cni0 SRC=172.16.3.1 DST=172.16.3.12 LEN=112 TOS=0x00 PREC=0xC0 TTL=64 ID=11995 PROTO=ICMP TYPE=3 CODE=1 [SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1 ]
Mar 12 17:21:20 node04 kernel: [ 3458.914940] TRACE: filter:OUTPUT:rule:2 IN= OUT=cni0 SRC=172.16.3.1 DST=172.16.3.12 LEN=112 TOS=0x00 PREC=0xC0 TTL=64 ID=11995 PROTO=ICMP TYPE=3 CODE=1 [SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1 ]
Mar 12 17:21:20 node04 kernel: [ 3458.914947] TRACE: filter:KUBE-FIREWALL:return:2 IN= OUT=cni0 SRC=172.16.3.1 DST=172.16.3.12 LEN=112 TOS=0x00 PREC=0xC0 TTL=64 ID=11995 PROTO=ICMP TYPE=3 CODE=1 [SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1 ]
Mar 12 17:21:20 node04 kernel: [ 3458.914954] TRACE: filter:OUTPUT:policy:3 IN= OUT=cni0 SRC=172.16.3.1 DST=172.16.3.12 LEN=112 TOS=0x00 PREC=0xC0 TTL=64 ID=11995 PROTO=ICMP TYPE=3 CODE=1 [SRC=172.16.3.12 DST=172.16.0.7 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=27538 PROTO=ICMP TYPE=0 CODE=0 ID=492 SEQ=1 ]

Trace on node01 (no ICMP reply arrives):

Mar 12 17:21:14 node01 kernel: [ 3453.096969] TRACE: raw:PREROUTING:policy:2 IN=cni0 OUT= PHYSIN=vethf912d941 MAC=0a:58:ac:10:00:01:0a:58:ac:10:00:07:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.096976] TRACE: nat:PREROUTING:rule:1 IN=cni0 OUT= PHYSIN=vethf912d941 MAC=0a:58:ac:10:00:01:0a:58:ac:10:00:07:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.096985] TRACE: nat:KUBE-SERVICES:return:6 IN=cni0 OUT= PHYSIN=vethf912d941 MAC=0a:58:ac:10:00:01:0a:58:ac:10:00:07:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.096989] TRACE: nat:PREROUTING:policy:3 IN=cni0 OUT= PHYSIN=vethf912d941 MAC=0a:58:ac:10:00:01:0a:58:ac:10:00:07:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.097006] TRACE: filter:FORWARD:rule:1 IN=cni0 OUT=flannel.1 PHYSIN=vethf912d941 MAC=0a:58:ac:10:00:01:0a:58:ac:10:00:07:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.097011] TRACE: filter:DOCKER-ISOLATION:return:1 IN=cni0 OUT=flannel.1 PHYSIN=vethf912d941 MAC=0a:58:ac:10:00:01:0a:58:ac:10:00:07:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.097014] TRACE: filter:FORWARD:policy:6 IN=cni0 OUT=flannel.1 PHYSIN=vethf912d941 MAC=0a:58:ac:10:00:01:0a:58:ac:10:00:07:08:00 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.097017] TRACE: nat:POSTROUTING:rule:2 IN= OUT=flannel.1 PHYSIN=vethf912d941 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.097021] TRACE: nat:KUBE-POSTROUTING:return:2 IN= OUT=flannel.1 PHYSIN=vethf912d941 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.097023] TRACE: nat:POSTROUTING:rule:3 IN= OUT=flannel.1 PHYSIN=vethf912d941 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1
Mar 12 17:21:14 node01 kernel: [ 3453.097026] TRACE: nat:POSTROUTING:policy:6 IN= OUT=flannel.1 PHYSIN=vethf912d941 SRC=172.16.0.7 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=56818 DF PROTO=ICMP TYPE=8 CODE=0 ID=492 SEQ=1

Good case: ping from the node01 host to a container on node04. iptables trace in /var/log/syslog on node04:

Mar 12 17:26:36 node04 kernel: [ 3774.709092] TRACE: raw:PREROUTING:policy:2 IN=flannel.1 OUT= MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709099] TRACE: nat:PREROUTING:rule:1 IN=flannel.1 OUT= MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709107] TRACE: nat:KUBE-SERVICES:return:6 IN=flannel.1 OUT= MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709154] TRACE: nat:PREROUTING:policy:3 IN=flannel.1 OUT= MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709162] TRACE: filter:FORWARD:rule:1 IN=flannel.1 OUT=cni0 MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709166] TRACE: filter:DOCKER-ISOLATION:return:1 IN=flannel.1 OUT=cni0 MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709169] TRACE: filter:FORWARD:policy:6 IN=flannel.1 OUT=cni0 MAC=fe:bf:08:13:d7:4a:ea:f6:36:35:fd:f9:08:00 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709172] TRACE: nat:POSTROUTING:rule:1 IN= OUT=cni0 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709176] TRACE: nat:KUBE-POSTROUTING:return:2 IN= OUT=cni0 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709178] TRACE: nat:POSTROUTING:rule:3 IN= OUT=cni0 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709181] TRACE: nat:POSTROUTING:policy:6 IN= OUT=cni0 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709218] TRACE: raw:PREROUTING:policy:2 IN=cni0 OUT= PHYSIN=veth63e31496 MAC=0a:58:ac:10:03:01:0a:58:ac:10:03:0c:08:00 SRC=172.16.3.12 DST=172.16.0.0 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=31680 PROTO=ICMP TYPE=0 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709226] TRACE: filter:FORWARD:rule:1 IN=cni0 OUT=flannel.1 PHYSIN=veth63e31496 MAC=0a:58:ac:10:03:01:0a:58:ac:10:03:0c:08:00 SRC=172.16.3.12 DST=172.16.0.0 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=31680 PROTO=ICMP TYPE=0 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709230] TRACE: filter:DOCKER-ISOLATION:return:1 IN=cni0 OUT=flannel.1 PHYSIN=veth63e31496 MAC=0a:58:ac:10:03:01:0a:58:ac:10:03:0c:08:00 SRC=172.16.3.12 DST=172.16.0.0 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=31680 PROTO=ICMP TYPE=0 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node04 kernel: [ 3774.709234] TRACE: filter:FORWARD:policy:6 IN=cni0 OUT=flannel.1 PHYSIN=veth63e31496 MAC=0a:58:ac:10:03:01:0a:58:ac:10:03:0c:08:00 SRC=172.16.3.12 DST=172.16.0.0 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=31680 PROTO=ICMP TYPE=0 CODE=0 ID=27983 SEQ=1

Trace on node01 (the ICMP reply arrives):

Mar 12 17:26:36 node01 kernel: [ 3774.885906] TRACE: raw:OUTPUT:policy:2 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885914] TRACE: nat:OUTPUT:rule:1 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885919] TRACE: nat:KUBE-SERVICES:return:6 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885922] TRACE: nat:OUTPUT:policy:3 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885925] TRACE: filter:OUTPUT:rule:1 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885928] TRACE: filter:KUBE-SERVICES:return:1 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885931] TRACE: filter:OUTPUT:rule:2 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885934] TRACE: filter:KUBE-FIREWALL:return:2 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885936] TRACE: filter:OUTPUT:policy:3 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885939] TRACE: nat:POSTROUTING:rule:2 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885942] TRACE: nat:KUBE-POSTROUTING:return:2 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885945] TRACE: nat:POSTROUTING:rule:3 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.885948] TRACE: nat:POSTROUTING:policy:6 IN= OUT=flannel.1 SRC=172.16.0.0 DST=172.16.3.12 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=40486 DF PROTO=ICMP TYPE=8 CODE=0 ID=27983 SEQ=1 UID=0 GID=0
Mar 12 17:26:36 node01 kernel: [ 3774.897055] TRACE: raw:PREROUTING:policy:2 IN=flannel.1 OUT= MAC=ea:f6:36:35:fd:f9:fe:bf:08:13:d7:4a:08:00 SRC=172.16.3.12 DST=172.16.0.0 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=31680 PROTO=ICMP TYPE=0 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node01 kernel: [ 3774.897063] TRACE: filter:INPUT:rule:1 IN=flannel.1 OUT= MAC=ea:f6:36:35:fd:f9:fe:bf:08:13:d7:4a:08:00 SRC=172.16.3.12 DST=172.16.0.0 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=31680 PROTO=ICMP TYPE=0 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node01 kernel: [ 3774.897067] TRACE: filter:KUBE-FIREWALL:return:2 IN=flannel.1 OUT= MAC=ea:f6:36:35:fd:f9:fe:bf:08:13:d7:4a:08:00 SRC=172.16.3.12 DST=172.16.0.0 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=31680 PROTO=ICMP TYPE=0 CODE=0 ID=27983 SEQ=1
Mar 12 17:26:36 node01 kernel: [ 3774.897070] TRACE: filter:INPUT:policy:2 IN=flannel.1 OUT= MAC=ea:f6:36:35:fd:f9:fe:bf:08:13:d7:4a:08:00 SRC=172.16.3.12 DST=172.16.0.0 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=31680 PROTO=ICMP TYPE=0 CODE=0 ID=27983 SEQ=1
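
For anyone comparing traces like the ones above, a small parser can make the difference easier to spot: in the bad case node01 never logs an echo reply (TYPE=0), and node04 logs an ICMP error (TYPE=3 CODE=1) a few seconds later. The Python sketch below is not part of the original report. The names `parse_trace_line` and `icmp_types_seen` are made up for illustration, and it assumes the kernel log layout shown in this thread.

```python
import re

# "TRACE: table:chain:kind:rulenum " prefix written by the iptables TRACE target.
TRACE_RE = re.compile(
    r"TRACE: (?P<table>\w+):(?P<chain>[\w-]+):(?P<kind>\w+):(?P<num>\d+) "
)
# KEY=VALUE packet fields; IN= and OUT= may carry an empty value.
FIELD_RE = re.compile(r"\b([A-Z]+)=(\S*)")


def parse_trace_line(line):
    """Return a dict for one TRACE log line, or None if the line is not a trace."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    # ICMP errors quote the offending packet inside [...]; drop that part so
    # its SRC/DST/TYPE do not get mixed up with the outer packet's fields.
    outer = line[m.end():].split("[", 1)[0]
    record = m.groupdict()
    for key, value in FIELD_RE.findall(outer):
        # ID= appears twice (IP id, then ICMP id); keep the first occurrence.
        record.setdefault(key, value)
    return record


def icmp_types_seen(lines):
    """Collect the ICMP TYPE values of all traced ICMP packets in `lines`."""
    types = set()
    for line in lines:
        record = parse_trace_line(line)
        if record and record.get("PROTO") == "ICMP" and "TYPE" in record:
            types.add(record["TYPE"])
    return types
```

Feeding it the node01 lines of each case would be one way to check for the reply: `"0" in icmp_types_seen(lines)` is true only when an echo reply was traced.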

Dieken commented Mar 13, 2017

Here is the network topology. It is probably not very accurate, especially the relationship between the veth devices and cni0, but it should be enough to understand the network data flow.

[image: network topology diagram]

Dieken commented Mar 13, 2017

I got it. Once flanneld on node04 stopped, it couldn't start again because the Kubernetes apiserver doesn't work without etcd. That left nobody (it used to be flanneld on node04) to automatically populate the ARP table of the flannel vxlan interface on node04 with entries mapping node01's pod IPs to the MAC of node01's flannel vxlan interface. So pods on every node other than node04 couldn't be reached from node04 because of ARP misses. This can be confirmed by running this command on node04:

sudo arp -i flannel.1 -s 172.16.0.3  MAC-of-flannel.1-on-node01

Then ping from 172.16.0.3 to 172.16.3.10 works.
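
The manual workaround above needs one static entry per remote pod IP, so it can help to generate the commands rather than type them. The following Python sketch only builds the `arp -s` argument lists and executes nothing. The function name `static_arp_commands` is hypothetical, and the IPs and MAC in the usage part are placeholders: the MAC is the one that appears as the source address on node04's flannel.1 in the traces earlier in this thread, which is presumably node01's flannel.1, but verify it on your own nodes.

```python
import ipaddress
import re

MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def static_arp_commands(pod_ips, remote_mac, dev="flannel.1"):
    """Build one `arp -i DEV -s IP MAC` argv list per remote pod IP."""
    mac = remote_mac.lower()
    if not MAC_RE.match(mac):
        raise ValueError(f"not a MAC address: {remote_mac!r}")
    commands = []
    for ip in pod_ips:
        # Raises a ValueError subclass if the address is malformed.
        ipaddress.IPv4Address(ip)
        commands.append(["arp", "-i", dev, "-s", ip, mac])
    return commands


if __name__ == "__main__":
    # Placeholder values; run the printed commands on the node whose
    # flanneld is down (node04 in this report).
    for argv in static_arp_commands(["172.16.0.3", "172.16.0.7"], "ea:f6:36:35:fd:f9"):
        print("sudo " + " ".join(argv))
```

This is only a stopgap for the experiment described here: entries added this way are not maintained by anything and have to be removed or updated by hand if a remote node's flannel.1 MAC changes.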

Dieken commented Mar 13, 2017

I think it would be better if flanneld checked the bridge fdb and the subnet lease before exiting because of a broken k8s apiserver. If the fdb and the subnet lease are still valid, flanneld could do its best to keep injecting ARP table entries.

@tomdee changed the title from "network between kubernetes PODs is down after one flanneld is stopped" to "network between kubernetes PODs is down after one flanneld is stopped and datastore can't be reached" on Mar 22, 2017

tomdee commented Nov 3, 2017

The vxlan code was significantly changed in the last couple of releases, so I don't think this is still a problem.

@tomdee closed this as completed on Nov 3, 2017

Dieken commented Nov 5, 2017

@tomdee

Thank you very much!!! That's so awesome!!! I just verified: flanneld now injects permanent ARP table entries for the pod subnets of the other nodes, so flanneld exiting no longer affects communication among pods.

[image: screenshot]

I have 8 nodes; the picture was captured on a node with pod subnet 172.29.2.0/24.

root@k8s-dev-a04:~# uname -a
Linux k8s-dev-a04 4.4.0-98-generic #121-Ubuntu SMP Tue Oct 10 14:24:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@k8s-dev-a04:~# lsb_release -a
LSB Version:	core-9.20160110ubuntu0.2-amd64:core-9.20160110ubuntu0.2-noarch:security-9.20160110ubuntu0.2-amd64:security-9.20160110ubuntu0.2-noarch
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.3 LTS
Release:	16.04
Codename:	xenial
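
A check like the one described in this comment can also be scripted. The Python sketch below scans the text printed by `ip neigh show dev flannel.1` and reports entries that are not in the PERMANENT state. The function name `non_permanent_neighbors` is made up for illustration, and the parsing assumes that the neighbor state is the last token on each line, which may not hold for every iproute2 version or entry type.

```python
def non_permanent_neighbors(ip_neigh_output):
    """Return (ip, state) pairs for neighbor entries that are not PERMANENT.

    `ip_neigh_output` is the text printed by `ip neigh show dev flannel.1`.
    Assumption: the first token of a line is the IP address and the last
    token is the neighbor state (PERMANENT, REACHABLE, STALE, FAILED, ...).
    """
    result = []
    for line in ip_neigh_output.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        ip, state = parts[0], parts[-1]
        if state != "PERMANENT":
            result.append((ip, state))
    return result
```

An empty result means every listed entry is permanent, which matches the behaviour reported here. The input text could be captured with, for example, `subprocess.run(["ip", "neigh", "show", "dev", "flannel.1"], capture_output=True, text=True, check=True).stdout`.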
