can not access pod from pod on different node #1098

Closed
lvthillo opened this issue Feb 6, 2019 · 8 comments
lvthillo commented Feb 6, 2019

Expected Behavior

I have installed Kubernetes on Ubuntu 18. With Calico the pod network works, but I want to use flannel.
With flannel I am not able to ping or curl a pod on a different node. Reaching a pod on the same node works, as shown here:

kubectl exec busybox -- curl 10.244.2.4
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   612  100   612    0     0   357k      0 --:--:-- --:--:-- --:--:--  597k
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Version:

Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:54:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.5", GitCommit:"51dd616cdd25d6ee22c83a858773b607328a18ec", GitTreeState:"clean", BuildDate:"2019-01-16T18:14:49Z", GoVersion:"go1.10.7", Compiler:"gc", Platform:"linux/amd64"}
$ uname -r
4.15.0-43-generic

Current Behavior

Cannot access a pod on a different node:

kubectl exec busybox -- curl 10.244.1.4
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 10.244.1.4 port 80: No route to host
command terminated with exit code 7

Steps to Reproduce (for bugs)

Installed flannel
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

The cluster is running fine:

kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health": "true"}
$ kubectl get pods -n kube-system
NAME                            READY   STATUS    RESTARTS   AGE
coredns-576cbf47c7-q7ncm        1/1     Running   1          30m
coredns-576cbf47c7-tclp8        1/1     Running   1          30m
etcd-kube1                      1/1     Running   1          30m
kube-apiserver-kube1            1/1     Running   1          30m
kube-controller-manager-kube1   1/1     Running   1          30m
kube-flannel-ds-amd64-6vlkx     1/1     Running   1          30m
kube-flannel-ds-amd64-7twk8     1/1     Running   1          30m
kube-flannel-ds-amd64-rqzr7     1/1     Running   1          30m
kube-proxy-krfzk                1/1     Running   1          30m
kube-proxy-vrssw                1/1     Running   1          30m
kube-proxy-xlrgz                1/1     Running   1          30m
kube-scheduler-kube1            1/1     Running   1          30m

Cluster is created with: kubeadm init --apiserver-advertise-address=192.168.20.101 --pod-network-cidr=10.244.0.0/16

Firewall:

sudo iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION-STAGE-1
-N DOCKER-ISOLATION-STAGE-2
-N DOCKER-USER
-N KUBE-EXTERNAL-SERVICES
-N KUBE-FIREWALL
-N KUBE-FORWARD
-N KUBE-SERVICES
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -s 10.244.0.0/16 -j ACCEPT
-A FORWARD -d 10.244.0.0/16 -j ACCEPT
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -s 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -d 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

I tried to add iptables --policy FORWARD ACCEPT on every node but it didn't help.

additional info:

$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1


$ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
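With forwarding enabled and the FORWARD chain accepting traffic, "No route to host" for a remote pod usually points at the overlay rather than the firewall. One further check is whether the node running busybox has a route for the remote pod subnet through flannel's VXLAN device. The sketch below assumes the default VXLAN backend from kube-flannel.yml and the usual device name flannel.1; the helper name has_overlay_route is made up for illustration.

```shell
# Reads `ip route` output on stdin. Succeeds if the given pod subnet is
# routed through the given device (default: flannel.1, an assumption that
# holds for flannel's VXLAN backend with VNI 1).
has_overlay_route() {
  subnet=$1
  dev=${2:-flannel.1}
  awk -v subnet="$subnet" -v dev="$dev" '
    $1 == subnet {
      for (i = 2; i < NF; i++)
        if ($i == "dev" && $(i + 1) == dev) found = 1
    }
    END { exit !found }'
}
```

On the node hosting busybox this could be run as `ip route | has_overlay_route 10.244.1.0/24 && echo "route present"`. A present route only shows that flannel programmed the subnet; it does not prove that the tunnel endpoint on the other node is reachable.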
@PatrickKutch

I have the exact same issue, though I'm running 1.13.0. Calico works fine, but I wanted to use flannel. I'm just trying a simple pod-to-pod ping, with no luck.

lvthillo commented Feb 8, 2019

@PatrickKutch Are you also using Ubuntu 18? I managed to make it work by adding the network interface to the kube-flannel.yml file:

$ curl -o kube-flannel.yml https://raw.githubusercontent.com/coreos/flannel/v0.11.0/Documentation/kube-flannel.yml
$ sed -i.bak 's|        - --ip-masq|        - --ip-masq\n        - --iface=enp0s8|' kube-flannel.yml
$ kubectl create -f kube-flannel.yml

The sed command is adding --iface=enp0s8 to the flannel args:

...
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --iface=enp0s8
        - --kube-subnet-mgr
...

Hope this solves it for you too.
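A likely reason the flag helps: without --iface, flanneld uses the interface that carries the default route. If that interface has the same address on every node (common with a NAT adapter on local VMs, which is an assumption here, the thread does not say), every node registers the same tunnel endpoint. flannel records the address it picked in the node annotation flannel.alpha.coreos.com/public-ip, so a rough check could look like the sketch below. Both function names are hypothetical.

```shell
# Print "<node> <public-ip>" pairs as recorded by flannel in node annotations.
list_flannel_public_ips() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.metadata.annotations.flannel\.alpha\.coreos\.com/public-ip}{"\n"}{end}'
}

# Read those pairs on stdin and print every IP claimed by more than one node.
find_duplicate_public_ips() {
  awk 'NF >= 2 { count[$2]++ }
       END { for (ip in count) if (count[ip] > 1) print ip }'
}
```

Running `list_flannel_public_ips | find_duplicate_public_ips` prints nothing when each node advertises a distinct address. Any address it does print is shared by several nodes, which would explain why cross-node traffic fails.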

@PatrickKutch

I am on Ubuntu 18. I am glad you got it working for yourself, however since I've many nodes in my cluster, with different networking devices on them, I don't see how specifying a device will work for me :-(

lvthillo commented Feb 8, 2019

@PatrickKutch Aren't they all Ubuntu 18 nodes with the enp0s8 interface? ("the eth0 of Ubuntu 18").

@PatrickKutch

Nope. None of my 20 boxes have that. It depends on which slot the NIC sits in and some other stuff.
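For clusters where interface names differ per node, flanneld also accepts --iface-regex, which is matched against interface names and IP addresses. If all nodes share a cluster subnet, a regex on the address may work where a fixed name cannot. The sketch below inserts an extra argument after --ip-masq in a manifest read from stdin, mirroring the sed edit above. The helper name add_flannel_arg is hypothetical, and the 192.168.20. prefix in the usage note is only borrowed from the kubeadm advertise address earlier in the thread.

```shell
# Read a kube-flannel.yml manifest on stdin and write it to stdout with one
# extra flanneld argument inserted right after the "- --ip-masq" line,
# using the same indentation. The argument is passed through the
# environment so that backslashes in a regex survive unchanged.
add_flannel_arg() {
  FLANNEL_ARG=$1 awk '
    { print }
    /^[[:space:]]*- --ip-masq[[:space:]]*$/ {
      indent = $0
      sub(/-.*/, "", indent)          # keep only the leading whitespace
      print indent "- " ENVIRON["FLANNEL_ARG"]
    }'
}
```

Usage might be `add_flannel_arg '--iface-regex=^192\.168\.20\.' < kube-flannel.yml > kube-flannel-regex.yml`, followed by a review of the generated file before applying it.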

IAXES commented May 7, 2019

Good day. Posting in case this helps others. From #122:

sudo systemctl stop docker
sudo iptables -t nat -F
sudo ip link del docker0
sudo systemctl start docker

I put this into a script I use to re-deploy the cluster (i.e. kubeadm reset, kubeadm init ..., etc.), and it resolved two issues I was encountering:

  • Can't curl pods on different nodes.
  • Cluster DNS running but not functioning (i.e. couldn't resolve kubernetes.default or any other DNS names on the cluster network).

Cheers!
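Since these commands flush the whole nat table and delete the docker0 bridge, it can help to preview them before wiring them into a redeploy script. A minimal sketch with a dry-run switch follows; the DRY_RUN variable and both function names are made up for illustration and are not part of any tool mentioned in this thread.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# The four steps from above, in the same order.
reset_docker_networking() {
  run sudo systemctl stop docker
  run sudo iptables -t nat -F      # flushes ALL nat rules, not only Docker's
  run sudo ip link del docker0
  run sudo systemctl start docker
}
```

Setting `DRY_RUN=1` and calling `reset_docker_networking` prints the four commands without executing them; unset it to run them for real on each node.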

alun commented Feb 4, 2022

I'm having a very similar problem with two Ubuntu 20.04 nodes: one master, one worker.
Kubernetes v1.23.3

kubectl logs -n kube-system kube-flannel-ds-...

doesn't show any errors.

Using VXLAN backends - VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false

nmap -sU -p 8473 ...

shows ports open on both public IPs.

One of my nodes is behind NAT - but I've made a few steps to make flannel aware of it and use public IP interfaces on both of them.

Nodes are in different hosting providers.

I know VXLAN is probably not recommended in this setup since the encapsulated traffic is UDP and not encrypted.

But still, should it work?

I'm probably going to try other backends and see if they work.

None of the advice in this thread or in the troubleshooting docs has helped in my case so far.
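One detail worth checking in a setup like this: Port=0 in the log line above means flannel leaves the port choice to the kernel, and on Linux the kernel default for VXLAN is UDP 8472. The port that has to be reachable between the nodes is the one the flannel.1 device actually reports, so it is worth comparing that value with the port being probed. A small sketch that extracts it (the function name vxlan_dstport is hypothetical):

```shell
# Read `ip -d link show flannel.1` output on stdin and print the VXLAN
# destination port the kernel device is using.
vxlan_dstport() {
  sed -n 's/.*[[:space:]]dstport \([0-9][0-9]*\).*/\1/p'
}
```

On each node this could be run as `ip -d link show flannel.1 | vxlan_dstport`. The same `ip -d link` output also shows the `local` address and underlying `dev` that flannel chose, which helps confirm that the public interface is really in use on the node behind NAT.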

stale bot commented Jan 25, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jan 25, 2023
@stale stale bot closed this as completed Feb 15, 2023