This repository has been archived by the owner on Mar 28, 2020. It is now read-only.

Reverse DNS look-ups are inconsistent #2160

Open
bennycooly opened this issue Feb 14, 2020 · 1 comment

Comments

@bennycooly

I have deployed etcd-operator with helm and have the following cluster spec:

apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: coredns-etcd-cluster
spec:
  size: 3

Based on my understanding of the documentation here, etcd with TLS enabled does a reverse lookup on the IP address of the connecting etcd pod to check whether the incoming request is valid. However, when I run nslookup <PEER_IP_ADDR> from an etcd pod, I get inconsistent results:

/ # nslookup 10.11.3.99
nslookup: can't resolve '(null)': Name does not resolve

Name:      10.11.3.99
Address 1: 10.11.3.99 10-11-3-99.coredns-etcd-cluster-client.dns.svc.cluster.local
/ # nslookup 10.11.3.99
nslookup: can't resolve '(null)': Name does not resolve

Name:      10.11.3.99
Address 1: 10.11.3.99 coredns-etcd-cluster-t9rjxhtc96.coredns-etcd-cluster.dns.svc.cluster.local

Half the time, the reverse lookup returns the incorrect client service DNS name of the form pod-ip.coredns-etcd-cluster-client.*. This causes peer TLS communication to fail, since that name is not of the form *.coredns-etcd-cluster.*.
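To make the failure mode concrete, here is a minimal sketch of a leftmost-label wildcard match applied to the two names returned above. It is a simplified model, not etcd's actual verification code, and the pattern `*.coredns-etcd-cluster.dns.svc.cluster.local` is an assumption modelled on the DNSNames shown in the Cilium log line further down; the function name `matches_wildcard` is hypothetical.

```python
def matches_wildcard(hostname: str, pattern: str) -> bool:
    """Return True if hostname matches pattern, where only the leftmost
    label of pattern may be the wildcard '*' (covering exactly one label).

    Simplified model for illustration; not etcd's real TLS check.
    """
    host_labels = hostname.rstrip(".").lower().split(".")
    pattern_labels = pattern.rstrip(".").lower().split(".")
    # A single-label wildcard never changes the number of labels.
    if len(host_labels) != len(pattern_labels):
        return False
    for index, (host, pat) in enumerate(zip(host_labels, pattern_labels)):
        if index == 0 and pat == "*":
            continue  # wildcard accepts any single leftmost label
        if host != pat:
            return False
    return True


# Assumed SAN pattern (not taken from the issue's certificate).
PEER_PATTERN = "*.coredns-etcd-cluster.dns.svc.cluster.local"

# The two names returned by the reverse lookups in the transcript above.
peer_name = "coredns-etcd-cluster-t9rjxhtc96.coredns-etcd-cluster.dns.svc.cluster.local"
client_name = "10-11-3-99.coredns-etcd-cluster-client.dns.svc.cluster.local"

if __name__ == "__main__":
    print(matches_wildcard(peer_name, PEER_PATTERN))
    print(matches_wildcard(client_name, PEER_PATTERN))
```

Under that assumed pattern, the peer service name matches and the client service name does not, because its second label is `coredns-etcd-cluster-client` rather than `coredns-etcd-cluster`.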

I first discovered this on a newly created k8s cluster (v1.17.2) when trying to deploy Cilium with the managed etcd. Cilium internally uses the etcd-operator to create its etcd cluster, and I saw the etcd pod logs flooded with these messages:

2020-02-14 03:29:44.693313 I | embed: rejected connection from "10.11.4.148:53696" (error "tls: \"10.11.4.148\" does not match any of DNSNames [\"*.cilium-etcd.kube-system.svc\" \"*.cilium-etcd.kube-system.svc.cluster.local\"]", ServerName "cilium-etcd-8svdg9rhbc.cilium-etcd.kube-system.svc", IPAddresses [], DNSNames ["*.cilium-etcd.kube-system.svc" "*.cilium-etcd.kube-system.svc.cluster.local"])

So I created my own etcd operator deployment and validated that from one etcd pod, a reverse lookup for the IP address of a peer etcd pod will return different values.

The only time that the reverse DNS lookup is consistent is when the pod is looking up its own DNS name since it is written into /etc/hosts.
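For anyone trying to reproduce the inconsistency, a rough sketch like the following could tally the names returned by repeated reverse lookups of one peer IP. It uses Python's standard `socket.gethostbyaddr`; the helper name `tally_reverse_lookups`, the attempt count, and the `"<lookup failed>"` placeholder are hypothetical choices for illustration. It goes through the system resolver, which typically consults /etc/hosts first, so a pod's own IP is expected to give a stable answer.

```python
import socket
from collections import Counter


def tally_reverse_lookups(ip, attempts=20, resolver=socket.gethostbyaddr):
    """Reverse-resolve ip `attempts` times and count each returned name.

    `resolver` is injectable so the function can be exercised without
    network access; by default it goes through the system resolver.
    """
    counts = Counter()
    for _ in range(attempts):
        try:
            hostname, _aliases, _addresses = resolver(ip)
        except OSError:
            # socket.herror and socket.gaierror are both OSError subclasses.
            hostname = "<lookup failed>"
        counts[hostname] += 1
    return counts


if __name__ == "__main__":
    # 10.11.3.99 is the peer IP from the transcript above.
    for name, count in tally_reverse_lookups("10.11.3.99").most_common():
        print(f"{count:3d}  {name}")
```

If more than one name shows up with a non-trivial count, the lookups are inconsistent in the way described above. Local caching in the resolver path may hide the effect, so this is only a starting point.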

Can somebody please help investigate to see if they can replicate this and if this issue lies with how etcd-operator is creating the etcd pods?

@bmcustodio
Contributor

I am also facing this issue (although I always get the same, but "wrong", result for reverse lookups). I have another cluster running an older version of CoreDNS (1.3.1) where this issue never happened, so I thought it might be related to the version of CoreDNS in use (1.6.6). It seems that CoreDNS >= 1.6.0 exhibits this behaviour, while CoreDNS <= 1.5.2 does not:

CoreDNS 1.5.2:

/ # host 10.150.27.157
157.27.150.10.in-addr.arpa domain name pointer cilium-etcd-fsbmhdkzgk.cilium-etcd.cilium.svc.cluster.local.

CoreDNS 1.6.0:

/ # host 10.150.27.157
157.27.150.10.in-addr.arpa domain name pointer 10-150-27-157.cilium-etcd-client.cilium.svc.cluster.local.
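As a reading aid for the two outputs above, this short sketch shows how the PTR query name and the dashed-IP label relate to the pod IP. It uses the standard library `ipaddress` module; the helper names are hypothetical, and the dashed form is simply inferred from the output shown, not from CoreDNS source.

```python
import ipaddress


def ptr_query_name(ip: str) -> str:
    """Return the in-addr.arpa (or ip6.arpa) name queried for a PTR lookup."""
    return ipaddress.ip_address(ip).reverse_pointer


def dashed_ip_label(ip: str) -> str:
    """Return the dashed form of an IPv4 address, as seen in the
    client-service name in the CoreDNS 1.6.0 output above."""
    return ip.replace(".", "-")


if __name__ == "__main__":
    print(ptr_query_name("10.150.27.157"))   # 157.27.150.10.in-addr.arpa
    print(dashed_ip_label("10.150.27.157"))  # 10-150-27-157
```

Both CoreDNS versions answer the same PTR query name; what differs is which record they return for it.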
