coredns CrashLoopBackOff due to dnsmasq #1292
Please provide logs from the CoreDNS pods. Google reveals numerous reports that this is problematic on Ubuntu; in 1.12.x you can also try kube-dns instead.
@aravind-murthy Do you happen to run NetworkManager or dnsmasq on your machine?
Hi @bart0sh - please advise how I can check for NM and dnsmasq? This is a brand-new VM, so I haven't configured anything explicitly (as far as I am aware) unless it already comes enabled by default in the OS. However, I can check and report back.
@aravind-murthy Please check whether you have /etc/NetworkManager/NetworkManager.conf and what is mentioned there as dns, e.g. dns=dnsmasq. You can also check whether a dnsmasq process shows up in the 'ps aux' output and/or look at its service status: 'sudo systemctl status dnsmasq'.
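The checks above can be sketched as a short script. The config sample below is hypothetical and written to a temp file so the snippet is safe to run anywhere; the real target is the stock Ubuntu path.

```shell
# Hypothetical sample of a stock Ubuntu 16.04 NetworkManager.conf,
# written to a temp file for a safe demonstration.
nm_conf=$(mktemp)
cat > "$nm_conf" <<'EOF'
[main]
plugins=ifupdown,keyfile
dns=dnsmasq
EOF

# A dns=dnsmasq line means NetworkManager spawns its own dnsmasq
# instance, which listens on a loopback address.
grep -E '^dns=' "$nm_conf"
```

On a real machine, run the same grep against /etc/NetworkManager/NetworkManager.conf, then check for a running process with ps aux | grep '[d]nsmasq' and the service with sudo systemctl status dnsmasq.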
^ you need to add …

once the container enters …
@aravind-murthy that's interesting. The service is enabled in the network manager, but not running. Please comment out the 'dns=dnsmasq' line in the config and restart network manager: sudo systemctl restart network-manager. Then restore the original name servers in /etc/resolv.conf (they're probably commented out there). That should help the core-dns pods when they're restarted by kubelet.
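A sketch of that edit, applied to a temp copy so it can be run without root; on the real machine you would edit /etc/NetworkManager/NetworkManager.conf as root and then restart NetworkManager.

```shell
# Temp stand-in for /etc/NetworkManager/NetworkManager.conf.
nm_conf=$(mktemp)
cat > "$nm_conf" <<'EOF'
[main]
plugins=ifupdown,keyfile
dns=dnsmasq
EOF

# Comment the line out rather than deleting it, so it is easy to revert.
sed -i 's/^dns=dnsmasq/#dns=dnsmasq/' "$nm_conf"
grep dnsmasq "$nm_conf"
```

On the real file, follow up with sudo systemctl restart network-manager and restore the original name servers in /etc/resolv.conf.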
I was able to reproduce this on my machine with kubeadm 1.12.3:
I haven't commented anything out in this file (/etc/resolv.conf). It currently has the following entries:
@aravind-murthy can you comment out the line with the dnsmasq address, nameserver 127.0.1.1?
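The suggested edit, sketched against a temp stand-in for /etc/resolv.conf so it can be demonstrated safely:

```shell
# Temp stand-in for /etc/resolv.conf containing the dnsmasq loopback entry.
resolv=$(mktemp)
cat > "$resolv" <<'EOF'
nameserver 127.0.1.1
EOF

# Comment out the loopback nameserver that points DNS queries back at
# the local dnsmasq, creating the forwarding loop.
sed -i 's/^nameserver 127\.0\.1\.1/# nameserver 127.0.1.1/' "$resolv"
cat "$resolv"
```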
@neolit123 my case is different: the core-dns pods are in Pending state and their logs are empty.
Yes, this works, as I noted in the "Anything else we need to know?" section, but it looks like a hacky solution.
You can also try kube-dns, but it seems to me like Ubuntu is doing things wrong.
Sadly, there can be more than one reason for a pod's …
@aravind-murthy Can you check whether the 'nameserver 127.0.1.1' line is still commented out in your /etc/resolv.conf? It could have been rewritten, and that can cause a DNS loop and a core-dns crash.
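A quick way to confirm the crash really is the forwarding loop is to grep the pod logs for the loop plugin. The log line below is an abridged, hypothetical example stored in a temp file so the grep can be demonstrated.

```shell
# Abridged, hypothetical example of the line coredns's loop plugin
# prints before the pod exits.
log=$(mktemp)
cat > "$log" <<'EOF'
[FATAL] plugin/loop: Loop (127.0.0.1:...) detected for zone "."
EOF

# Count matching lines; a non-zero count means the loop was detected.
grep -c 'plugin/loop' "$log"
```

On the cluster you would run something like kubectl -n kube-system logs -l k8s-app=kube-dns | grep plugin/loop (k8s-app=kube-dns is the label kubeadm puts on the CoreDNS pods; adjust if yours differs).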
@chrisohaver what is the official way of solving the loop issues that are present in stock Ubuntu 16.04 that we should include in our TS guide?
@aravind-murthy in your setup instructions you mentioned that you did the following:
Before installing flannel via …
@neolit123, IMO the best way to fix the DNS self-forwarding loop issue is to fix the underlying deployment issue in kubelet: kubelet is sending invalid upstream servers to Pods with dnsPolicy="default". In effect, make kubelet use a copy of the resolv.conf that holds the real upstream servers. The tricky bit is locating where the "real" resolv.conf lives on any given distro.
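One way to apply that idea, sketched below, is kubelet's --resolv-conf flag (a real kubelet flag) set through a systemd drop-in. The file paths are assumptions for illustration: /run/systemd/resolve/resolv.conf is where systemd-resolved keeps the real upstream servers, which may not apply on a NetworkManager/dnsmasq host.

```shell
# Sketch: a systemd drop-in pointing kubelet at a resolv.conf with the
# real upstream servers instead of the loopback stub. Written to a temp
# file here; the assumed real location would be something like
# /etc/systemd/system/kubelet.service.d/20-resolv.conf.
dropin=$(mktemp)
cat > "$dropin" <<'EOF'
[Service]
Environment="KUBELET_EXTRA_ARGS=--resolv-conf=/run/systemd/resolve/resolv.conf"
EOF
grep -- '--resolv-conf' "$dropin"
```

On a real node this would be followed by sudo systemctl daemon-reload && sudo systemctl restart kubelet.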
A nuclear option would be to manually create a custom resolv.conf.
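A minimal sketch of that nuclear option: hand-write a resolv.conf containing only non-loopback upstreams. The temp file stands in for the real target, and 8.8.8.8 is just an example server.

```shell
# Temp stand-in for a hand-written resolv.conf.
resolv=$(mktemp)
cat > "$resolv" <<'EOF'
nameserver 8.8.8.8
EOF

# Sanity check: no loopback nameservers should remain.
if ! grep -q -E '^nameserver 127\.' "$resolv"; then
  echo "no loopback nameservers"
fi
```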
Thanks a lot for the details, @chrisohaver! We need to add this info to the troubleshooting guide. Page to update: …
I did not, thanks for pointing that out. I will retry on a clean VM once more and advise shortly. |
@neolit123 in this particular case we need to figure out how to disable dnsmasq completely. It looks like disabling it in the network manager is not enough: the dhcp client or some other piece of software puts 'nameserver 127.0.1.1' back into /etc/resolv.conf, which triggers loop detection in core-dns.
I'm realizing that we automate what is passed to the kubelet depending on systemd-resolved, and that's problematic. But instructing the user to write a custom resolv.conf is still an option too, I guess.
If we use the default resolv.conf, we need to disable the automatic addition of these loopback nameservers. So the steps are: …
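One way to produce such a custom resolv.conf, sketched with illustrative temp files in place of the real paths: keep every entry except the loopback nameservers.

```shell
# Temp stand-ins for the distro-generated resolv.conf (src) and the
# custom, kubelet-friendly copy (dst).
src=$(mktemp); dst=$(mktemp)
cat > "$src" <<'EOF'
nameserver 127.0.1.1
nameserver 192.168.1.1
EOF

# Drop loopback nameservers, keep the rest.
grep -v -E '^nameserver 127\.' "$src" > "$dst"
cat "$dst"
```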
I've proposed a PR that generalizes the loop troubleshooting docs in the coredns loop plugin readme, so it more clearly applies to any kind of local DNS caching server, not just systemd-resolved. coredns/coredns#2363 |
@aravind-murthy have you retried this?
@alejandrox1 Not yet. Please give me some time: I need the k8s cluster up and running to test the jenkins-k8s-kubectl plugin (an unrelated proof of concept). I will report back when I can.
Looks like we already have an entry for this in the TS guide here: …
@aravind-murthy please feel free to report your findings, still.
Hi @alejandrox1, @neolit123. I don't know whether it's the combination of the new Kubernetes v1.13 (when I raised this ticket I used v1.12, because that was the version available at the time) or me following the instructions properly this time, but on Ubuntu 16.04.5, installing the latest Kubernetes v1.13, following https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#tabs-pod-install-4 (i.e. Flannel), and then setting net.bridge.bridge-nf-call-iptables to 1 worked. No more CrashLoopBackOff errors for coredns! :-) :-) Thank you so much.
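For reference, a sketch of that sysctl step. The live command needs root on the node; the snippet writes the persistent fragment to a temp file so it can be demonstrated anywhere, and the real fragment path is only an example.

```shell
# On a real node (as root), the immediate setting would be:
#   sysctl net.bridge.bridge-nf-call-iptables=1
# To persist it across reboots, drop a fragment under /etc/sysctl.d/
# (e.g. /etc/sysctl.d/k8s.conf) and run `sysctl --system`.
# Temp stand-in for that fragment:
frag=$(mktemp)
echo 'net.bridge.bridge-nf-call-iptables = 1' > "$frag"
cat "$frag"
```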
Disabling …
I had this same issue after deleting the loop. Can someone help me with this?
kubectl logs coredns-fb8b8dccf-j6mjl -n kube-system
@Ramane19, your pods are stuck in "ContainerCreating", which is a different issue.
Before, it was showing Pending; after I deleted the loop, it showed ContainerCreating. Is there any other way I can resolve this issue?
I did; it talks about the looping issue, not the creation of the containers.
OK, then it seems to be unrelated to this issue, i.e. off topic.
This worked for me.
What keywords did you search in kubeadm issues before filing this one?
Ubuntu 16.04 coredns crashloopbackoff
Is this a BUG REPORT or FEATURE REQUEST?
BUG REPORT
Versions
kubeadm version (use kubeadm version):
Environment:
- Kubernetes version (use kubectl version):
- Cloud provider or hardware configuration: Local Virtual Machine, 2 CPU, 4 GB RAM
- OS (e.g. from /etc/os-release): Ubuntu 16.04
- Kernel (e.g. uname -a):
What happened?
coredns pods go into CrashLoopBackOff.
What you expected to happen?
I expected coredns pods to start properly
How to reproduce it (as minimally and precisely as possible)?
1. https://kubernetes.io/docs/setup/cri/#docker
2. https://kubernetes.io/docs/setup/independent/install-kubeadm/#installing-kubeadm-kubelet-and-kubectl
3. kubeadm init --pod-network-cidr=10.244.0.0/16
4. https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#pod-network, scroll down to the section "Installing a pod network add-on" and select the "Flannel" tab
root@k8s-master:~# kubectl get pods --all-namespaces
Anything else we need to know?
The "hack" solution mentioned in https://stackoverflow.com/a/53414041/5731350 works, but I am not comfortable disabling something (the loop detection) that is supposed to be working as intended.