Pod not able to connect to url login.microsoftonline.com:443 #3604
kind version: kind v0.22.0 go1.20.13 windows/amd64
docker version: 26.1.1
OS: Windows 11 23H2
Kubernetes version:

My application is attempting to do OAuth when trying to get secrets from Azure Key Vault on startup. However, it is not able to connect to the URL `login.microsoftonline.com:443`, so it crashes and goes into a `CrashLoopBackOff` loop. Eventually, it is able to connect after many `CrashLoopBackOff` attempts.

Comments
Most of the questions in the bug template are missing in your post, and you don't specify where / how you're executing nslookup, but DNS within the cluster is either CoreDNS (the cluster's internal nameserver) or the upstream resolver CoreDNS forwards to, which ultimately comes from your host environment.
Thank you for your response @BenTheElder. My fault if I didn't fill out all the information. The command I ran to get into a pod to troubleshoot was `kubectl run troubleshooting --rm -i --tty --image nicolaka/netshoot -- /bin/bash`. Once in the pod, I ran `nslookup login.microsoftonline.com`, which failed. I also tried `docker run --rm -it nicolaka/netshoot /bin/bash`; once in that container, I did the same lookup and it worked. I also exec'd into one of the node containers and ran:

apt update
apt install dnsutils
nslookup login.microsoftonline.com

Once again, inside the node itself I was able to resolve the name. It appears the problem only occurs when using the default name server for the pod.
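For reference, the contrast can be reproduced from inside the pod by querying the default resolver and an explicit upstream side by side (a minimal sketch; 9.9.9.9 is assumed here because it is the upstream server mentioned later in this thread, so substitute your own):

```bash
# Run from inside the netshoot pod started above.
nslookup login.microsoftonline.com           # uses the pod's default nameserver (cluster DNS)
nslookup login.microsoftonline.com 9.9.9.9   # queries the upstream server directly, bypassing cluster DNS
```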
This would need to be with `--net=kind`, so the container shares the cluster's docker network. The internal nameserver is CoreDNS with configurable search paths; if your host has a lot of search paths you might try this config option (the `dnsSearch` field, line 159 in 5d17676).
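For reference, a minimal sketch of using that option, assuming the field referenced above is `networking.dnsSearch` in the v1alpha4 kind config (check the linked source line for the exact name and placement in your kind version); an empty list avoids inheriting any search paths from the host:

```bash
# Hedged sketch: create a cluster with an empty DNS search list so host
# search paths are not propagated into the nodes.
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  dnsSearch: []
EOF
```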
I shouldn’t have to do anything special; the DNS resolvers shouldn't see "www" vs "login" differently.
> the DNS resolvers shouldn't see "www" vs "login" differently

Er, I'm not saying anything special about this particular domain; I'm saying your host environment may be resulting in a large number of search paths, which can create flakiness. You can also test this by querying the fully qualified name `login.microsoftonline.com.` (with a trailing dot).
I agree, it shouldn't.
It's a plain deployment with 1 control-plane and 5 or 6 workers; the only pod is the test pod created with the `kubectl run troubleshooting ... --image nicolaka/netshoot` command above. I haven't even added Secrets or ConfigMaps.
I believe the point wasn't about the kind pod configuration, but rather the host environment it is running in. So your dev machine, the network you are connected to, any proxy relays, etc.
Right, the behavior of the DNS is related to the host machine's network, the docker install, the resolver configuration on the host, etc. I've made some suggestions about how to test for and exclude some of these (namely the search paths). If I run the same pod in kind on my host, then resolving the domain works fine.
This command, `nslookup login.microsoftonline.com`, works on my machine both inside a docker container and outside of one, it works on VMs on my machine, and it works on multiple machines in my network. All machines use the same DNS server, which is 9.9.9.9. I also have a very basic home network. I even connected my computer straight to my cable modem, and also tethered my computer to my cell phone, all with the same behavior: it works just fine, except when using the pod's default DNS server. It even works in the K8s pod if I query 9.9.9.9 directly instead of the pod's default nameserver. So I really don’t get why you believe it’s my computer/network.
I appreciate your suggestions, but I'm not sure they worked. Your suggestion of `docker run -it --rm --net=kind nicolaka/netshoot bash`: the lookup I tried from that container failed as well. I also tried the `dnsSearch` config option; this also didn't work. Then again, I am not even sure I'm doing the `dnsSearch` configuration correctly. I also tried everything this morning on a different computer, all with the same result.
I also tried doing the same test with …
This is definitely a Kubernetes configuration issue and not anything kind is doing. What that configuration issue is, though, is still not clear. :) That is indeed not correct for that setting. Rather than running a container to test this, it would be good to run a pod within the context of the k8s cluster to make sure it's getting the same environment. This page has some useful tips for troubleshooting DNS issues: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
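For anyone following along, the relevant steps from that page boil down to roughly the following (a sketch based on the linked documentation; the manifest URL and pod name come from that page, not from kind):

```bash
# Deploy the dnsutils test pod from the Kubernetes DNS debugging guide,
# then exercise resolution from inside the cluster.
kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
kubectl exec -i -t dnsutils -- nslookup kubernetes.default
kubectl exec -i -t dnsutils -- cat /etc/resolv.conf
kubectl logs --namespace=kube-system -l k8s-app=kube-dns
```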
Because it doesn't replicate on my computer/network and it's a networking issue. It's not an inherent coreDNS behavior.
This isn't a valid `dnsSearch` setting; my suggestion was to configure an empty list instead, to avoid passing in any search paths from the host. DNS search is the list of suffixes attempted for domains that are not fully qualified. The better test mentioned here #3604 (comment) is to avoid search paths entirely by using a fully qualified domain name and see if that works (https://en.wikipedia.org/wiki/Fully_qualified_domain_name), so try `login.microsoftonline.com.` with the trailing dot. Also, if by any chance you're using alpine / musl libc for your application image, consider using a glibc-based base image (e.g. debian); there have historically been DNS issues with musl's resolver in Kubernetes clusters, which is not specific to kind.
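A minimal sketch of that test, assuming the dnsutils pod from the page linked above is running (the trailing dot marks the name as fully qualified, so no search suffixes are appended):

```bash
# Query the fully qualified name; the trailing dot prevents the resolver
# from trying the cluster search suffixes (default.svc.cluster.local, ...).
kubectl exec -i -t dnsutils -- nslookup login.microsoftonline.com.
```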
At first, I thought this was an issue on my end, but I believe the evidence is showing it's unlikely to be just a me issue, since I have been able to recreate it on multiple computers on my network and outside my network. With that being said, until the root cause is known, it could still be a configuration issue on my network.
So, I did everything in the post with no issues. Running `kubectl exec -ti dnsutils -- cat /etc/resolv.conf` gave me:

search default.svc.cluster.local svc.cluster.local cluster.local 9.9.9.9
nameserver 10.96.0.10
options ndots:5

I even ran `kubectl exec -i -t dnsutils -- nslookup login.microsoftonline.com` and it worked as expected. I will continue to do more debugging.
After looking over the post that @stmcginnis mentioned a second time, I thought I would look at the CoreDNS logs. I went into the pod as I did before with `kubectl run troubleshooting --rm -i --tty --image nicolaka/netshoot -- /bin/bash`, and then I went to view the logs with `kubectl logs --namespace=kube-system -l k8s-app=kube-dns`, which shows a DNS error, shown below:

.:53
[INFO] plugin/reload: Running configuration SHA512 = 591cf328cccc12bc490481273e738df59329c62c0b729d94e8b61db9961c2fa5f046dd37f1cf888b953814040d180f52594972691cd6ff41be96639138a43908
CoreDNS-1.11.1
linux/amd64, go1.20.7, ae2bbc2
[ERROR] plugin/errors: 2 login.microsoftonline.com. A: dns: overflow unpacking uint32
.:53
[INFO] plugin/reload: Running configuration SHA512 = 591cf328cccc12bc490481273e738df59329c62c0b729d94e8b61db9961c2fa5f046dd37f1cf888b953814040d180f52594972691cd6ff41be96639138a43908
CoreDNS-1.11.1
linux/amd64, go1.20.7, ae2bbc2

So now I have an error to work with!
coredns/coredns#3305 suggests that this is a bug in the upstream DNS server; see coredns/coredns#3305 (comment).
This seems to be somewhat environment-dependent, based on UDP packet size? In any case, we can track it here, but it doesn't look like there are good options for the kind project to directly affect this. We could force DNS over TCP, but that would be a breaking change; it looks like you could enable this as a workaround, as in coredns/coredns#3305 (comment).
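A hedged sketch of what that workaround could look like; the exact Corefile layout depends on your CoreDNS version, and this is a manual patch rather than anything kind configures for you:

```bash
# Edit the CoreDNS Corefile so upstream queries use TCP instead of UDP.
kubectl -n kube-system edit configmap coredns
# In the editor, change the forward stanza to something like:
#   forward . /etc/resolv.conf {
#       force_tcp
#   }
# Then restart CoreDNS so it picks up the change:
kubectl -n kube-system rollout restart deployment coredns
```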
Sometime this weekend, I may try the workaround mentioned above. Either way, in my opinion, …
I think there's a workaround available in https://github.com/coredns/coredns/pull/6277/commits, but it's not in a release consumable by kubeadm yet because of coredns/coredns#6661? If we can help sort that out, we can get it into a future Kubernetes release and then into kind.
I will check this out. I got caught up having to handle something this weekend, so I was not able to troubleshoot. I will look over what you sent.
If you can upload the pcap, that should be useful; it seems the issue is caused by a malformed answer, which should be visible in the pcap.
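One way to produce such a capture is to run tcpdump on a kind node, since CoreDNS's upstream queries leave through the node (a sketch; "kind-control-plane" is the default single-node name, so pick the right node from `kind get nodes` for this multi-node cluster):

```bash
# Capture DNS traffic on a kind node while reproducing the failure,
# then copy the pcap back to the host.
docker exec -it kind-control-plane bash -c \
  'apt update && apt install -y tcpdump && tcpdump -i any port 53 -w /tmp/dns.pcap'
# ... reproduce the failing lookup, then stop the capture with Ctrl-C ...
docker cp kind-control-plane:/tmp/dns.pcap ./dns.pcap
```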
@aojea I think CoreDNS already has a mitigation merged, but there is not a released container image, so kubeadm cannot upgrade; see the context above. When CoreDNS can release an image for the current tag, we need to get the image mirrored into registry.k8s.io, upgrade kubeadm, and then kind can ship a patched image. In the meantime your options are somewhat limited, unfortunately; you could patch the CoreDNS deployment to your own CoreDNS image with the new code, maybe.
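If you do go that route, a rough sketch of swapping in a custom image (the image reference below is a placeholder, not a published kind or CoreDNS artifact):

```bash
# Point the coredns deployment at a self-built image containing the fix.
# Replace the placeholder reference with an image you have built and pushed.
kubectl -n kube-system set image deployment/coredns coredns=registry.example.com/coredns:with-fix
kubectl -n kube-system rollout status deployment coredns
```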