
Fedora VM Drivers on minikube 1.33 cannot pull images (resolved.conf) #18705

Closed
llegolas opened this issue Apr 20, 2024 · 44 comments · Fixed by #18830
Labels
triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@llegolas

What Happened?

minikube fails to start the network CNI (resolved.conf)

Attach the log file

It seems to be a problem with systemd-resolved.service. Running

resolvectl query quay.io

times out with the following in the logs:

Apr 20 21:41:50 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 20 21:41:50 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 20 21:41:50 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
Apr 20 21:41:50 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 20 21:43:46 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 20 21:43:46 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 20 21:43:46 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 20 21:43:46 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
Apr 20 21:45:41 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 20 21:45:41 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 20 21:45:41 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 20 21:45:41 minikube systemd-resolved[202]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary

If I run

resolvectl query --validate=no quay.io

all is good.
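
For anyone hitting the same symptom, a quick way to confirm DNSSEC is the culprit is to check and toggle it per link with resolvectl (a diagnostic sketch; eth0 stands in for whichever link the VM uses):

# Show the DNSSEC setting resolved is actually using, globally and per link
resolvectl status | grep -i dnssec

# Temporarily turn validation off for one link and retry the query
sudo resolvectl dnssec eth0 no
resolvectl query quay.io    # succeeds if DNSSEC validation was the problem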

Operating System

Redhat/Fedora

Driver

KVM2

@llegolas llegolas changed the title minikibe 1.33 cannot pull images minikube 1.33 cannot pull images (resolved.conf) Apr 20, 2024
@cirix

cirix commented Apr 21, 2024

I confirm the same behavior on my MacBook M2 (2023). The minikube addons fail; ingress and ingress-dns also clearly state they are not supported (compiled image missing). I have submitted a fix for ingress-dns, but it was ignored.

@llegolas
Author

llegolas commented Apr 21, 2024

Not sure how /etc/systemd/resolved.conf with DNSSEC=yes crept in, but it is a glaring difference from the previous 1.32 version, and I'm unaware of any major distro shipping with DNSSEC on by default, as this is still marked as experimental in systemd upstream AFAIK.

@medyagh
Member

medyagh commented Apr 22, 2024

@llegolas can you paste the full logs? You can attach the file to this issue:

minikube logs --file=logs.txt

Does this only happen when you enable a specific addon, or only with quay.io images? Does minikube start work in the default case?

Also, can you try a different container runtime to see if that solves it?

minikube delete --all
minikube start --container-runtime=containerd

@medyagh
Member

medyagh commented Apr 22, 2024

btw I just tried this myself and it seems good for me with the QEMU driver on M1 (arm64):

$ minikube ssh
$ resolvectl query quay.io
quay.io: 2600:1f18:483:cf00:3b4b:fac4:4515:599b -- link: eth0
         2600:1f18:483:cf00:e91a:d8cb:1b23:812b -- link: eth0
         2600:1f18:483:cf02:3f6c:1ed8:799c:1a49 -- link: eth0
         2600:1f18:483:cf01:455b:c2ec:5fad:7f15 -- link: eth0
         2600:1f18:483:cf02:3b2:c8d9:2c85:b678 -- link: eth0
         2600:1f18:483:cf01:cd0c:370b:1c0c:7317 -- link: eth0
         3.231.43.138                          -- link: eth0
         3.230.69.80                           -- link: eth0
         54.243.3.120                          -- link: eth0
         3.228.243.121                         -- link: eth0
         3.220.0.108                           -- link: eth0
         44.212.212.54                         -- link: eth0

-- Information acquired via protocol DNS in 26.2ms.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: network
$ resolvectl query docker.io
docker.io: 2600:1f18:2148:bc00:a81:7e44:4669:3426 -- link: eth0
           2600:1f18:2148:bc01:1983:8fd2:2dfc:a04c -- link: eth0
           2600:1f18:2148:bc02:a090:6a5b:b2ff:3152 -- link: eth0
           54.156.140.159                      -- link: eth0
           52.44.227.212                       -- link: eth0
           44.221.37.199                       -- link: eth0

-- Information acquired via protocol DNS in 25.6ms.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: network

@medyagh
Member

medyagh commented Apr 22, 2024

btw I do not see DNSSEC=yes in my resolved.conf:

$ cat  /etc/systemd/resolved.conf
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it under the
#  terms of the GNU Lesser General Public License as published by the Free
#  Software Foundation; either version 2.1 of the License, or (at your option)
#  any later version.
#
# Entries in this file show the compile time defaults. Local configuration
# should be created by either modifying this file, or by creating "drop-ins" in
# the resolved.conf.d/ subdirectory. The latter is generally recommended.
# Defaults can be restored by simply deleting this file and all drop-ins.
#
# Use 'systemd-analyze cat-config systemd/resolved.conf' to display the full config.
#
# See resolved.conf(5) for details.

[Resolve]
# Some examples of DNS servers which may be used for DNS= and FallbackDNS=:
# Cloudflare: 1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflare-dns.com 2606:4700:4700::1111#cloudflare-dns.com 2606:4700:4700::1001#cloudflare-dns.com
# Google:     8.8.8.8#dns.google 8.8.4.4#dns.google 2001:4860:4860::8888#dns.google 2001:4860:4860::8844#dns.google
# Quad9:      9.9.9.9#dns.quad9.net 149.112.112.112#dns.quad9.net 2620:fe::fe#dns.quad9.net 2620:fe::9#dns.quad9.net
#DNS=
#FallbackDNS=1.1.1.1#cloudflare-dns.com 8.8.8.8#dns.google 1.0.0.1#cloudflare-dns.com 8.8.4.4#dns.google 2606:4700:4700::1111#cloudflare-dns.com 2001:4860:4860::8888#dns.google 2606:4700:4700::1001#cloudflare-dns.com 2001:4860:4860::8844#dns.google
#Domains=
#DNSSEC=allow-downgrade
#DNSOverTLS=opportunistic
#MulticastDNS=yes
#LLMNR=yes
#Cache=yes
#CacheFromLocalhost=no
#DNSStubListener=yes
#DNSStubListenerExtra=
#ReadEtcHosts=yes
#ResolveUnicastSingleLabel=no

cat /version.json
{"iso_version": "v1.33.0", "kicbase_version": "v0.0.43-1713236840-18649", "minikube_version": "v1.33.0", "commit": "4bd203f0c710e7fdd30539846cf2bc6624a2556d"}

@llegolas @cirix
do you mind running this command to verify the ISO version you are using?

cat /version.json

@llegolas
Author

llegolas commented Apr 22, 2024

#DNSSEC=allow-downgrade: you see, that is the assumed default, meaning it is on but "should" downgrade if no signature is provided. Well, for some reason it is not downgrading.
I've tested with both the containerd and crio runtimes, and it was the cilium CNI that was failing to start because of the DNSSEC problem. I've reverted back to 1.32 as I had something urgent to test. Later tonight I'll provide the logs when I upgrade to 1.33 again.
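
One way to check what the VM's upstream forwarder actually returns for signed zones is to query it directly (a diagnostic sketch; 192.168.122.1 is the libvirt default-network address that shows up later in the logs, and dig may have to run on the host if it is not present in the guest image):

# Ask the libvirt dnsmasq forwarder for quay.io with DNSSEC records requested
dig +dnssec quay.io @192.168.122.1

# Compare with a resolver known to pass DNSSEC data through
dig +dnssec quay.io @8.8.8.8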

@cirix

cirix commented Apr 22, 2024

I had to work on something and downgraded. In my case, in the attached image you will find the bootstrap on my macOS, plus the logs.
@medyagh here you go. You also have a pending action on my fix for the ingress-dns upgrade; it has been waiting for more than 2 months.

Screenshot 2024-04-22 at 21 30 35
logs.txt

Screenshot 2024-04-22 at 21 43 33

@llegolas
Author

here is the whole session:

$ minikube delete
🔥  Deleting "minikube" in kvm2 ...
💀  Removed all traces of the "minikube" cluster.
$ sudo dnf remove minikube
$ sudo dnf install ~/Downloads/minikube-1.33.0-0.x86_64.rpm
$ minikube start --cni=cilium  --container-runtime=containerd --cpus=4 --memory=16g --addons=ingress-dns,ingress --driver=kvm2
😄  minikube v1.33.0 on Fedora 39
✨  Using the kvm2 driver based on user configuration
💿  Downloading VM boot image ...
    > minikube-v1.33.0-amd64.iso....:  65 B / 65 B [---------] 100.00% ? p/s 0s
    > minikube-v1.33.0-amd64.iso:  314.16 MiB / 314.16 MiB  100.00% 7.02 MiB p/
👍  Starting "minikube" primary control-plane node in "minikube" cluster
💾  Downloading Kubernetes v1.30.0 preload ...
    > preloaded-images-k8s-v18-v1...:  375.69 MiB / 375.69 MiB  100.00% 5.97 Mi
🔥  Creating kvm2 VM (CPUs=4, Memory=16384MB, Disk=20000MB) ...
❗  This VM is having trouble accessing https://registry.k8s.io
💡  To pull new external images, you may need to configure a proxy: https://minikube.sigs.k8s.io/docs/reference/networking/proxy/
📦  Preparing Kubernetes v1.30.0 on containerd 1.7.15 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring Cilium (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/minikube-ingress-dns:0.0.2
    ▪ Using image registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.4.0
    ▪ Using image registry.k8s.io/ingress-nginx/controller:v1.10.0
    ▪ Using image registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.4.0
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🔎  Verifying ingress addon...
❗  Enabling 'ingress' returned an error: running callbacks: [waiting for app.kubernetes.io/name=ingress-nginx pods: context deadline exceeded]
🌟  Enabled addons: ingress-dns, storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
$ k get pods -A
NAMESPACE       NAME                                       READY   STATUS                  RESTARTS   AGE
ingress-nginx   ingress-nginx-admission-create-cp9td       0/1     Pending                 0          7m4s
ingress-nginx   ingress-nginx-admission-patch-4fcb8        0/1     Pending                 0          7m4s
ingress-nginx   ingress-nginx-controller-84df5799c-2mzqm   0/1     Pending                 0          7m4s
kube-system     cilium-operator-86f4c5579c-xxsrn           0/1     ImagePullBackOff        0          7m4s
kube-system     cilium-s7nlr                               0/1     Init:ImagePullBackOff   0          7m4s
kube-system     coredns-7db6d8ff4d-jql96                   0/1     Pending                 0          7m4s
kube-system     etcd-minikube                              1/1     Running                 0          7m17s
kube-system     kube-apiserver-minikube                    1/1     Running                 0          7m19s
kube-system     kube-controller-manager-minikube           1/1     Running                 0          7m17s
kube-system     kube-ingress-dns-minikube                  0/1     Pending                 0          7m14s
kube-system     kube-proxy-m2l8p                           1/1     Running                 0          7m4s
kube-system     kube-scheduler-minikube                    1/1     Running                 0          7m17s
kube-system     storage-provisioner                        0/1     Pending                 0          7m14s
$ minikube ssh
                         _             _            
            _         _ ( )           ( )           
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __  
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)

$ sudo -i
# cat /version.json 
{"iso_version": "v1.33.0", "kicbase_version": "v0.0.43-1713236840-18649", "minikube_version": "v1.33.0", "commit": "4bd203f0c710e7fdd30539846cf2bc6624a2556d"}
# systemctl status systemd-resolved.service
● systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/usr/lib/systemd/system/systemd-resolved.service; enabled; preset: enabled)
     Active: active (running) since Mon 2024-04-22 19:27:01 UTC; 11min ago
       Docs: man:systemd-resolved.service(8)
             man:org.freedesktop.resolve1(5)
             https://www.freedesktop.org/wiki/Software/systemd/writing-network-configuration-managers
             https://www.freedesktop.org/wiki/Software/systemd/writing-resolver-clients
   Main PID: 191 (systemd-resolve)
     Status: "Processing requests..."
      Tasks: 1 (limit: 18837)
     Memory: 1.4M
        CPU: 344ms
     CGroup: /system.slice/systemd-resolved.service
             └─191 /usr/lib/systemd/systemd-resolved

Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question io IN SOA: failed-auxiliary
Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question io IN SOA: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary

[logs.txt](https://github.com/kubernetes/minikube/files/15067876/logs.txt)

Logs are attached.

@llegolas
Author

the full systemd-resolved.service logs:

# journalctl -u  systemd-resolved.service|tee
Apr 22 19:27:01 minikube systemd[1]: Starting Network Name Resolution...
Apr 22 19:27:01 minikube systemd-resolved[191]: Positive Trust Anchors:
Apr 22 19:27:01 minikube systemd-resolved[191]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Apr 22 19:27:01 minikube systemd-resolved[191]: Negative trust anchors: home.arpa 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.172.in-addr.arpa 27.172.in-addr.arpa 28.172.in-addr.arpa 29.172.in-addr.arpa 30.172.in-addr.arpa 31.172.in-addr.arpa 168.192.in-addr.arpa d.f.ip6.arpa corp home internal intranet lan local private test
Apr 22 19:27:01 minikube systemd-resolved[191]: Using system hostname 'minikube'.
Apr 22 19:27:01 minikube systemd[1]: Started Network Name Resolution.
Apr 22 19:27:04 minikube systemd-resolved[191]: Switching to fallback DNS server 1.1.1.1#cloudflare-dns.com.
Apr 22 19:27:04 minikube systemd-resolved[191]: Using degraded feature set UDP+EDNS0+DO instead of TLS+EDNS0+DO for DNS server 192.168.122.1.
Apr 22 19:27:21 minikube systemd-resolved[191]: Failed to send hostname reply: Transport endpoint is not connected
Apr 22 19:30:12 minikube systemd-resolved[191]: DNSSEC validation failed for question io IN SOA: failed-auxiliary
Apr 22 19:30:12 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 22 19:30:12 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 22 19:30:12 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 22 19:30:12 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
Apr 22 19:32:08 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 22 19:32:08 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 22 19:32:08 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
Apr 22 19:32:08 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question io IN SOA: failed-auxiliary
Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
Apr 22 19:34:07 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question io IN SOA: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 22 19:36:06 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
Apr 22 19:38:35 minikube systemd-resolved[191]: DNSSEC validation failed for question io IN SOA: failed-auxiliary
Apr 22 19:38:35 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 22 19:38:35 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 22 19:38:35 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
Apr 22 19:38:35 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 22 19:42:15 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 22 19:42:15 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 22 19:42:15 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
Apr 22 19:42:15 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 22 19:46:31 minikube systemd-resolved[191]: Grace period over, resuming full feature set (TLS+EDNS0+DO) for DNS server 192.168.122.1.
Apr 22 19:46:31 minikube systemd-resolved[191]: Using degraded feature set UDP+EDNS0+DO instead of TLS+EDNS0+DO for DNS server 192.168.122.1.
Apr 22 19:48:30 minikube systemd-resolved[191]: DNSSEC validation failed for question io IN SOA: failed-auxiliary
Apr 22 19:48:30 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN DS: failed-auxiliary
Apr 22 19:48:30 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN SOA: failed-auxiliary
Apr 22 19:48:30 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN AAAA: failed-auxiliary
Apr 22 19:48:30 minikube systemd-resolved[191]: DNSSEC validation failed for question quay.io IN A: failed-auxiliary
# 

@medyagh
Member

medyagh commented Apr 22, 2024

@llegolas and @cirix
For both of you, I see the "minikube start" logs detected an abnormality in your network access to the registry and warned:
`This VM is having trouble accessing https://registry.k8s.io`

**Question 1:** When you downgrade to 1.32, does minikube start still give you the same warning?

**Question 2:** Can you please share the output of the below command (in the same terminal as minikube)?

curl -sS -m 2 https://registry.k8s.io/

(This runs the same check that minikube does to verify access to registry.k8s.io.)

@llegolas and @cirix, are you both on a corp network with a custom DNS? Or do you need to set a proxy to access registries? (If yes, then you would need to set the HTTP_PROXY envs in the same terminal.)
#15021

On our website we have a known issue for the builtin network (not the socket_vmnet network); https://minikube.sigs.k8s.io/docs/drivers/qemu/ suggests:

If possible, reorder your /etc/resolv.conf to have a general nameserver entry first (eg. 8.8.8.8) and reboot your machine.
Use --network=socket_vmnet

Does that help you for socket_vmnet too?

btw for reference, with qemu on macOS M1 I do not get any error from systemd-resolved:

$ journalctl -u  systemd-resolved.service|tee
Apr 22 18:53:08 minikube systemd[1]: Starting Network Name Resolution...
Apr 22 18:53:08 minikube systemd-resolved[170]: Positive Trust Anchors:
Apr 22 18:53:08 minikube systemd-resolved[170]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Apr 22 18:53:08 minikube systemd-resolved[170]: Negative trust anchors: home.arpa 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.172.in-addr.arpa 27.172.in-addr.arpa 28.172.in-addr.arpa 29.172.in-addr.arpa 30.172.in-addr.arpa 31.172.in-addr.arpa 168.192.in-addr.arpa d.f.ip6.arpa corp home internal intranet lan local private test
Apr 22 18:53:12 minikube systemd-resolved[170]: Using system hostname 'minikube'.
Apr 22 18:53:12 minikube systemd[1]: Started Network Name Resolution.
Apr 22 18:53:18 minikube systemd-resolved[170]: Using degraded feature set UDP+EDNS0+DO instead of TLS+EDNS0+DO for DNS server 192.168.105.1.
Apr 22 18:53:18 minikube systemd-resolved[170]: Using degraded feature set UDP+EDNS0 instead of UDP+EDNS0+DO for DNS server 192.168.105.1.
Apr 22 18:53:18 minikube systemd-resolved[170]: Server 192.168.105.1 does not support DNSSEC, downgrading to non-DNSSEC mode.
Apr 22 18:53:48 minikube systemd-resolved[170]: Clock change detected. Flushing caches.
Apr 22 19:01:12 minikube systemd-resolved[170]: Grace period over, resuming full feature set (TLS+EDNS0+DO) for DNS server 192.168.105.1.
Apr 22 19:01:12 minikube systemd-resolved[170]: Using degraded feature set UDP+EDNS0+DO instead of TLS+EDNS0+DO for DNS server 192.168.105.1.
Apr 22 19:01:12 minikube systemd-resolved[170]: Using degraded feature set UDP+EDNS0 instead of UDP+EDNS0+DO for DNS server 192.168.105.1.
$ journalctl -u  systemd-resolved.service|tee | grep fail
$ 

@medyagh medyagh added the triage/needs-information Indicates an issue needs more information in order to work on it. label Apr 22, 2024
@medyagh
Member

medyagh commented Apr 22, 2024

@cirix @llegolas

can you try the ISO from before this PR? https://github.com/kubernetes/minikube/pull/18277/files
You can try a different ISO using --iso-url:

minikube delete --all
minikube start --iso-url=https://storage.googleapis.com/minikube-builds/iso/17991/minikube-v1.32.1-1710520390-17991-arm64.iso

Other PRs to try (we could try the ISO PRs one by one and see which one fixes your issue):
#18277
#17991
#18213
#17206

@llegolas
Author

llegolas commented Apr 23, 2024

1. With 1.32 I'm almost positive I did not see the registry.k8s.io warning/error.

2. Here is what curl gives:

# curl -sS -m 2 https://registry.k8s.io
curl: (28) Resolving timed out after 2000 milliseconds
# 

As expected with -m 2, it takes just 2 seconds for the DNS resolution to fail.

3. Not on a corp network; no need for a proxy.

Setting the DNS on eth1 to 8.8.8.8 instead of 192.168.122.1 solves the problem. Disabling DNS on the 'default' kvm network used by eth1 solves the problem too.
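
For reference, disabling DNS on the libvirt 'default' network can be done through its XML definition (a sketch; the <dns enable='no'/> element requires a reasonably recent libvirt):

# Inspect the current definition of the 'default' network
virsh net-dumpxml default

# Edit it with 'virsh net-edit default' and add, inside the <network> element:
#   <dns enable='no'/>
# then restart the network so its dnsmasq instance is reconfigured:
virsh net-destroy default
virsh net-start default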

It looks like dnsmasq, which libvirt uses for DNS and DHCP, does not have DNSSEC enabled by default.

I'm using the kvm2 driver, not qemu, so --network=socket_vmnet is not applicable.

With all that in mind, modifying the 'default' libvirt network (disabling DNS altogether) just to accommodate minikube seems unacceptable, given that the previous version, 1.32, which had DNSSEC=no rather than DNSSEC=allow-downgrade, was working OK.

I see no point in trying the ISOs from the PRs above, as only one of them is x86 related and it merely introduces dm-multipath.

By the way, the cilium CNI fails to start after all, but that is another issue ("FailedPostStartHook").

@cirix

cirix commented Apr 23, 2024

I think it is not related to the architecture. I enabled the network for my qemu2 on my MacBook again and I have the same problem. I did the same as @llegolas and reconfigured the DNS manually, and it worked; it boils down to what is reported: the behavior of the image has changed.
Currently I am also building a new image locally from the latest master to see if I can reproduce.

br
Nik

@llegolas
Author

With some git archaeology I traced the behavior change to aeed263. The previous version of buildroot had DNSSEC=no: buildroot/buildroot@b16ae93
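
For anyone repeating this kind of archaeology, git's pickaxe search surfaces the commits that added or removed a given string (a sketch; the systemd.mk path is buildroot's, assuming a buildroot checkout):

# List commits that touched the 'default-dnssec' option in buildroot's systemd package
git log -S 'default-dnssec' --oneline -- package/systemd/systemd.mk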

@medyagh
Member

medyagh commented Apr 23, 2024

Thank you for providing more info. I agree this should not happen to you, and we need to find a solution that helps most users without lowering security standards.

I would like to know what is special about your setup or workstation, why it times out, and why it doesn't happen to me. Does it affect you only if you deploy an image from quay.io?

curl -sS -m 2 https://registry.k8s.io

curl: (28) Resolving timed out after 2000 milliseconds
#18732

Since that command runs on your computer and not on minikube.

@cirix the command I provided above with --iso-url lets you try minikube with a different ISO (you would need to delete the previous one first).

And you can get the ISO build number from the PR; that way you don't have to wait for the ISO to build.

@llegolas
Author

llegolas commented Apr 23, 2024

@medyagh the curl command was run inside the minikube VM!
There is nothing special about my setup. It is the same setup on which minikube 1.32 was running just fine.
Perhaps you can take your time and read my comments #18705 (comment) and #18705 (comment) thoroughly, esp. the parts:

"It looks like dnsmasq used by libvirt for dns and dhcp is not having dnssec enabled by default."
and

"With all that in mind modifying the 'default' libvirt network (disable dns altogether) just to accommodate minikube seem unacceptable provided the previous version 1.32 which had DNSSEC=no and not DNSSEC=allow-downgrade was working OK."

And to repeat: why test different PR builds when I've narrowed the problem down to aeed263?

@spowelljr
Member

spowelljr commented Apr 23, 2024

Hi @llegolas, thank you for your investigation and discovering the cause. I think the best path going forward is I can add a patch to the ISO to revert buildroot/buildroot@b16ae93 and we can release a patch release to resolve this issue. Then we can work on a long term solution to the issue for the next minor release.

@llegolas
Author

Happy to test the PR #18737 ISO once available

@spowelljr
Member

Would you mind testing now @llegolas?

--iso-url="https://storage.googleapis.com/minikube-builds/iso/18737/minikube-v1.33.0-1713979993-18737-amd64.iso"

@llegolas
Author

Unfortunately it does not solve the problem. I still have DNSSEC=allow-downgrade, with all its consequences.
By the looks of this:

 ifeq ($(BR2_PACKAGE_LIBGCRYPT),y)
 SYSTEMD_DEPENDENCIES += libgcrypt
-SYSTEMD_CONF_OPTS += -Dgcrypt=true
+SYSTEMD_CONF_OPTS += -Ddefault-dnssec=allow-downgrade -Dgcrypt=true
 else
-SYSTEMD_CONF_OPTS += -Dgcrypt=false
+SYSTEMD_CONF_OPTS += -Ddefault-dnssec=no -Dgcrypt=false
 endif

DNSSEC=allow-downgrade is set whenever libgcrypt is installed, which it is.
Perhaps a better solution is to ship an /etc/systemd/resolved.conf with just

[Resolve]
DNSSEC=no

thus preserving all the other defaults, whatever they might be now or in the future.
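
That suggestion can be tried live inside a running VM before any ISO change lands (a sketch using a drop-in rather than replacing the main file; it overrides the same setting):

minikube ssh
sudo mkdir -p /etc/systemd/resolved.conf.d
printf '[Resolve]\nDNSSEC=no\n' | sudo tee /etc/systemd/resolved.conf.d/99-dnssec.conf
sudo systemctl restart systemd-resolved.service
resolvectl query quay.io    # should now resolve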

@medyagh
Member

medyagh commented Apr 25, 2024

Thank you @llegolas for staying with us by testing and providing the logs. You were right, the minikube check for registry connectivity happens inside minikube (we should do it outside as well if it fails, #18754).

I still have not been able to replicate your issue myself. I know you said you don't have any proxy or corp network, but is there anything else you can think of that is different about your environment?

It would greatly help us to find out what is different about your environment compared to mine or the one we created in the cloud; that way we could guard against it in the code and help more users.

@llegolas
Author

@medyagh let's start by comparing how we run it in the first place.
I'm running it on a stock Fedora 40 laptop with the KVM driver. I don't think that is exotic or uncommon.
Do you have the same setup?

In a broader sense, to replicate it you should run it on a Linux host with the KVM driver. There are too many subtle differences in how the various hypervisors and container runtimes set up networking for the VMs/containers to rely on the assumption that this can be replicated across all of them.

@medyagh
Member

medyagh commented Apr 26, 2024

@llegolas sure thing! btw, did you delete the old minikube before trying out the ISO in the PR?
We could basically keep trying the ISOs from the above PRs and see which one broke it for you...

@spowelljr tried it with KVM on Linux (Ubuntu); unfortunately we don't have Fedora machines, and I really wish we had some for integration tests. Is there anything special about your DNS config? Is everything fully vanilla Fedora 40?

@cirix

cirix commented Apr 26, 2024

I don't know if the problem is with the host per se; it seems more related to the hypervisor and the DNS configuration in the minikube Linux image. In my case, on a macOS M2 with qemu2, the logs from the minikube services are exactly the same as the ones with Fedora + KVM. Now, if I manipulate the bridge interface that gets created, or the resolved config to turn DNSSEC off, deleting and recreating the pods makes the problem go away. My intuition is that it is related to the OS + hypervisor setup.
Downgrading to 1.32 solves the problem.
Adding some more meat to the discussion.
Use case: started minikube 1.32, where DNSSEC=no is in effect and DNSSEC is not mandatory. Screenshots from minikube ssh: you can see the logs of the systemd-resolved service and what it reports back from systemd-resolve --status

Screenshot 2024-04-26 at 22 10 34
Screenshot 2024-04-26 at 22 19 20
You can clearly see that the underlying virtual interfaces do not support DNSSEC, but on the systemd-resolved service you can see that positive/negative DNSSEC trust anchors are set, thus allowing the relevant non-validated DNS traffic to pass through the linked virtual interface.

Now, on the other hand, I have deployed version 1.33 (the binary marked minikube 1.33) and you will see a totally different log setup:

Screenshot 2024-04-26 at 22 27 01
Screenshot 2024-04-26 at 22 27 32

Comparing the above side by side, I think it becomes evident that in the second scenario DNSSEC is killing the traffic. Hope the above helps.

p.s. If I manipulate DNSSEC myself and reconfigure DNS on minikube 1.33, the problem goes away, but I don't consider that a viable solution.
A hack was to edit the /etc/systemd/resolved.conf file and uncomment the DNSSEC line as well. That's also what @llegolas describes as a "potential solution".
All of the above was done after running minikube delete --all each time, to make sure nothing is cached.

Also, for macOS-based OSes it is clearly documented by Apple that DNSSEC is not supported, and it can only be had by changing the DNS server (i.e. using unbound). As you will see, it has not received any votes to be implemented:
https://discussions.apple.com/thread/2279415?sortBy=best

Overall: the Linux ISO used to start minikube has major differences between the 1.32 and 1.33 images. Add to that the behavior of the underlying virtual interfaces varying between OS versions and hypervisors, and the problem is not that evident.

regards

nirs added a commit to nirs/ramen that referenced this issue May 6, 2024
Minikube 1.33.0 adds useful features, fixes, and performance
improvements, but we could not use it because of a regression in
systemd-resolved[1].

A critical change in 1.33.0 is upgrading the kernel to 5.10.207. This
version fixes a bad bug with minikube 1.32.0, where a qemu assertion
fails while starting a kubevirt VM[2] on newer Intel CPUs (i7-12700k).

Now that we set up the systemd-resolved configuration we can upgrade to
minikube 1.33.0, and Alex can run the drenv kubevirt environment without
manual hacks.

[1] kubernetes/minikube#18705
[2] https://gitlab.com/qemu-project/qemu/-/issues/237

Thanks: Alex Kalenyuk <akalenyu@redhat.com>
Signed-off-by: Nir Soffer <nsoffer@redhat.com>

nirs added a commit to nirs/ramen that referenced this issue May 7, 2024
Minikube 1.33.0 adds useful features, fixes, and performance
improvements, but we could not use it because of a regression in
systemd-resolved[1].

A critical change in 1.33.0 is upgrading the kernel to 5.10.207. This
version fixes a bad bug with minikube 1.32.0, where a qemu assertion
fails while starting a kubevirt VM[2] on newer Intel CPUs (i7-12700k).

Now that we set up the systemd-resolved configuration we can upgrade to
minikube 1.33.0, and Alex can run the drenv kubevirt environment without
manual hacks.

This change:
- Updates the docs that the latest minikube tested is 1.33.0
- Sets up minikube by default when creating the virtual environment, so
  minikube 1.33.0 support is added transparently for developers
- Sets up drenv in the e2e job to support minikube 1.33.0
- Cleans up drenv in the e2e job so setup from a previous build will not
  leak into the next build

[1] kubernetes/minikube#18705
[2] https://gitlab.com/qemu-project/qemu/-/issues/237

Thanks: Alex Kalenyuk <akalenyu@redhat.com>
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
@medyagh
Member

medyagh commented May 7, 2024

$ grep DNSSEC /etc/systemd/resolved.conf
#DNSSEC=allow-downgrade
$


We still have that allow-downgrade:
#DNSSEC=allow-downgrade

@llegolas isn't that commented out? It is a Linux convention that lines starting with # are comments; they are not interpreted as configuration options by systemd-resolved.

Does it still give you the image pull problem as well?

@medyagh
Member

medyagh commented May 7, 2024

Workaround

Create this in $MINIKUBE_HOME:

$ cat ~/.minikube/files/etc/systemd/resolved.conf.d/99-dnssec.conf 
[Resolve]
DNSSEC=no

New minikube VMs will use this config.

Restart systemd-resolved after minikube starts:

minikube ssh -p {profile} 'sudo systemctl restart systemd-resolved.service'

Without restarting systemd-resolved, I got failures pulling flannel image:

kube-flannel   12m                     Warning   Failed                    Pod/kube-flannel-ds-9rbrg   Failed to pull image "docker.io/flannel/flannel-cni-plugin:v1.4.0-flannel1": failed to pull and unpack image "docker.io/flannel/flannel-cni-plugin:v1.4.0-flannel1": failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/77/77c1250c26d962f294c2b42f7bb4f3e0b31431995aef98c0699669447f170132/data?verify=1714999696-2UAbl43vsG%2FIZhsv8MYnwbCxNNA%3D": dial tcp: lookup production.cloudflare.docker.com: i/o timeout

Thank you for pointing this out; this workaround is worth a try.
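
Put together, the workaround amounts to the following on the host (a sketch; assumes the default MINIKUBE_HOME of ~/.minikube and the default profile):

# Stage a resolved drop-in that minikube syncs into new VMs
mkdir -p ~/.minikube/files/etc/systemd/resolved.conf.d
printf '[Resolve]\nDNSSEC=no\n' > ~/.minikube/files/etc/systemd/resolved.conf.d/99-dnssec.conf

# Recreate the cluster and restart resolved so the drop-in takes effect
minikube delete --all
minikube start
minikube ssh 'sudo systemctl restart systemd-resolved.service'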

@medyagh changed the title from "minikube 1.33 cannot pull images (resolved.conf)" to "VM Drivers on minikube 1.33 cannot pull images (resolved.conf)" on May 7, 2024
@spowelljr
Member


Made a PR to implement this: #18830

@llegolas
Author

llegolas commented May 7, 2024

> @llegolas isn't that commented out? It is a Linux convention that lines starting with # are comments; they are not interpreted as configuration options by systemd-resolved.

It is commented out indeed, but it is also a Linux convention that commented-out lines document the defaults the application (in this case systemd-resolved) uses. I explained that two weeks ago here: #18705 (comment)

@spowelljr will test the new ISO once the build is complete. Please drop me a link

@nirs
Contributor

nirs commented May 8, 2024

I tested the ISO, it looks good:
#18830 (comment)

@llegolas
Author

llegolas commented May 8, 2024

The #18830 ISO image fixes it: --iso-url https://storage.googleapis.com/minikube-builds/iso/18830/minikube-v1.33.0-1715106791-18830-amd64.iso

raghavendra-talur pushed a commit to RamenDR/ramen that referenced this issue May 8, 2024
@medyagh changed the title from "VM Drivers on minikube 1.33 cannot pull images (resolved.conf)" to "Fedora VM Drivers on minikube 1.33 cannot pull images (resolved.conf)" on May 8, 2024
@medyagh
Member

medyagh commented May 8, 2024

Thank you @nirs and also @llegolas for assisting in trying out the fix. btw, if either of you has a way to provide minikube with Fedora testing infrastructure, it would be nice to run minikube functional tests on Fedora as well.

@medyagh
Member

medyagh commented May 8, 2024

> to prove a point I just tried the same scenario with a freshly installed Ubuntu 23.04 as host + minikube 1.33 = exactly the same problems. Not sure what Ubuntus you have and test on, but for sure they are not VANILLA 23.04.

@llegolas do you mind confirming one more time that this issue did happen to you on Ubuntu?

I have had 3 different people try on Ubuntu and none of them could reproduce this issue; it would be nice if we could confirm this is only related to Fedora, so we could have some sort of pre-detection mechanism for it.

@prezha
Collaborator

prezha commented May 8, 2024

I think the originally reported issue is Fedora-specific, Fedora having systemd-resolved enabled by default.

To test, I created a Fedora 39 VM and successfully replicated the issue with minikube v1.33.0.

To confirm this, can you please try the following:

  • on your host (Fedora 39), replace nameserver 127.0.0.53 with nameserver 8.8.8.8 in your /etc/resolv.conf, then
  • on your minikube VM, run resolvectl query quay.io or resolvectl query docker.io (a sketch of the host-side change follows below),
    and let us know if that works

Note: although I tested various things, I think the above are the minimal (but neither proper nor permanent) changes needed on the Fedora host alone to make DNS resolution work in the minikube VM (without any additional changes needed in minikube).
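
A sketch of that temporary host-side change (on Fedora, /etc/resolv.conf is normally a symlink to the systemd-resolved stub, so back up the symlink and restore it afterwards):

# Back up the resolv.conf symlink itself, then replace it with a plain file
sudo cp -a /etc/resolv.conf /etc/resolv.conf.bak
sudo rm /etc/resolv.conf
echo 'nameserver 8.8.8.8' | sudo tee /etc/resolv.conf

# ...re-test DNS inside the minikube VM, then restore the original:
sudo mv /etc/resolv.conf.bak /etc/resolv.conf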

before

at host:

$ cat /etc/resolv.conf
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
...
nameserver 127.0.0.53
options edns0 trust-ad
search .
$ minikube-linux-amd64 start
😄  minikube v1.33.0 on Fedora 39 (kvm/amd64)
✨  Using the kvm2 driver based on existing profile
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🔄  Restarting existing kvm2 VM for "minikube" ...
❗  This VM is having trouble accessing https://registry.k8s.io
💡  To pull new external images, you may need to configure a proxy: https://minikube.sigs.k8s.io/docs/reference/networking/proxy/
🐳  Preparing Kubernetes v1.30.0 on Docker 26.0.1 ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
💡  kubectl not found. If you need it, try: 'minikube kubectl -- get pods -A'
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

at vm:

$ resolvectl query docker.io
docker.io: resolve call failed: DNSSEC validation failed: failed-auxiliary

after

at host:

$ cat /etc/resolv.conf
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
...
nameserver 8.8.8.8
options edns0 trust-ad
search .
$ minikube-linux-amd64 stop
✋  Stopping node "minikube"  ...
🛑  1 node stopped.
$ minikube-linux-amd64 start
😄  minikube v1.33.0 on Fedora 39 (kvm/amd64)
✨  Using the kvm2 driver based on existing profile
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🔄  Restarting existing kvm2 VM for "minikube" ...
🐳  Preparing Kubernetes v1.30.0 on Docker 26.0.1 ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: default-storageclass, storage-provisioner
💡  kubectl not found. If you need it, try: 'minikube kubectl -- get pods -A'
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

at vm:

$ resolvectl query docker.io
docker.io: 2600:1f18:2148:bc01:1983:8fd2:2dfc:a04c -- link: eth1
           2600:1f18:2148:bc00:a81:7e44:4669:3426 -- link: eth1
           2600:1f18:2148:bc02:a090:6a5b:b2ff:3152 -- link: eth1
           54.156.140.159                      -- link: eth1
           52.44.227.212                       -- link: eth1
           44.221.37.199                       -- link: eth1

-- Information acquired via protocol DNS in 1min 29.1577s.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: network

@llegolas
Author

llegolas commented May 9, 2024

> @llegolas do you mind confirming one more time that this issue did happen to you on Ubuntu?

The setup I had for the Ubuntu test was: host machine Fedora 40 (39 works all the same), an Ubuntu 23.04 VM inside the Fedora host, and minikube inside the Ubuntu VM. Sort of fedora(ubuntu(minikube)).

@spowelljr
Member

The fix is included in the latest release of minikube: https://github.com/kubernetes/minikube/releases/tag/v1.33.1
