What happened?

When deploying Kubernetes using Kubespray with OpenStack as the external cloud provider, the cloud provider initialization fails with the following error:

```
W0212 09:05:21.997886 1 openstack.go:173] New openstack client created failed with config: Post "https://<redacted>:5000/v3/auth/tokens": dial tcp: lookup <redacted> on 10.233.0.3:53: write udp 10.233.0.3:48927->10.233.0.3:53: write: operation not permitted
F0212 09:05:21.998071 1 main.go:84] Cloud provider could not be initialized: could not init cloud provider "openstack": Post "https://<redacted>:5000/v3/auth/tokens": dial tcp: lookup <redacted> on 10.233.0.3:53: write udp 10.233.0.3:48927->10.233.0.3:53: write: operation not permitted
```
This points to a DNS resolution failure when the OpenStack cloud provider attempts to authenticate with the OpenStack API. The problem is linked to the DNS policy configuration introduced in commit c440106 (link to the commit) in the external-openstack-cloud-controller-manager-ds.yml.j2 template (direct link to the affected line). The CoreDNS pods cannot be scheduled while the node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule taint is present, yet that taint is only lifted once the cloud controller has initialized the node; because the cloud controller's DNS lookups are directed at CoreDNS, initialization deadlocks.
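The chicken-and-egg situation can be seen in the pod spec: a hostNetwork pod whose dnsPolicy is ClusterFirstWithHostNet sends lookups to the in-cluster DNS service rather than the node's resolver. A minimal sketch of the relevant fields (illustrative only, not copied from the actual Kubespray template):

```yaml
# Illustrative excerpt of a cloud-controller-manager DaemonSet pod spec.
spec:
  hostNetwork: true
  # With this policy, lookups go to the cluster DNS service
  # (e.g. 10.233.0.3), which cannot start until the node taint is lifted.
  dnsPolicy: ClusterFirstWithHostNet
  tolerations:
    # The controller itself tolerates the taint, so it schedules;
    # CoreDNS typically does not, so it stays Pending.
    - key: node.cloudprovider.kubernetes.io/uninitialized
      value: "true"
      effect: NoSchedule
```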
Notably, this DNS policy setting does not align with the default configurations provided by the official OpenStack cloud provider repository, both in the Helm chart (link to chart) and the plain manifests (link to plain manifest).
What did you expect to happen?
I expected the OpenStack cloud provider to initialize successfully without DNS resolution issues. The official configurations from the OpenStack cloud provider repository do not set a dnsPolicy, so the pods inherit DNS settings from the host, which avoids this initialization problem.
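The reason omitting the field helps is a Kubernetes defaulting rule: when a pod runs with hostNetwork: true and its dnsPolicy is left at the default ClusterFirst, the kubelet falls back to the Default policy, meaning the pod inherits the node's /etc/resolv.conf. A sketch of the effective behavior (illustrative, not the official manifest):

```yaml
# Illustrative pod spec with dnsPolicy omitted.
spec:
  hostNetwork: true
  # dnsPolicy defaults to ClusterFirst, but for hostNetwork pods the
  # kubelet treats that as Default, i.e. the node's resolv.conf, so
  # OpenStack API hostnames resolve even before CoreDNS is running.
```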
How can we reproduce it (as minimally and precisely as possible)?
1. Deploy a Kubernetes cluster using Kubespray with the OpenStack external cloud provider enabled.
2. Observe that the OpenStack cloud controller manager fails to start, with logs showing DNS resolution failures similar to the ones above.
3. Note the node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule taint preventing CoreDNS from starting, which the cloud controller needs for DNS resolution.
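For reference, this taint is applied by the kubelet when it runs with --cloud-provider=external, and it is removed only after a cloud controller manager initializes the node. On the node object it looks roughly like:

```yaml
# Taint present on nodes awaiting cloud-provider initialization.
spec:
  taints:
    - key: node.cloudprovider.kubernetes.io/uninitialized
      value: "true"
      effect: NoSchedule
```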
Anything else we need to know?

Removing the dnsPolicy parameter from the external-openstack-cloud-controller-manager-ds.yml.j2 template allows the OpenStack cloud controller pod to resolve DNS queries using the host's DNS settings. This change resolves the issue and allows the OpenStack cloud controller manager to start without errors.

It may be beneficial to align Kubespray's configuration with the official OpenStack cloud provider templates by not setting a dnsPolicy unless necessary, to prevent such issues from occurring in deployments.
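Assuming the template sets the policy on the DaemonSet pod spec, the proposed fix amounts to dropping a single line. A hypothetical excerpt (the exact surrounding lines and value in external-openstack-cloud-controller-manager-ds.yml.j2 may differ):

```diff
     spec:
       hostNetwork: true
-      dnsPolicy: ClusterFirstWithHostNet
       containers:
```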
can you also post the values for upstream_dns_servers and resolvconf_mode?
We have set resolvconf_mode: host_resolvconf in our cluster and configured additional upstream servers, and cluster provisioning works without any problems.
Maybe I can recreate your setup and find the cause.
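For anyone comparing setups, both knobs are Kubespray inventory variables; a hypothetical group_vars excerpt (server addresses are illustrative):

```yaml
# inventory/<cluster>/group_vars/all/all.yml (illustrative values)
resolvconf_mode: host_resolvconf
upstream_dns_servers:
  - 1.1.1.1
  - 8.8.8.8
```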
Thank you for your reply. I'll share the information soon.
However, I'm not overriding any other parameters except the ones mentioned above.

If extra configuration needs to be applied to make it work with your changes, it should be documented.

Before these changes, or if I remove them manually, or even when I use the official manifests directly, everything works fine.
Nevertheless, I suggest that changes like this one be made directly in the upstream OpenStack Cloud Controller repository (link), as it is the primary source. The modifications made in the Kubespray repo for the OpenStack Cloud Controller Manager should track the official repository to maintain consistency.
Regards
OS
uname -srm
cat /etc/os-release
Version of Ansible
Version of Python
Python 3.10.13
Version of Kubespray (commit)
64447e7
Network plugin used
cilium
Full inventory with variables
Command used to invoke ansible
```
ansible-playbook cluster.yml --become \
  -i inventory/$K8S_CLUSTER_NAME/tf_state_kubespray.py \
  -e @inventory/$K8S_CLUSTER_NAME/$K8S_CLUSTER_NAME.yaml \
  -e @inventory/$K8S_CLUSTER_NAME/no_floating.yml \
  -e "ansible_ssh_private_key_file=/home/ansible/keys/generic_vm_id_rsa" \
  -e external_openstack_lbaas_floating_network_id=$KUBESPRAY_FLOATING_NETWORK_ID \
  -e external_openstack_lbaas_subnet_id=$KUBESPRAY_PRIVATE_SUBNET_ID
```
Output of ansible run