
Installation of 1.30/stable on debian-12-genericcloud-amd64.qcow2 fails #4515

Open

Aaron-Ritter opened this issue Apr 30, 2024 · 12 comments

@Aaron-Ritter

Summary

Using the cloud image https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-genericcloud-amd64.qcow2 and installing snapd with the latest core snap, the MicroK8s 1.30/stable services do not come online.

What Should Happen Instead?

With such a base image, MicroK8s should simply come online after installation.

Reproduction Steps

Download https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-genericcloud-amd64.qcow2

Bring the VM online with two internal interfaces, e.g. 10.1.1.10/24 and 10.1.2.10/24

Install MicroK8s:
sudo apt-get update
sudo apt-get install snapd
sudo snap install core
sudo snap install microk8s --classic --channel=1.30/stable
sudo usermod -a -G microk8s foo
mkdir -p ~/.kube
chmod 0700 ~/.kube
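For reference, a way to wait for the node to come up after these steps is MicroK8s' built-in readiness check (a sketch, not part of the original reproduction):

# blocks until the node reports ready
sudo microk8s status --wait-ready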

microk8s.status reports that MicroK8s is not running:

foo@k8s-test-m1:~$ microk8s.status
microk8s is not running. Use microk8s inspect for a deeper inspection.
foo@k8s-test-m1:~$ microk8s.status
microk8s is not running. Use microk8s inspect for a deeper inspection.
foo@k8s-test-m1:~$ microk8s.status
microk8s is not running. Use microk8s inspect for a deeper inspection.

Running kubectl get all -A shows no pods starting and the API server going offline again and again:

foo@k8s-test-m1:~$ microk8s.kubectl get all -A
NAMESPACE     NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
default       service/kubernetes   ClusterIP   10.152.183.1    <none>        443/TCP                  2m54s
kube-system   service/kube-dns     ClusterIP   10.152.183.10   <none>        53/UDP,53/TCP,9153/TCP   2m51s

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/calico-node   0         0         0       0            0           kubernetes.io/os=linux   2m53s

NAMESPACE     NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/calico-kube-controllers   0/1     0            0           2m53s
kube-system   deployment.apps/coredns                   0/1     0            0           2m51s

foo@k8s-test-m1:~$ microk8s.kubectl get all -A
NAMESPACE     NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
default       service/kubernetes   ClusterIP   10.152.183.1    <none>        443/TCP                  2m57s
kube-system   service/kube-dns     ClusterIP   10.152.183.10   <none>        53/UDP,53/TCP,9153/TCP   2m54s

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/calico-node   0         0         0       0            0           kubernetes.io/os=linux   2m56s

NAMESPACE     NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/calico-kube-controllers   0/1     0            0           2m56s
kube-system   deployment.apps/coredns                   0/1     0            0           2m54s

foo@k8s-test-m1:~$ microk8s.kubectl get all -A
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?
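A quick way to see why the API server keeps dropping (a diagnostic sketch, not part of the original report; the service name is taken from the inspect output below) is to follow the kubelite journal:

# follow the logs of the kubelite service (it runs the apiserver and kubelet)
sudo journalctl -fu snap.microk8s.daemon-kubelite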

Introspection Report

The inspection partially fails, as no pods are starting:

foo@k8s-test-m1:~$ microk8s.inspect 
Inspecting system
Inspecting Certificates
Inspecting services
  Service snap.microk8s.daemon-cluster-agent is running
  Service snap.microk8s.daemon-containerd is running
  Service snap.microk8s.daemon-kubelite is running
  Service snap.microk8s.daemon-k8s-dqlite is running
  Service snap.microk8s.daemon-apiserver-kicker is running
  Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
  Copy processes list to the final report tarball
  Copy disk usage information to the final report tarball
  Copy memory usage information to the final report tarball
  Copy server uptime to the final report tarball
  Copy openSSL information to the final report tarball
  Copy snap list to the final report tarball
  Copy VM name (or none) to the final report tarball
  Copy current linux distribution to the final report tarball
  Copy asnycio usage and limits to the final report tarball
  Copy inotify max_user_instances and max_user_watches to the final report tarball
  Copy network configuration to the final report tarball
Inspecting kubernetes cluster
  Inspect kubernetes cluster
Inspecting dqlite
  Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6782/var/kubernetes/backend/localnode.yaml': No such file or directory

Building the report tarball
  Report tarball is at /var/snap/microk8s/6782/inspection-report-20240430_130646.tar.gz

Can you suggest a fix?

1.28/stable works

Are you interested in contributing with a fix?

Yes, by testing.

@neoaggelos
Member

Hi @Aaron-Ritter, could this be the same issue as #4361?

@Aaron-Ritter
Author

@neoaggelos Likely; I can't deploy 1.29/stable either.

@Aaron-Ritter
Author

This does the trick for 1.30/stable too:

#4361 (comment)

echo '
--cgroups-per-qos=false
--enforce-node-allocatable=""
' | sudo tee -a /var/snap/microk8s/current/args/kubelet

sudo snap restart microk8s.daemon-kubelite

But: 1) I don't think this is something we should disable, and 2) per https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/, these flags are deprecated ("DEPRECATED: This parameter should be set via the config file specified by the kubelet's --config flag. See kubelet-config-file for more information.").
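For reference only, this is roughly what those two flags map to in a KubeletConfiguration file. MicroK8s drives kubelet through the args file rather than --config, so the target path below is hypothetical, not something the snap reads today:

# Hypothetical path; MicroK8s does not read a kubelet config file by default.
cat <<'EOF' | sudo tee /var/snap/microk8s/current/args/kubelet-config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# config-file equivalents of --cgroups-per-qos=false and --enforce-node-allocatable=""
cgroupsPerQOS: false
enforceNodeAllocatable: []
EOF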

@neoaggelos
Member

Can you try installing from 1.30/edge? We have a fix that should solve the related cgroups issues (#4503), but it has not yet found its way to 1.30/stable.

It should be something like:

sudo snap install microk8s --classic --channel 1.30/edge
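If MicroK8s is already installed, switching channels is typically done with a refresh rather than a fresh install (a sketch, not from the original comment):

# move an existing installation to the edge channel
sudo snap refresh microk8s --channel=1.30/edge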

@Aaron-Ritter
Author

Aaron-Ritter commented Apr 30, 2024

Yep, this works; the node is online.

What's the downside of the delegate.conf fix?

And is it already clear when it is expected to land in stable?

@neoaggelos
Member

Apparently this is some sort of regression in Kubernetes; see kubernetes/kubernetes#122955. kubelet is responsible for ensuring that cgroups have all the required controllers enabled, but that does not work as expected.
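One way to see which cgroup v2 controllers the node actually exposes and delegates (a diagnostic sketch, assuming a cgroup v2 host such as this Debian 12 image) is:

# controllers available at the root, and those delegated to child cgroups
cat /sys/fs/cgroup/cgroup.controllers
cat /sys/fs/cgroup/cgroup.subtree_control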

@Aaron-Ritter
Author

Hm, OK. Does that mean the workaround could potentially backfire once the underlying kubelet issue is solved?

I will try the test cluster setup with 1.30/edge for now, deploy all the other components I usually use, and see whether there are any other problems with it.

@Aaron-Ritter
Author

@neoaggelos Are you able to comment on a possible release ETA for the fix currently in edge?

@neoaggelos
Member

Hi @Aaron-Ritter see #4361 (comment).

We are in the process of promoting the fix. It should already be out in 1.29/stable, and 1.30/stable is expected to follow before the end of this week. Thank you for testing!

@Aaron-Ritter
Author

@neoaggelos Is there a way to track patch releases, i.e. the details of each patch version, such as the list of features, fixes, and issues it resolves?

@fmiqbal

fmiqbal commented May 20, 2024

@Aaron-Ritter I am leaning towards checking the date the commit was merged into main and comparing that to the last channel update shown by snap info microk8s:

snap-id:      EaXqgt1lyCaxKaQCU349mlodBkDCXRcg
tracking:     1.29/stable
refresh-date: 5 days ago, at 15:29 WIB
channels:
  1.29/stable:           v1.29.4  2024-05-07 (6809) 170MB classic
  1.29/candidate:        v1.29.4  2024-05-03 (6809) 170MB classic
  1.29/beta:             v1.29.4  2024-05-03 (6809) 170MB classic
  1.29/edge:             v1.29.5  2024-05-15 (6837) 170MB classic
  latest/stable:         v1.29.0  2024-01-25 (6364) 168MB classic
  latest/candidate:      v1.30.1  2024-05-17 (6844) 168MB classic
  latest/beta:           v1.30.1  2024-05-17 (6844) 168MB classic
  latest/edge:           v1.30.1  2024-05-15 (6844) 168MB classic
  1.30-strict/stable:    v1.30.0  2024-04-18 (6783) 168MB -
  1.30-strict/candidate: v1.30.0  2024-04-18 (6783) 168MB -
  1.30-strict/beta:      v1.30.0  2024-04-18 (6783) 168MB -
  1.30-strict/edge:      v1.30.1  2024-05-15 (6843) 168MB -
  1.30/stable:           v1.30.0  2024-04-18 (6782) 168MB classic
  1.30/candidate:        v1.30.0  2024-05-10 (6813) 168MB classic
  1.30/beta:             v1.30.0  2024-05-10 (6813) 168MB classic
  1.30/edge:             v1.30.1  2024-05-15 (6842) 168MB classic

The PR was merged on 2024-04-19, so I suppose 1.29/stable has the update, because it was last updated in May, but 1.30/stable still doesn't have it, because it was last updated on 2024-04-18.
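A minimal way to script that comparison (the channel pattern is just an example):

# print the version, last-update date, and revision for the 1.30 channels only
snap info microk8s | grep -E '1\.30/(stable|candidate|beta|edge)'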

@Aaron-Ritter
Author

@fmiqbal That's what I thought too, but I would love to see release notes for each patch, listing all changes.

The major releases are visible in this repository with some further details, but I couldn't find details about the patch releases.
