
Raspberry Pi pods stuck in: failed to create shim task: OCI runtime create failed ... unified resource "memory.oom.group" can't be set #4439

ming-afk opened this issue Feb 23, 2024 · 2 comments


Summary

Deploying any pod on microk8s installed on a local Raspberry Pi leaves the pod stuck in CrashLoopBackOff. Describing the pod shows the message below:

Message:      failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: unified resource "memory.oom.group" can't be set: controller "memory" not available: unknown
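For context, whether the memory controller that runc complains about is actually exposed on the unified (cgroup v2) hierarchy can be checked on the node itself. A minimal check, assuming a default cgroup v2 mount at /sys/fs/cgroup:

stat -fc %T /sys/fs/cgroup/            # cgroup2fs means the unified (v2) hierarchy is mounted
cat /sys/fs/cgroup/cgroup.controllers  # "memory" must be listed for memory.oom.group to be settable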

What Should Happen Instead?

The pod should deploy and eventually reach status "Running".

Reproduction Steps

  1. sudo snap install microk8s --channel=1.28-strict/stable

  2. sudo usermod -a -G snap_microk8s $USER
    mkdir -p ~/.kube
    sudo chown -f -R $USER ~/.kube
    newgrp snap_microk8s
    sudo snap alias microk8s.kubectl kubectl
    sudo snap alias microk8s.helm helm
    sudo microk8s start

  3. I deploy a busybox pod:

microk8s kubectl apply -f debug.yaml

// debug.yaml

apiVersion: v1
kind: Pod
metadata:
  name: debug5
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["sh",  "-c", "sleep 3600"]

but then the pod crashes:

(base) minghaoli@raspberrypi:~/Downloads $ k get pods -o wide
NAME     READY   STATUS             RESTARTS          AGE   IP          NODE          NOMINATED NODE   READINESS GATES
debug5   0/1     CrashLoopBackOff   226 (3m41s ago)   18h   10.1.87.3   raspberrypi   <none>           <none>

Introspection Report

inspection-report-20240222_200342.tar.gz

Inspecting system
Inspecting Certificates
Inspecting services
  Service snap.microk8s.daemon-cluster-agent is running
  Service snap.microk8s.daemon-containerd is running
  Service snap.microk8s.daemon-kubelite is running
  Service snap.microk8s.daemon-flanneld is running
  Service snap.microk8s.daemon-etcd is running
  Service snap.microk8s.daemon-apiserver-kicker is running
  Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
  Copy processes list to the final report tarball
  Copy disk usage information to the final report tarball
  Copy memory usage information to the final report tarball
  Copy server uptime to the final report tarball
  Copy openSSL information to the final report tarball
  Copy snap list to the final report tarball
  Copy VM name (or none) to the final report tarball
  Copy current linux distribution to the final report tarball
  Copy asnycio usage and limits to the final report tarball
  Copy inotify max_user_instances and max_user_watches to the final report tarball
  Copy network configuration to the final report tarball
Inspecting kubernetes cluster
  Inspect kubernetes cluster

Building the report tarball
  Report tarball is at /var/snap/microk8s/6565/inspection-report-20240222_200342.tar.gz

Other inspections

Checking the pod description:

k describe pod -n default     debug5
Name:             debug5
Namespace:        default
Priority:         0
Service Account:  default
Node:             raspberrypi/10.23.100.84
Start Time:       Thu, 22 Feb 2024 01:20:54 -0500
Labels:           <none>
Annotations:      <none>
Status:           Running
IP:               10.1.87.3
IPs:
  IP:  10.1.87.3
Containers:
  busybox:
    Container ID:  containerd://7e86132b86c46cc79a6bce2ff9fa4f166c0314c7e71944a64ce2917d3f790323
    Image:         busybox
    Image ID:      docker.io/library/busybox@sha256:6d9ac9237a84afe1516540f40a0fafdc86859b2141954b4d643af7066d598b74
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      sleep 3600
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       StartError
      Message:      failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: unified resource "memory.oom.group" can't be set: controller "memory" not available: unknown
      Exit Code:    128
      Started:      Wed, 31 Dec 1969 19:00:00 -0500
      Finished:     Thu, 22 Feb 2024 19:54:19 -0500
    Ready:          False
    Restart Count:  226
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-q9l94 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-q9l94:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Warning  BackOff  4m19s (x5068 over 18h)  kubelet  Back-off restarting failed container busybox in pod debug5_default(8973c842-3322-4a26-8251-fbaf1f9f14c6)

cgroup settings

console=serial0,115200 console=tty1 root=PARTUUID=82c74a33-02 rootfstype=ext4 fsck.repair=yes rootwait quiet splash plymouth.ignore-serial-consoles
cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1 swapaccount=1
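For reference, the parameters the kernel actually booted with (as opposed to what is written in cmdline.txt) can be confirmed with:

cat /proc/cmdline   # should include cgroup_enable=memory cgroup_memory=1

Note that Raspberry Pi OS expects cmdline.txt to be a single line, so parameters appended on a second line may not take effect.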

kjenney commented Mar 26, 2024

I had the same issue and ended up resolving it by reinstalling microk8s on the affected node, after removing the node from the cluster. My initial issue was that the cgroup settings were not correct. Rebooting didn't resolve it, but reinstalling microk8s did.

@dekkagaijin

Enabling cgroup limits solved it for me:

sudo sed -i '$ s/$/ cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1 swapaccount=1/' /boot/firmware/cmdline.txt
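A reboot is needed for the appended parameters to take effect; afterwards, assuming a cgroup v2 host, the memory controller should show up:

sudo reboot
cat /sys/fs/cgroup/cgroup.controllers  # "memory" should now be listed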
