Worker nodes fail to start after reboot as nf_conntrack kernel module not loaded #4462

Open
rp42 opened this issue Mar 15, 2024 · 2 comments


rp42 commented Mar 15, 2024

Summary

I added a node as a worker to a small cluster that has GPU enabled. It works fine initially, but after the node is rebooted, microk8s.daemon-kubelite.service fails to start because it cannot open /proc/sys/net/netfilter/nf_conntrack_max:

microk8s.daemon-kubelite[2119]: E0314 18:28:26.546586    2119 server.go:537] "Error running ProxyServer" err="open /proc/sys/net/netfilter/nf_conntrack_max: no such file or directory"
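
A quick way to confirm the condition behind this error on an affected node is to check whether the proc entry exists. This is only a minimal sketch: the function name `conntrack_available` and its optional path argument are hypothetical, added so the check can be pointed at another file when testing.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the conntrack sysctl file is readable.
# The optional first argument overrides the path (an assumption made for testing).
conntrack_available() {
    proc_file="${1:-/proc/sys/net/netfilter/nf_conntrack_max}"
    [ -r "$proc_file" ]
}

if conntrack_available; then
    echo "nf_conntrack_max is present"
else
    echo "nf_conntrack_max is missing; nf_conntrack is probably not loaded"
fi
```

If the file is missing right after a reboot, that matches the "no such file or directory" error in the log line above.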

Adding nf_conntrack to the end of /etc/modules-load.d/modules.conf in the worker node VM works around the issue.
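
The workaround can be sketched as a small idempotent helper, so that running it more than once does not add duplicate lines. The function name `ensure_module_listed` is hypothetical, and the sketch assumes the file either does not exist yet or ends with a newline.

```shell
#!/bin/sh
# Hypothetical helper: append a module name to a modules-load.d style file
# only if no line in the file already matches it exactly.
ensure_module_listed() {
    module="$1"
    conf="$2"
    # -q: quiet, -x: match the whole line, -F: treat the name as a fixed string
    if ! grep -qxF "$module" "$conf" 2>/dev/null; then
        printf '%s\n' "$module" >> "$conf"
    fi
}

# On the affected worker node, as root:
#   ensure_module_listed nf_conntrack /etc/modules-load.d/modules.conf
#   modprobe nf_conntrack   # loads the module now, without waiting for a reboot
```

The calls are left commented out so that the block has no side effects when sourced. The entry in modules.conf only takes effect at the next boot, which is why the sketch also mentions `modprobe` for the running system.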

Nodes are all running Ubuntu Server 22.04.4 and microk8s v1.28.7 from snap. They run as VMs in a Proxmox cluster.

What Should Happen Instead?

The node should return to Ready status after it is rebooted.

Reproduction Steps

  1. Start with a single node that has GPU enabled but no GPU hardware.
  2. Add a GPU node as a worker to the non-GPU node.
  3. Verify that all nodes are Ready and the cluster is functional.
  4. Reboot the GPU node and wait for it to return to Ready status.

Introspection Report

Please contact me directly if this is required.

Can you suggest a fix?

Ensure the nf_conntrack module is loaded on worker nodes as it is on full nodes.

Are you interested in contributing with a fix?

Not sure where to fix this issue properly.

@andrew-landsverk-win

We're running microk8s on Red Hat 9 and saw the same problem during our patching for this cycle. The suggested fix of adding nf_conntrack to modules.conf has also corrected the issue on our end. Is there a long-term fix coming for this issue?

Thanks!

@geocomm-jmeunier

We're running microk8s on Ubuntu 22.04 and saw this problem in different environments. The suggested fix of adding nf_conntrack to modules.conf has fixed our issue. We would appreciate a long-term fix.
