
"[ERROR] Unable to start all processes" #140

Open
siers opened this issue Nov 25, 2019 · 16 comments
Open

"[ERROR] Unable to start all processes" #140

siers opened this issue Nov 25, 2019 · 16 comments
Labels
bug Something isn't working

Comments

siers commented Nov 25, 2019

% sudo ./target/release/kubernix --nodes=2
[sudo] password for s:
[ERROR] Unable to start all processes
[⠒  8s] █████████████████░░░░░░░░ 20/28 Controller Manager is ready
[   0s] █████████████████████████  9/9 Cleanup done

This happens on NixOS 19.09.

Built from the latest v0.2.0 release.

issue-label-bot added the bug label on Nov 25, 2019

saschagrunert (Owner) commented Nov 25, 2019

Okay thanks for the bug report, I’ll try to reproduce in the next couple of days.

siers commented Nov 25, 2019

I don't know whether you can call it a "big" report, but I can provide more details if necessary.

saschagrunert (Owner) commented:
It was a typo and should be a bug report. Sure, I’ll check first if the bug is something obvious...

siers commented Nov 26, 2019

Running it with podman brings me an [ERROR] Unable to wait for coredns pod. Not sure if I should create a new issue for this or not.

saschagrunert (Owner) commented:
> Running it with podman brings me an [ERROR] Unable to wait for coredns pod. Not sure if I should create a new issue for this or not.

Can you try to run sudo iptables -F and bootstrap the cluster again?
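(In concrete terms this amounts to flushing all iptables rules and then re-running the bootstrap; a minimal sketch, reusing the invocation from the original report:)

# flush all iptables rules, then bootstrap the cluster again
sudo iptables -F
sudo ./target/release/kubernix --nodes=2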

siers commented Nov 27, 2019

iptables -F helped with #144, but not with this issue.

saschagrunert (Owner) commented:
Hm, can you test the latest master and see if the problem still happens?

siers commented Dec 2, 2019

while on 16e7aae:

[DEBUG] Found pattern 'etcd ok' in line '[+]etcd ok'
[DEBUG] Creating API Server RBAC rule for kubelet
[DEBUG] API Server RBAC rule created
[DEBUG] No previous run file '/home/s/code/machines/2019-11-22-kubernetes/kubernix/kubernix-run/controllermanager/run.yml' found, writing new one
[DEBUG] No previous run file '/home/s/code/machines/2019-11-22-kubernetes/kubernix/kubernix-run/scheduler/run.yml' found, writing new one
[DEBUG] Waiting for process 'Controller Manager' (kube-controller-manager) to become ready with pattern: 'Serving securely'
[DEBUG] Waiting for process 'Scheduler' (kube-scheduler) to become ready with pattern: 'Serving securely'
[DEBUG] Found pattern 'Serving securely' in line 'I1202 15:10:34.335029   10065 secure_serving.go:123] Serving securely on [::]:10259'
[DEBUG] Found pattern 'Serving securely' in line 'I1202 15:10:34.868221   10063 secure_serving.go:123] Serving securely on [::]:10257'
[ERROR] Kubelet node-0 (podman) died unexpectedly
[DEBUG] Kubelet node-0 (podman) exit code: 255
[ERROR] Kubelet node-1 (podman) died unexpectedly
[DEBUG] Kubelet node-1 (podman) exit code: 255
[DEBUG] No previous run file '/home/s/code/machines/2019-11-22-kubernetes/kubernix/kubernix-run/proxy/run.yml' found, writing new one
[DEBUG] Waiting for process 'Proxy' (kube-proxy) to become ready with pattern: 'Caches are synced'
[ERROR] Proxy (kube-proxy) died unexpectedly
[DEBUG] Proxy (kube-proxy) exit code: 1
[DEBUG] kube-proxy (Proxy) died
[DEBUG] podman (Kubelet node-0) died
[DEBUG] podman (Kubelet node-1) died
[ERROR] Unable to start all processes
[⠦  3m] ███████████████████░░░░░░ 22/28 Starting Proxy

full log.txt

saschagrunert (Owner) commented:
Can you provide me with the kube-proxy logs, please? :)

siers commented Dec 2, 2019

How do I do that? :)

saschagrunert (Owner) commented:
> How do I do that? :)

The log should be located in kubernix-run/proxy/kube-proxy.log. 🙃
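(A minimal sketch for pulling it out, assuming kubernix-run sits in the directory the bootstrap was started from:)

# print the last lines of the kube-proxy log from the run directory
tail -n 50 kubernix-run/proxy/kube-proxy.log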

siers commented Dec 2, 2019

W1202 15:10:47.581977   10283 proxier.go:584] Failed to read file /lib/modules/4.19.84/modules.builtin with error open /lib/modules/4.19.84/modules.builtin: no such file or directory. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
E1202 15:10:47.606142   10283 node.go:124] Failed to retrieve node info: nodes "node-0" not found
E1202 15:10:48.762910   10283 node.go:124] Failed to retrieve node info: nodes "node-0" not found
E1202 15:10:51.088002   10283 node.go:124] Failed to retrieve node info: nodes "node-0" not found
E1202 15:10:55.344317   10283 node.go:124] Failed to retrieve node info: nodes "node-0" not found
E1202 15:11:03.570609   10283 node.go:124] Failed to retrieve node info: nodes "node-0" not found
F1202 15:11:03.570641   10283 server.go:443] unable to get node IP for hostname node-0

saschagrunert (Owner) commented:
Ah, the issue might be related to the fact that we're trying to edit the hosts file, which is read-only on NixOS 🤔

saschagrunert (Owner) commented:
As a workaround, can you point node-0 through node-X to localhost via /etc/hosts? Then the bootstrap should work...

siers commented Dec 2, 2019

Still doesn't work; the kube-proxy logs say the same thing.

% cat /etc/hosts
127.0.0.1 localhost 
::1 localhost
127.0.0.1 self node-0 node-1 node-2 node-3 node-4 node-5 node-6 node-7 node-8 node-9

(though I really had only two nodes)
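(Not something tried in the thread, but a quick way to sanity-check that those entries are actually visible to name resolution; getent hosts goes through NSS and therefore reads /etc/hosts:)

# confirm the node names resolve to 127.0.0.1 via /etc/hosts
getent hosts node-0
getent hosts node-1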
