[Hono] dispatch-router Pod failing to start HTTP server #312

Open
jlengelsen opened this issue Oct 15, 2021 · 6 comments
@jlengelsen

The dispatch-router Pod fails to start its HTTP server when I install either the cloud2edge or the hono Helm package into a Kubernetes cluster provisioned with kind on my machine (Fedora 34). The Pod appears to be trying to allocate 1073741816 file descriptors, which is exactly the ulimit of the host OS (the default on Fedora).

These are the pod logs related to the issue:

HTTP (error) OOM allocating 1073741816 fds
HTTP (error) ZERO RANDOM FD
SERVER (critical) No memory starting HTTP server
...
SERVER (error) No HTTP support to listen on 0.0.0.0:8088
...

Any ideas how I can fix that?
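
For anyone who wants to check their own host, here is a minimal Python sketch that prints the open-files limit of the current process and estimates how much memory a per-descriptor table of that size would need. The 8 bytes per entry is an assumption (one pointer per possible descriptor on a 64-bit build) and is not taken from the router or libwebsockets sources; the helper names are made up for illustration.

```python
import resource

# Assumption: one 8-byte pointer per possible descriptor (64-bit build).
ASSUMED_BYTES_PER_FD = 8


def current_nofile_limit():
    """Return the (soft, hard) open-files limit of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def estimate_table_bytes(max_fds, bytes_per_fd=ASSUMED_BYTES_PER_FD):
    """Estimate the size of a table holding one entry per descriptor."""
    return max_fds * bytes_per_fd


if __name__ == "__main__":
    soft, hard = current_nofile_limit()
    print(f"soft limit: {soft}, hard limit: {hard}")

    # Value reported in the pod log above.
    reported_fds = 1073741816
    gib = estimate_table_bytes(reported_fds) / 2**30
    print(f"table for {reported_fds} fds: about {gib:.1f} GiB")
```

Under that assumption the table alone would need roughly 8 GiB, which would fit the "OOM allocating 1073741816 fds" message.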

@calohmn calohmn added the Hono label Oct 19, 2021
@calohmn
Contributor

calohmn commented Oct 19, 2021

This seems related to DISPATCH-1897 and warmcat/libwebsockets#2449. There, the recommendation is to adapt the ulimit value.
In libwebsockets v4.2.0 and newer there seems to be a fix, but even the newest dispatch router image (quay.io/interconnectedcloud/qdrouterd:1.17.0) is still using an older version (3.2.1-1.fc30).

@jlengelsen
Author

> This seems related to DISPATCH-1897 and warmcat/libwebsockets#2449. There, the recommendation is to adapt the ulimit value. In libwebsockets v4.2.0 and newer there seems to be a fix, but even the newest dispatch router image (quay.io/interconnectedcloud/qdrouterd:1.17.0) is still using an older version (3.2.1-1.fc30).

Confirmed. I have built the qdrouterd image locally with libwebsockets v4.2.2 and the error was gone. Thanks for the hint.

@sophokles73
Member

@jlengelsen can this issue be closed?

@jlengelsen
Author

> @jlengelsen can this issue be closed?

Well, the issue is not solved yet. Installing either the cloud2edge or the hono Helm package into a cluster where no sane ulimits are set for containers still fails. To solve the issue, the qdrouterd image used in the packages has to be upgraded to libwebsockets v4.2.0 or newer, which has not happened yet.

@sophokles73
Member

It seems that it is neither in the hands of the Hono project nor of the IoT Packages project to resolve the issue. There even seem to be different opinions as to whether this is actually a bug/problem or works as designed. In any case, until the Qpid project decides to use libwebsockets >= 4.2.0, it looks like setting ulimits as advised in warmcat/libwebsockets#1769 is a reasonable workaround, isn't it?

@jlengelsen
Author

Agreed, it is the project maintainers' decision whether this issue should be tracked here. Setting ulimits works, but since there is not yet an easy way to set ulimits for containers running in a k8s cluster (kubernetes/kubernetes#3595), it is annoying to do...
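
To illustrate what "setting ulimits" means at the process level, here is a minimal Python sketch that lowers the soft open-files limit for a child process before it starts. This is a generic wrapper technique, not something the Hono or qdrouterd packages provide, and whether it can be applied to the router container is an assumption. The function name and the target value of 1024 are hypothetical.

```python
import resource
import subprocess
import sys


def run_with_nofile_limit(cmd, target_soft=1024):
    """Run cmd with its soft open-files limit capped at target_soft.

    Returns the child's stdout as text. target_soft is a hypothetical
    value chosen for illustration only.
    """

    def lower_limit():
        # Runs in the child just before exec, so the parent is unaffected.
        soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        candidates = [target_soft]
        candidates += [v for v in (soft, hard) if v != resource.RLIM_INFINITY]
        # Never raise the limit; only the soft value is changed.
        resource.setrlimit(resource.RLIMIT_NOFILE, (min(candidates), hard))

    result = subprocess.run(
        cmd,
        preexec_fn=lower_limit,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # The child reports the soft limit it actually sees.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
    print(run_with_nofile_limit([sys.executable, "-c", probe]).strip())
```

Only the soft limit is touched, so the child could still raise it again up to the hard limit if it wanted to. In a cluster the same effect would have to come from the container runtime configuration, which is exactly the part that is awkward today.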
