rabbitmq takes forever to start, fails, and still eats 100% CPU after started, if ulimit -n set to a high value #491

Closed
t-lo opened this issue Sep 9, 2022 · 14 comments · May be fixed by #492
Labels
feature request Issues that request new features to be added to OnlyOffice

Comments

@t-lo

t-lo commented Sep 9, 2022

Do you want to request a feature or report a bug?

This is a bug report which includes a workaround (see below). Motivation for filing this issue is to share this workaround (which has cost me quite a bit of debugging time) with other affected users.

What is the current behavior?

When ulimit -n inside the document server container is set to a high value (this depends on the Docker configuration), the container takes multiple minutes to start. It hangs at Starting RabbitMQ Messaging Server rabbitmq-server, which eventually fails, though the container continues to run. After that, a process (or thread?) erl_child_setup consumes 100% of a single CPU and keeps running forever. The document server container is not usable at this point (health endpoint returns 502) because rabbitmq never started successfully.

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem.

  1. Check that ulimit -n is set to a high value inside the onlyoffice container
    host $ docker run -ti --entrypoint /bin/bash onlyoffice/documentserver
    container $ ulimit -n
    1073741816
  2. Start the container
    host $ docker run onlyoffice/documentserver
  3. Run htop on the host (which also shows container-namespaced processes). It shows start-stop-daemon [...] redis-server stuck for multiple minutes, consuming 100% CPU; then erl_child_setup does the same.

What is the expected behavior?

  1. Container starts normally independent of ulimit -n setting inside of container
  2. If the start-up of any of the required components (rabbitmq, redis, documentserver, nginx, etc) fails, the container exits with an error.

Did this work in previous versions of DocumentServer?

Yes, but I'm unsure when it stopped working.

DocumentServer Docker tag:

  • dockerhub digest 8a1edcc13f9d
  • image ID 5a50e3a2d2ed

Host Operating System:

Fedora 36 w/ docker version 20.10.17, build 100c701

Workaround

Set ulimit for NOFILE to a lower value, either individually for the documentserver container or globally for all containers.

Individually: add (e.g.) --ulimit nofile=65536:65536 to the docker command line, or

   ...
   ulimits:
     nofile:
       soft: "65536"
       hard: "65536"
   ...

to your service configuration YAML for docker-compose.

Globally: Add --default-ulimit nofile=65536:65536 to the dockerd command line.
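
To check that the per-container override actually takes effect, something like the following could be used. It is only a sketch: check_nofile is a hypothetical helper name, and it simply reuses the image name and the --ulimit flag from above to print the limit as seen inside the container.

```shell
#!/bin/bash
# Hypothetical helper (not part of the image): start a throwaway container
# with the given NOFILE limit and print what `ulimit -n` reports inside it.
check_nofile() {
  docker run --rm --ulimit "nofile=$1:$1" \
    --entrypoint /bin/bash onlyoffice/documentserver -c 'ulimit -n'
}

# Example call, needs a working docker setup:
#   check_nofile 65536
```

If the override works, the printed value should match the number passed in rather than the daemon-wide default.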

@ShockwaveNN
Contributor

Hi, I understand that this issue was reported to share a workaround for a problem you ran into.

But shouldn't this also be reported to the RabbitMQ developers? It seems they have problems with large ulimit values, if I understand this correctly.

@t-lo
Author

t-lo commented Sep 11, 2022

Good point, but after some more digging I found this: docker-library/rabbitmq#545
Seems to be fixed upstream.

@ShockwaveNN
Contributor

@t-lo Thanks for finding it

We use Ubuntu as our base image, so I think it will take some time until Ubuntu ships a version with the fix.

@t-lo
Author

t-lo commented Sep 11, 2022

Might be worth considering setting ulimit -n 65536 explicitly in the container entry point (or the rabbitmq init script); this would work independently of an Ubuntu upstream fix (and at the end of the day do the same thing the Erlang code change in upstream rabbitmq does). The value could even be taken from an env variable so docker users can override it if necessary.
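
A rough sketch of what such an entry point fragment might look like. This is an illustration only, not the image's actual entry point: MAX_NOFILE is an assumed variable name and clamp_nofile is a hypothetical helper. It only ever lowers the limit, so a container that already runs with a smaller value is left alone.

```shell
#!/bin/bash
# Hypothetical entry point fragment. MAX_NOFILE is an assumed variable name,
# not one the documentserver image is known to read.

# Print the smaller of the current limit and the requested cap.
# "unlimited" is treated as larger than any numeric cap.
clamp_nofile() {
  local current="$1" cap="$2"
  if [ "$current" = "unlimited" ] || [ "$current" -gt "$cap" ]; then
    echo "$cap"
  else
    echo "$current"
  fi
}

cap="${MAX_NOFILE:-65536}"
target="$(clamp_nofile "$(ulimit -n)" "$cap")"

# Lowering the limit is always permitted; raising it again later is not,
# so this must run before the services are started.
ulimit -n "$target"
```

With this in place, a user could pass e.g. -e MAX_NOFILE=4096 to docker run to pick a different cap (again, assuming the entry point were changed to read that variable).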

@ShockwaveNN
Contributor

Thanks for this idea

I've created issue 58989 in our private issue tracker.

Not sure if we will implement that, but at least we will discuss it.

ShockwaveNN added the "feature request" label Sep 11, 2022

@t-lo
Author

t-lo commented Sep 12, 2022

I took a stab at this, see #492.
It's more challenging than I thought since ubuntu 20.04 ignores /etc/security/limits.[conf|d/] in favour of systemd service file settings (LimitNOFILE=...) but the documentserver uses start-stop-daemon to run, ignoring systemd's limits in turn.
So /etc/default/rabbitmq-server seems like the best place to set this.
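
For readers who want to apply this by hand, here is a minimal sketch. It assumes the rabbitmq-server init script sources /etc/default/rabbitmq-server as shell, so a plain ulimit line there takes effect before the server starts. set_rabbitmq_nofile is a hypothetical helper and is not necessarily what #492 does.

```shell
#!/bin/bash
# Hypothetical helper: append a NOFILE cap to a defaults file, once.
# Assumes the file is sourced as shell by the rabbitmq-server init script.
set_rabbitmq_nofile() {
  local file="$1" cap="$2"
  # Skip if an uncommented "ulimit -n" line is already present.
  if ! grep -qs '^ulimit -n ' "$file"; then
    echo "ulimit -n $cap" >> "$file"
  fi
}

# Example call (as root, e.g. in a Dockerfile RUN step):
#   set_rabbitmq_nofile /etc/default/rabbitmq-server 65536
```

The helper is idempotent, so running it on every container start would not pile up duplicate lines.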

@ShockwaveNN
Contributor

@t-lo OK, thanks. I'll notify our development team.

@igwyd
Member

igwyd commented Dec 2, 2022

Hello @t-lo, this is fixed in #530 and will be included in the next release.

@t-lo
Author

t-lo commented Dec 2, 2022

Thank you @igwyd ! What's the ETA of the next release?

(Also, I've updated PR #492 with a comment, feel free to close.)

@igwyd
Member

igwyd commented Dec 2, 2022

No release date yet.

@mkobel

mkobel commented Jun 28, 2023

As a workaround I defined ulimits in the docker-compose file:

    ulimits:
      nofile: 65536

@igwyd
Member

igwyd commented Jun 28, 2023

Hello @t-lo, as far as I can see the problem is solved, can we close it?

@Heath123

Heath123 commented Aug 10, 2023

I've realised that the OnlyOffice development server has been installed on my laptop for over a year, and it must have been taking up a whole CPU core in the background the whole time! No wonder my battery life has been bad and my fan has been loud... After getting rid of it, my idle CPU temperature dropped from 70°C to 40°C and my fan is far quieter, where before it would run constantly.

@Rita-Bubnova
Member

If it's resolved, I'll close the issue. Feel free to comment or reopen it if you have further questions.
