
DockerRunner performance improvement #142

Closed
das7pad opened this issue Nov 28, 2019 · 3 comments

Comments

@das7pad
Member

das7pad commented Nov 28, 2019

In the following, substitute project with project x user for jailed compiles (see disablePerUserCompiles in the web config).

The clsi app creates a new container for each unique combination of command and project. A command can be a compile, word count, sync-from-pdf, or sync-from-code request.

The first two command types scale OK, as there is only one container per project for the word count and one container per project and doc for the compile request. But the sync commands create a container for each unique lookup - unique per line x column x project (x doc for sync from code).

For the following examples I am using a VPS with dedicated threads of a fairly new server CPU (latest Xeon E5 clocked at 2.4 GHz). The VPS is running CentOS 7 with Docker 19.03.5 (build 633a0ea), containerd 1.2.10 (b34a5c8af56e510852c35414db4c1f4fa6172339) and the overlay2 storage driver on xfs. The source of the Docker image is hosted on GitHub.

Creating and starting a new container has a time penalty of about 1000ms.
# time docker run sharelatex/texlive:2017.1-full true

real	0m0.982s
Starting an existing container has a time penalty of about 600ms.
# docker run --name project-xxx sharelatex/texlive:2017.1-full true
# time docker start project-xxx
project-xxx

real	0m0.612s

Another method to run commands in containers is to split the creation of the container and the command execution into two stages.
The first stage creates an idle container [1], and the second stage uses this idle container as a jailed environment to run commands.

Creating an idle container has a time penalty of about 700ms.
# time docker run -d sharelatex/texlive:2017.1-full sleep 180000
18b229d06cb0d1484b71874bd27c6f5a09a70710b411eadba38d0bf2f8546fbf

real	0m0.645s
Running a command in the idle container has a time penalty of about 300ms.
# time docker exec 076c085fabe08d7a6c6adc5ef3a717135b3ead71af8e9e2ba401ec565a3eb87d true

real	0m0.305s

The second method still has a delay of about 1000ms for the first command per project. But every following command runs at a much faster pace: recompile drops from 600ms down to 300ms, sync from pdf/code from 1000ms down to 300ms.

Resource usage of idle containers:

metric     usage     comment
RAM        5.25 MB   200 containers consume about 1050 MB
CPU time   650 ms    average of the sum of all processes after an hour
PIDs       11        10 PIDs for containerd + 1 PID for cat/sleep
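
For reference, a rough way to cross-check these numbers on a given host is docker stats and docker top (project-xxx is a placeholder name; docker stats counts only the PIDs inside the container's cgroup, while the containerd-shim threads show up in a host-side ps):
# docker stats --no-stream --format 'table {{.Name}}\t{{.MemUsage}}\t{{.PIDs}}' project-xxx
# docker top project-xxx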

I am aware of your request distribution/scaling setup: a load agent, haproxy plus a cookie, with the cookie stored in redis per project. For the proposed method the load agent may have to take memory (and PID [2]) usage into account as well. It really depends on the resource capacity of your worker nodes.

What do you think about this proposal?

cc @briangough @emcsween @henryoswald @mans0954 @mmazour @ShaneKilkelly


[1]: Using sleep X with a high X - a low timeout is prone to race conditions: once the container is started, there is no way to extend the delay in order to preserve the container for another command. X = (DockerRunner.MAX_CONTAINER_AGE = 60*60) * 50 = 180000 should give the cleanup task enough tries to cycle the container before it times out. Another option is to use cat with a fake stdin/tty for the container, but this uses more resources - a jailed stdin file handle and polling by cat. Both variants are sketched below.
[2]: The PID limit can be bumped up to 4 million via echo X > /proc/sys/kernel/pid_max .
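
For illustration, the two idle-container variants from [1] can be started as follows (image and names as above; -i keeps stdin open so that cat blocks instead of exiting immediately):
# docker run -d --name project-xxx-sleep sharelatex/texlive:2017.1-full sleep 180000
# docker run -d -i --name project-xxx-cat sharelatex/texlive:2017.1-full cat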

@JuneKelly

Hi! Thanks for doing the investigation on this, it's nice to see some concrete numbers!

We've had a quick discussion with the team. The reason we didn't go for this pattern is that it would have its own resource trade-offs (keeping containers alive in an idle loop even while they're not necessarily active).

At the moment we have logic to re-use an existing container if it's present on the system, so the initial cold-start cost should only be paid rarely, and most actions should be relatively fast (as you've discovered).

Another thing that comes to mind, which would need to be investigated, is the kind of locking and synchronization logic that might need to be introduced to prevent overlapping commands from clobbering each other in the one container.

I think we'll park this for the time being and keep it in mind for optimizing our production load in the future. In the meantime, do you have a sense of how this would change the performance characteristics on your own system under normal workloads?

@das7pad
Member Author

das7pad commented Dec 2, 2019

Hello @ShaneKilkelly,

thank you for bringing this up with the team!

There is already logic in place to prevent two compile requests from running in parallel on the same project/project+user directory 1. The other commands do not create new files and should be safe to run without any additional locking.
What's left is the creation and usage of containers, which need to be guarded against the container garbage collector. Note that #141 is important in this context. A parallel creation can be detected via a 409 status code 2, and parallel calls to container.start result in a 304 3 - there is no need for a lock here.
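
As a sanity check, both status codes can be observed directly against the Docker API socket. The sketch below assumes the idle container project-xxx from above already exists and is running, and that API version v1.40 (Docker 19.03) is in use; the expected codes are shown after each call:
# curl -s -o /dev/null -w '%{http_code}\n' --unix-socket /var/run/docker.sock \
    -H 'Content-Type: application/json' \
    -d '{"Image": "sharelatex/texlive:2017.1-full", "Cmd": ["sleep", "180000"]}' \
    -X POST 'http://localhost/v1.40/containers/create?name=project-xxx'
409
# curl -s -o /dev/null -w '%{http_code}\n' --unix-socket /var/run/docker.sock \
    -X POST http://localhost/v1.40/containers/project-xxx/start
304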

In order to support switching the texlive image and broader container option changes, I would suggest keeping the suffix in the container name: a hash of the container options, but now without the command.
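
Purely as an illustration (the actual options object lives in the DockerRunner code, and the JSON below is made up), the suffix could be a short hash of the serialized options:
# OPTIONS='{"Image":"sharelatex/texlive:2017.1-full","HostConfig":{"Memory":1073741824}}'
# echo "project-xxx-$(printf '%s' "$OPTIONS" | sha1sum | cut -c1-8)"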

The command timeout and kill logic both need some tweaking to support the reuse of containers.
The timeout is currently implemented in JS with a setTimeout handler and a container.kill call. We could use timeout(1) from coreutils 4 instead. It signals an expired timeout with an exit code of 124 and forwards the exit code on the happy path.
The kill logic is used only by the compile request? We could keep track of the Docker exec instance ID 5, which can be used to query the PID of the spawned process 6.
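
A rough sketch of both ideas, with sleep 60 standing in for the compile command and jq assumed to be available: timeout(1) reports exit code 124 when the limit is hit and docker exec passes that code through, and the PID of an exec instance can be read back from the exec-inspect endpoint.
# docker exec project-xxx timeout 10 sleep 60; echo $?
124
# EXEC_ID=$(curl -s --unix-socket /var/run/docker.sock \
    -H 'Content-Type: application/json' -d '{"Cmd": ["sleep", "60"]}' \
    -X POST http://localhost/v1.40/containers/project-xxx/exec | jq -r .Id)
# curl -s --unix-socket /var/run/docker.sock \
    -H 'Content-Type: application/json' -d '{"Detach": true}' \
    -X POST http://localhost/v1.40/exec/$EXEC_ID/start
# curl -s --unix-socket /var/run/docker.sock http://localhost/v1.40/exec/$EXEC_ID/json | jq .Pid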

Now that I have finished my studies - yay - my sharelatex setup is only under synthetic load, serving as a distributed system to experiment and work with. I will probably see no performance impact from this, but I will observe slightly better response times on paper 😉 and eventually notice it when working on the frontend/editor.

@JuneKelly JuneKelly removed their assignment Feb 13, 2020
@das7pad
Member Author

das7pad commented Aug 6, 2021

Hi!

Thank you for taking the time to write up this issue.

We are in the process of migrating to a monorepo at https://github.com/overleaf/overleaf and will mark this repository read-only soon.
You can read more about the monorepo migration at overleaf/overleaf#923.

We are going to close this issue now to avoid any confusion about the inability to comment further.

If you believe this issue still needs addressing, please create a new issue at https://github.com/overleaf/overleaf.

Thanks again!

@das7pad das7pad closed this as completed Aug 6, 2021