Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Official pursuit server performance is not good #482

Open
flip111 opened this issue Dec 2, 2023 · 5 comments
Open

Official pursuit server performance is not good #482

flip111 opened this issue Dec 2, 2023 · 5 comments

Comments

@flip111
Copy link

flip111 commented Dec 2, 2023

Problem

Since a few weeks, maybe even months i'm see slow results and frequent 502 Bad Gateway not available. I asked on discord and other people also have problems. I think nobody made an issue before because if you just wait a bit and try again then often is ok. Though this issue could be related #480

Findings

I downloaded the source code and tested it under load. What i see is this

 .stack-work/dist/x86_64-linux/ghc-9.2.5/build/pursuit/pursuit +RTS -p                                                                                                                                                                                                   [0:22:11]
[Warn] No GitHub auth token configured (environment variable is: PURSUIT_GITHUB_AUTH_TOKEN)
[Warn] Requests to the GitHub API will be performed with no authentication.
[Warn] This will probably result in rate limiting.
Starting in development mode
PURSUIT_CLIENT_SESSION_KEY=Ymb0FVgpNlybkO5MxbHL4DnbZkZ2zJrDHxjcS3Sn4TlquQJv23rPGr5oxNqoVzhDWsw7hZ3YERy083YCXDokhZrguj1VPY2A51UB+0/Biy89PJViBDsxiqjbrKc5G+LB
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
GET /
  Accept:
  Status: 500 Internal Server Error 0.058559736s
GET /
  Accept:
  Status: 500 Internal Server Error 0.006288454s
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
GET /
  Accept:
  Status: 500 Internal Server Error 0.006520935s
03/Dec/2023:00:23:37 +0100 [Error#yesod-core] ./data/cache//index.html: withBinaryFile: resource busy (file is locked) @(yesod-core-1.6.24.0-9BHxG0hJFr59B2j5jcLrfq:Yesod.Core.Class.Yesod src/Yesod/Core/Class/Yesod.hs:705:6)
<...snip....>

Investigation & thoughts

This is the only place in the source code i found "index.html" https://github.com/purescript/pursuit/blob/master/src/Handler/Caching.hs#L65-L69

when i look where the functions cacheHtml and cacheHtmlConditional are used i see function names like

  • getHelpR
  • getHelpAuthorsR
  • getHelpUsersR
  • getHomeR
  • getPackageVersionR
  • getPackageVersionModuleDocsR

Which i think are all different pages on pursuit.

I don't understand this caching mechanism. Why would you cache to the same file for different pages?

When i let pursuit run in single threaded mode with +RTS -N1 i still get contention over the index.html but far less. I guess yesod still has some form of concurrency going on as time division multiplexing several requests on a single thread.

Performance on server

The official pursuit server is also struggling a lot (though it could be underspecced i don't know) @JordanMartinez perhaps you could have a look over a longer time window than 7 days?
cpu
memory

Reproduction

To reproduce load testing and profile deeper in time spend per function and memory allocated per function i used the following.

wrk2.dockerfile

FROM alpine
WORKDIR /wrk2
RUN apk add --no-cache git openssh make gcc musl-dev libressl-dev zlib-dev && \
    git clone https://github.com/giltene/wrk2.git /wrk2 && \
    make -j$(nproc)
ENTRYPOINT ["./wrk"]

profile_memory.sh

#!/usr/bin/env bash

TIME=60

docker build -t wrk2 --file wrk2.dockerfile .

stack build --profile
# stack exec pursuit --RTS -- +RTS -p & # didn't work
.stack-work/dist/x86_64-linux/ghc-9.2.5/build/pursuit/pursuit +RTS -p -N1 &
RETVAL=$?
PID=$!
docker run --rm --name wrk2 --network="host" wrk2 --rate 10 --duration ${TIME}s --threads $(nproc) --connections 400 --latency  http://localhost:3000
kill -INT $PID
sleep 1
profiteur pursuit.prof
xdg-open pursuit.prof.html

For the last two lines you need to install this tool https://github.com/jaspervdj/profiteur (which i recommend). Otherwise remove the the last two lines. When you open the html double click with your mouse on the big color area a lot of times to drill down quickly.


@hdgarrood do you have any idea what is happening here?

@JordanMartinez
Copy link
Contributor

Just FYI. Pursuit-related issues are not my current focus right now. So, I won't be getting to this anytime soon. Also, hdgarrood is no longer on the core team. Please don't ping him about something for which he no longer has responsibility.

@flip111
Copy link
Author

flip111 commented Dec 25, 2023

Anyone else with access to the server who could look into this?

@JordanMartinez
Copy link
Contributor

I think me and Thomas do, and we're both focused on other things right now.

@thomashoneyman
Copy link
Member

Yes, I'm not able to focus on Pursuit right now and am not experienced with the code base (I mostly just deploy updates for compilers).

Since you did your investigation we doubled the size of the server and its performance has improved significantly. That said, it's a bandaid for the problems in the code you've identified and it's not an acceptable long term solution.

I'm happy to help shepherd fixes to Pursuit along if you are able to identify cache issues or other performance issues in the code. We can get that code deployed.

@flip111
Copy link
Author

flip111 commented Dec 29, 2023

Let's see how the new server is doing, perhaps it's enough for the moment. In the mid-near future we want to move to a new purescript codebase anyway

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants