~30s of request queuing when promoting canary to production #768

Open
victorlin opened this issue Dec 18, 2023 · 2 comments
Labels
bug Something isn't working

Comments


victorlin commented Dec 18, 2023

Recently, I've noticed that promoting canary to production prevents nextstrain.org from loading for a short but noticeable amount of time.

With the latest promotion of 24ba9ee (nextstrain-server v894 → v895), I paid extra attention to this. Here is a breakdown of the time it took to load https://nextstrain.org on a web browser in two scenarios. The requests took ~30 seconds and were initiated about 10 seconds after the promotion completed successfully, meaning the total downtime was around 40 seconds:

[Screenshot: load times for the two scenarios]

The issue title says "local" downtime because I'm not sure if it's just my connection or if this can be observed by everyone.

victorlin added the bug label on Dec 18, 2023
victorlin commented:

The automated build of 24ba9ee on canary showed this warning (GitHub, Heroku), which may be related:

Warning: Your slug size (313 MB) exceeds our soft limit (300 MB) which may affect boot time.
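If slug size does turn out to affect boot time, one possible mitigation (a sketch only; the paths below are hypothetical and not taken from the nextstrain.org repository) is a .slugignore file at the repo root, which Heroku applies after the build to exclude matching files from the compiled slug:

```
docs/
test/
*.md
```

Because .slugignore is applied after the buildpack runs, it only helps with files that aren't needed at runtime, and it doesn't support negated patterns.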


tsibley commented Jan 10, 2024

I've noticed this and believe it's due to how Heroku's routing layer switches things over a bit early when cutting between the old dynos and new dynos. I wouldn't call it downtime, though. There's a short period of time when new requests will queue up waiting for the new dyno to be ready and take longer to get a response, but no requests should fail.

I haven't looked into minimizing that time; slug size might be implicated, or our code's own startup time. I also wonder if we could have Heroku's routing layer hold off on directing requests to the new dyno until an app-level health check passes (as opposed to the dyno-level check it seems to use now).
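On the app-level readiness idea: Heroku's router starts sending requests to a web dyno as soon as the process binds to $PORT, so one way to approximate an app-level check (a minimal sketch, not the nextstrain.org server's actual code; loadResources and /healthz are hypothetical names) is to finish startup work before calling listen(), and expose a lightweight endpoint that a deploy-time smoke test could poll:

```typescript
// Minimal sketch, assuming an Express-based Node.js web process.
import express from "express";

const app = express();

// Hypothetical startup work (warming caches, loading route tables, etc.).
async function loadResources(): Promise<void> {
  // ...expensive initialization here...
}

// Lightweight readiness endpoint that a smoke test could poll after deploy.
app.get("/healthz", (_req, res) => res.status(200).send("ok"));

async function main() {
  // Defer binding to $PORT until startup work is done, since Heroku's router
  // begins sending requests as soon as the port is bound.
  await loadResources();
  app.listen(Number(process.env.PORT ?? 3000), () => {
    console.log("ready to serve requests");
  });
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

Separately, Heroku's preboot feature (heroku features:enable preboot) keeps the old dynos serving while the new ones start, which might shrink the queuing window, though the switchover is still based on dyno readiness and a fixed delay rather than an app-level health check.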

tsibley changed the title from "~30s local downtime when promoting canary to production" to "~30s of request queuing when promoting canary to production" on Jan 10, 2024