As a Streetmix user, I want to visit the site without the app timing out
We keep seeing a noticeable share of users experience timeouts and other connection drops.
This is still a small percentage overall, but if the number of users grows, it's quite possible this percentage will trend in the wrong direction.
We've tried a couple of things so far:
Optimizing queries and adding indexes to make requests complete faster
Upgrading dynos and the database to improve the performance of the app
It was important to rule those things out, but they have not totally solved the problem.
What else is on the list of potential causes (in order of likelihood):
With H12 - Request Timeout errors, we generally see this pattern where one long-running action starts hogging the queue which in turn affects any subsequent requests.
Our router will drop a long-running request after 30 seconds, but the dyno behind it will continue processing the request until completion. Our router is unaware of it, though, so it'll dispatch new requests to that busy dyno. This effect tends to compound, and you'll eventually see H12 errors even for unrelated URLs, such as static assets. H13 errors are similar in what causes them, but are primarily related to concurrent web servers.
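To make the compounding effect concrete, here is a rough sketch of it as a toy simulation. It assumes a deliberately simplified model that is not taken from the issue: a single dyno that works through requests strictly one at a time, behind a router that gives up after 30 seconds. The function name `simulateQueue`, the URLs, and the timings are all hypothetical.

```javascript
// Toy model (an assumption, not how Heroku or Node actually schedule work):
// one dyno handles requests one at a time, and the router abandons any
// request that has not completed within ROUTER_TIMEOUT_MS.
const ROUTER_TIMEOUT_MS = 30000;

// requests: [{ url, arrivalMs, workMs }], sorted by arrivalMs.
function simulateQueue(requests, routerTimeoutMs = ROUTER_TIMEOUT_MS) {
  let dynoFreeAtMs = 0;
  return requests.map(({ url, arrivalMs, workMs }) => {
    // A request cannot start until the dyno has finished everything before it.
    const startMs = Math.max(arrivalMs, dynoFreeAtMs);
    const finishMs = startMs + workMs;
    // The dyno keeps working to completion even if the router already gave up.
    dynoFreeAtMs = finishMs;
    const latencyMs = finishMs - arrivalMs;
    return { url, latencyMs, timedOut: latencyMs > routerTimeoutMs };
  });
}

// Hypothetical traffic: one 70-second action followed by two cheap asset requests.
const results = simulateQueue([
  { url: '/slow-report', arrivalMs: 0, workMs: 70000 },
  { url: '/app.css', arrivalMs: 1000, workMs: 10 },
  { url: '/app.js', arrivalMs: 50000, workMs: 10 },
]);
console.log(results);
```

In this model the 10 ms request for `/app.css` still times out, because it arrives while the dyno is stuck on the slow action. That is the "H12 errors even for unrelated URLs" pattern described above.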
If your app is using ExpressJS, you will also want to install something like timeout, which will ensure that a long-running request is dropped at the dyno level as well. Specifically, timeout raises a Response timeout exception when that happens.
With that in place, the compound effect is less likely to occur, but long-running actions still need to be addressed.
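As a rough idea of what dropping a request at the dyno level could look like, here is a minimal hand-rolled sketch using only the standard Express `(req, res, next)` middleware signature. It is not the timeout package's implementation. The name `requestTimeout`, the `req.timedOut` flag, and the 503 status are assumptions made for illustration.

```javascript
// Sketch of a timeout middleware with the Express (req, res, next) signature.
// If no response has been sent within limitMs, it passes an error to next()
// so that an Express error handler can reply instead of leaving the request hanging.
function requestTimeout(limitMs) {
  return function (req, res, next) {
    const timer = setTimeout(() => {
      if (res.headersSent) return; // a response is already on its way
      req.timedOut = true; // hypothetical flag that later handlers can check
      const err = new Error('Response timeout');
      err.status = 503;
      next(err);
    }, limitMs);

    // Cancel the timer once the response finishes or the connection closes.
    const clear = () => clearTimeout(timer);
    res.on('finish', clear);
    res.on('close', clear);

    next();
  };
}

// Possible usage, assuming an existing Express `app`:
//   app.use(requestTimeout(25000)); // stay under the router's 30-second limit
```

Note that this only stops the client from waiting. Any work the handler already started (a slow query, for example) keeps running unless the handler checks something like `req.timedOut` and stops, which is why the long-running actions themselves still need attention.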
After more research, I'm feeling more convinced that the first one on this list is a likely culprit, mainly because our total number of users and the complexity of their requests are still pretty low in the grand scheme of things.
Replicating the issue is hard without simulating a bunch of users. We could do something with https://locust.io/ on staging to try to replicate and test the issue without affecting production.
Implementing timeout middleware:
expressjs/express#3330 (may be quick and possible to do with Express alone)
http://expressjs.com/en/resources/middleware/timeout.html (middleware example)