Limit request queue to fail fast #50

alpe · 2024-01-09T11:01:17Z

Incoming requests are queued in memory until capacity on a serving backend becomes available. This can be critical under peak load or in DoS scenarios. Instead of leaving the queue unbounded, we should fail fast and reject new requests with StatusServiceUnavailable (503).
The total queue limit could be dynamic and/or a fixed value (due to memory limitations).
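
To make the fail-fast idea concrete, here is a minimal sketch of how it could look as plain `net/http` middleware. It is not Lingo's actual code: `queueLimiter`, `newQueueLimiter`, `tryAcquire`, `release`, and `wrap` are hypothetical names, and the sketch assumes one slot is held for the whole lifetime of a request (queued plus in flight).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// queueLimiter caps how many requests may be queued or in flight at once.
// A buffered channel acts as a counting semaphore. (Hypothetical type.)
type queueLimiter struct {
	slots chan struct{}
}

// newQueueLimiter returns a limiter with room for max requests.
func newQueueLimiter(max int) *queueLimiter {
	return &queueLimiter{slots: make(chan struct{}, max)}
}

// tryAcquire reserves a slot without blocking; it reports false when full.
func (l *queueLimiter) tryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot taken by a successful tryAcquire.
func (l *queueLimiter) release() { <-l.slots }

// wrap rejects requests with 503 instead of queueing them when no slot is free.
func (l *queueLimiter) wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !l.tryAcquire() {
			http.Error(w, "request queue is full", http.StatusServiceUnavailable)
			return
		}
		defer l.release()
		next.ServeHTTP(w, r)
	})
}

func main() {
	limiter := newQueueLimiter(1)
	handler := limiter.wrap(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))

	// Occupy the only slot to simulate a full queue, then send a request.
	limiter.tryAcquire()
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	fmt.Println(rec.Code) // 503
}
```

The non-blocking `select` is what makes this fail fast: a full queue yields an immediate 503 rather than another goroutine parked in memory.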

For dynamic calculations: factor * total_number_of_replicas * concurrent_requests_per_replica. The factor should be chosen relative to the time required to scale up instances. I think I saw 10x somewhere in a similar project, but I cannot find the number now. It would be a good starting parameter to customize for different environments.
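
As a rough illustration of the formula above, the calculation might look like the following. The function name `queueLimit` and the `hardMax` cap are assumptions for this sketch. Treating zero replicas as one is also an assumption, made so that requests can still queue while the first replica starts.

```go
package main

import "fmt"

// queueLimit computes factor * replicas * perReplica.
// If hardMax > 0 it acts as a fixed upper bound (e.g. a memory-derived cap);
// hardMax <= 0 means no fixed cap. (Hypothetical helper.)
func queueLimit(factor, replicas, perReplica, hardMax int) int {
	if replicas < 1 {
		// Assumption: allow queueing while scaling up from zero replicas.
		replicas = 1
	}
	limit := factor * replicas * perReplica
	if hardMax > 0 && limit > hardMax {
		return hardMax
	}
	return limit
}

func main() {
	// Example numbers only: factor 10, 3 replicas, 100 concurrent requests each.
	fmt.Println(queueLimit(10, 3, 100, 0))    // 3000
	fmt.Println(queueLimit(10, 3, 100, 2000)) // 2000
}
```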


samos123 · 2024-01-13T06:59:55Z

I guess if you have large requests and provide Lingo as a public service, this would be a real concern. Let's assume each Lingo instance can have at most 60k open connections and each request is 1 MB; then you would need 60 GB of memory to hold those requests.
Someone who runs a large public Lingo instance might have other DDoS protections in place on top of Lingo and in that case wouldn't need this feature (e.g. an API gateway or other software that includes such protection).
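
The back-of-the-envelope numbers above can be checked, and inverted to derive a fixed queue limit from a memory budget, with a small sketch like this one. The helper names are hypothetical, 1 MB is taken as 1,000,000 bytes, and the 8 GB budget in the second call is only an example value.

```go
package main

import "fmt"

// queueMemoryBytes estimates memory needed to hold n queued requests
// of avgRequestBytes each. (Hypothetical helper.)
func queueMemoryBytes(n, avgRequestBytes int64) int64 {
	return n * avgRequestBytes
}

// maxQueuedRequests returns how many requests of avgRequestBytes fit
// into budgetBytes; it returns 0 if avgRequestBytes is not positive.
func maxQueuedRequests(budgetBytes, avgRequestBytes int64) int64 {
	if avgRequestBytes <= 0 {
		return 0
	}
	return budgetBytes / avgRequestBytes
}

func main() {
	const mb = 1_000_000 // decimal megabyte, an assumption for this estimate

	// 60k connections * 1 MB per request.
	fmt.Println(queueMemoryBytes(60_000, mb)) // 60000000000 bytes = 60 GB

	// Example budget of 8 GB with 1 MB requests.
	fmt.Println(maxQueuedRequests(8_000*mb, mb)) // 8000
}
```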

My vote would be to postpone this until we have a user who runs Lingo on a public endpoint. I am not against including this though. @nstogner your thoughts?

If you are implementing this, I would want a default of unlimited or a number so large that a user with plenty of memory and no malicious actors (e.g. internal Lingo) wouldn't encounter an error.
