Use bounded thread pool #221
Related: #87
EDIT: Removed suggestion to use a particular threadpool library.
OTOH, too-narrow bounds on a thread pool open the door to trivial slow-loris DoS. It's a tricky field to navigate.
@rawler Your point about DoS is important. Handling DoS attacks is extremely complicated. It requires sophisticated monitoring systems, real-time configuration management, trained technicians on call, and a lot of well-tested code in the HTTP endpoint. That code does not belong in the tiny-http server, because then it wouldn't be tiny.

I'm relying on a managed load balancer service to handle DoS attacks. I need tiny-http to handle moderate load spikes without falling over.

There are two ways to handle request overload: load shedding and back-pressure. Load shedding is suitable only for clients that automatically retry. Some of my clients are web browsers, which do not retry. That's why I need tiny-http to apply back-pressure on the load balancer. A process applies back-pressure by reading requests more slowly when the load is too high. The simplest way to accomplish that is to make the dispatcher wait for an available thread.

One could argue that I should just configure the load balancer to limit the number of concurrent forwarded requests. Most managed load balancer services do not support that. Even when that option is available, it is less desirable than back-pressure because the operator must manually change the load balancer config every time the backend's capabilities change. Manual steps allow for mistakes. With tiny-http's current behavior, a bad config means the server will fall over at the first load spike. Automating the configuration change is possible but expensive. With back-pressure, the server automatically informs the load balancer of its capabilities, in real time. This simplifies operations.
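The "dispatcher waits for an available thread" mechanism described above can be sketched with a bounded channel from the Rust standard library. This is a minimal illustration of the idea, not tiny-http's actual code; the pool size, job type, and function names are invented for the example.

```rust
// Sketch: a dispatcher that blocks when all workers are busy.
// `sync_channel(N)` gives a bounded queue: once N jobs are queued, `send`
// blocks -- that pause is the back-pressure the accept loop (and,
// transitively, the load balancer) observes.
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

fn run_dispatch(workers: usize, requests: u32) -> u32 {
    let (tx, rx) = sync_channel::<u32>(workers);
    let rx = Arc::new(Mutex::new(rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut handled = 0u32;
                loop {
                    // Take the next request; exit when the dispatcher hangs up.
                    let msg = rx.lock().unwrap().recv();
                    match msg {
                        Ok(_req) => handled += 1, // handle the request here
                        Err(_) => break,
                    }
                }
                handled
            })
        })
        .collect();

    for req in 0..requests {
        tx.send(req).unwrap(); // blocks while the bounded queue is full
    }
    drop(tx); // close the channel so workers drain and exit

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("handled {} requests", run_dispatch(4, 100));
}
```

Because `send` blocks rather than queueing without limit, a saturated server simply stops calling `accept`/`read` as fast, which is exactly the signal an upstream load balancer can react to.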
Another point, though: we've also been bitten by #220, when forgetting to raise the file-handle and thread-count limits on a server. We would, however, not be helped by a bound on active "request-waiting" connections; at the very least, we would still need to raise the OS's file-descriptor and thread-count limits. The reason is that in this service, the HTTP "clients" are a CDN whose policy is to leave at least one HTTP connection open between requests. With some ~150 POPs in this CDN, and many load-balancer instances in each POP, there are easily 2000+ HTTP sessions kept alive that tiny-http must still handle.
Just to be clear, I'm not opposing this suggested feature in any way. It would not break our applications, assuming we can disable it by config, or at least set a big number (and that threads are created on demand). If it helps someone out, 👍. But, as someone bitten by #220, a bounded thread pool would not help in any of our applications. Instead, for us, it would end up being just another limit we'd need to disable or raise, next to the actual OS limits on file handles and thread counts.
My mail-based reply (which I wrote first) seems to have gotten lost in traffic, so I'll repeat it here. Sorry for the confusing order.

The DoS type I'm thinking about is specifically the slow-loris kind: the one that load balancers have a hard time handling, since slow clients are a legitimate use case. In particular, this usage pattern can also happen accidentally, when real clients are slow for some reason and thereby block out other clients. For this kind, neither load shedding nor backpressure is typically a good mechanism; you simply need to implement your application to handle the fact that some clients can be slow when sending the request (especially the body, if it's non-trivial) and in consuming the response.

As a more general reflection, I advocate a more nuanced view of the design space of "backpressure". We currently rely heavily on backpressure, in a way that a "bounded threadpool" mentality would typically break (at least in the simplest implementation, based on N concurrent requests). In our application, different requests use significantly different amounts of local resources (mainly CPU and memory), so configuring a reasonable "N" is not feasible. Additionally, requests refer to upstream services used during execution, and different upstreams have different capacity to saturate local resources.

We solve this by starting every request with blocking on a "resource-guard" that tracks the amount of CPU and memory currently reserved and used, and admits requests based on availability. Additionally, requests are fair-queued and prioritized in this backpressure-based model, to ensure one client does not accidentally block other clients' access to the service.

That said, given that this suggested bounded threadpool would only be used for reading the request, I think our use case would be fine. (Or, assuming the threads are constructed on demand, we'd just have to set it to a high number.)
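The "resource-guard" admission scheme described above can be sketched with a mutex and a condition variable. All names here (`ResourceGuard`, `acquire`, `release`, the single-unit capacity model) are illustrative assumptions, not the commenter's actual implementation, and the fair-queueing and prioritization parts are omitted for brevity.

```rust
// Sketch: admit requests based on estimated resource cost rather than a
// fixed request count N. A request blocks in `acquire` until enough
// capacity is free, does its work, then calls `release`.
use std::sync::{Condvar, Mutex};

pub struct ResourceGuard {
    capacity: u64,    // total units (e.g. bytes of memory) we may commit
    used: Mutex<u64>, // units currently reserved
    freed: Condvar,   // signalled whenever units are returned
}

impl ResourceGuard {
    pub fn new(capacity: u64) -> Self {
        ResourceGuard { capacity, used: Mutex::new(0), freed: Condvar::new() }
    }

    /// Block until `cost` units are available, then reserve them.
    pub fn acquire(&self, cost: u64) {
        let mut used = self.used.lock().unwrap();
        while *used + cost > self.capacity {
            used = self.freed.wait(used).unwrap();
        }
        *used += cost;
    }

    /// Return `cost` units and wake any blocked requests.
    pub fn release(&self, cost: u64) {
        *self.used.lock().unwrap() -= cost;
        self.freed.notify_all();
    }

    pub fn used(&self) -> u64 {
        *self.used.lock().unwrap()
    }
}

fn main() {
    let guard = ResourceGuard::new(100);
    guard.acquire(60); // a cheap request is admitted immediately
    guard.release(60);
    guard.acquire(90); // an expensive request waits until capacity is free
    guard.release(90);
    println!("final reservation: {}", guard.used());
}
```

The point of this shape is that "N" is denominated in resource units rather than requests, so one expensive request can count for as much as many cheap ones.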
The DoS type I'm thinking about is specifically the slow-loris kind. The one that load balancers have a hard time handling, since slow clients _are_ a legitimate use case. In particular, this usage pattern can also happen accidentally, by real clients for some reason being slow, which then blocks out other clients. For this kind, neither load shedding nor backpressure is typically a good mechanism; you simply need to implement your application to handle the fact that some clients can be slow when sending the request (especially the body, if it's non-trivial) and in consuming the response.
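One common mitigation for the slow-loris pattern, independent of pool size, is a per-connection read timeout, so a client that trickles bytes cannot pin a worker indefinitely. A rough sketch on a raw `TcpStream` using the standard library's `set_read_timeout`; the 5-second deadline and the helper name are arbitrary choices for the example, and tiny-http does not necessarily expose this knob:

```rust
// Sketch: bound how long a worker will wait for request bytes.
use std::io::Read;
use std::net::TcpStream;
use std::time::Duration;

fn read_headers_with_deadline(stream: &mut TcpStream) -> std::io::Result<Vec<u8>> {
    // Any single read that stalls longer than 5s fails with TimedOut /
    // WouldBlock instead of blocking the worker forever.
    stream.set_read_timeout(Some(Duration::from_secs(5)))?;
    let mut buf = Vec::new();
    let mut chunk = [0u8; 4096];
    loop {
        let n = stream.read(&mut chunk)?; // errors out on a stalled client
        if n == 0 {
            break; // peer closed the connection
        }
        buf.extend_from_slice(&chunk[..n]);
        // Stop once the blank line ending the headers arrives.
        if buf.windows(4).any(|w| w == b"\r\n\r\n") {
            break;
        }
    }
    Ok(buf)
}

fn main() -> std::io::Result<()> {
    use std::io::Write;
    use std::net::TcpListener;
    // Loopback demo: a well-behaved client completes within the deadline.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let client = std::thread::spawn(move || {
        let mut c = TcpStream::connect(addr).unwrap();
        c.write_all(b"GET / HTTP/1.1\r\n\r\n").unwrap();
    });
    let (mut conn, _) = listener.accept()?;
    let req = read_headers_with_deadline(&mut conn)?;
    println!("read {} header bytes", req.len());
    client.join().unwrap();
    Ok(())
}
```

Note that a per-read timeout alone can still be gamed by a client that sends one byte just before each deadline, so a total-request deadline is also worth considering.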
As a more general reflection, I advocate a more nuanced view of the design space of "backpressure". We currently rely heavily on backpressure, in a way that a "bounded threadpool" mentality would typically break (at least in the simplest implementation, based on N concurrent requests). In our application, different requests use significantly different amounts of local resources (mainly CPU and memory), so configuring a reasonable "N" is not feasible. Additionally, requests refer to upstream services used during execution, and different upstreams have different capacity to saturate local resources.
We solve this by starting every request with blocking on a "resource-guard" that tracks the amount of CPU and memory currently reserved and used, and admits requests based on availability. Additionally, requests are fair-queued and prioritized in this backpressure model, to ensure one client does not accidentally block other clients' access to the service.
To do this, we (naturally) have to increase the number of threads and file descriptors allowed with ulimit. On modern 64-bit systems and OSes, tens or even hundreds of thousands of threads can be scheduled efficiently. For our applications, this is more than enough. My point is that an "N-bounded threadpool" is not the _only_ way to do backpressure, and in some cases is even an unfeasible one.
That said, given that this suggested bounded threadpool would _only_ be used for _reading the request_, I think our use case would be fine. (Or, assuming the threads are constructed on demand, we'd just have to set it to a high number.)
@rawler I love how GitHub has finally processed your mail reply 7 months late. This is a generally "hard problem" for thread-per-request frameworks to handle, and an argument can be made either way. @mleonhard, would you propose a pool with some large number of threads by default, or take the more traditional approach of a smaller default pool?
I'm a fan of explicit config. Just make the user pick how many threads to use. Even better, let them pass their own threadpool, so they can use it for other things. That's how I wrote Servlin. You could use safina-threadpool, which I wrote. |
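The "pass their own threadpool" idea could look something like the following trait-based sketch. It is purely illustrative: the `Spawn` trait, `OsThreads`, and `with_executor` are invented names for this example and are not the actual API of tiny-http, Servlin, or safina-threadpool.

```rust
// Sketch: let the server accept any executor the user supplies, so pool
// policy (bounded, unbounded, shared with other work) lives outside the
// HTTP library.
trait Spawn: Send + Sync {
    fn spawn(&self, job: Box<dyn FnOnce() + Send>);
}

/// Default executor: one detached OS thread per job.
struct OsThreads;

impl Spawn for OsThreads {
    fn spawn(&self, job: Box<dyn FnOnce() + Send>) {
        std::thread::spawn(job);
    }
}

/// A server configured with a user-supplied executor.
struct Server {
    executor: Box<dyn Spawn>,
}

impl Server {
    fn with_executor(executor: Box<dyn Spawn>) -> Self {
        Server { executor }
    }

    fn handle_connection(&self) {
        // The server never decides pool policy; the executor does.
        self.executor.spawn(Box::new(|| {
            // read the request, dispatch to the handler, write the response
        }));
    }
}

fn main() {
    let server = Server::with_executor(Box::new(OsThreads));
    server.handle_connection();
}
```

A bounded pool, a work-stealing pool, or an application-wide shared pool would each just be another `Spawn` implementation, which keeps the library itself tiny.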
`Server::from_listener` creates a `util::TaskPool`, which has no limit on the number of threads. This makes tiny-http trivially vulnerable to thread exhaustion. Combined with #220, this behavior means that a single overload causes total and permanent server failure.