
Add support for aggregating all inbound data before calling server Ring handler #692

Open
KingMob opened this issue Oct 9, 2023 · 2 comments

Comments

@KingMob
Collaborator

KingMob commented Oct 9, 2023

About

The official Ring spec states that the body of a request should be an InputStream, and Aleph complies with this (when not in raw mode).

However, an InputStream isn't always desirable, because it's a blocking interface.

If no data is available (and you haven't reached EOS), any read calls will block. Block on enough of these in your Ring handlers, and you can run out of threads/resources. This limitation propagates to many common transformations of your InputStream, including slurp and bs/to-string.
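To illustrate the blocking behavior, here's a minimal sketch using plain `java.io` pipes (not Aleph itself): a read on a stream with no data available parks the calling thread until a byte arrives, which is exactly what happens when a handler slurps a body that is still in flight.

```clojure
(import '(java.io PipedInputStream PipedOutputStream))

;; Simulate a request body that hasn't arrived yet: the reader side
;; of a pipe with nothing written to it.
(def out (PipedOutputStream.))
(def in  (PipedInputStream. out))

;; .read blocks until a byte is available, so run it on another thread.
(def result (future (.read in)))

(Thread/sleep 100)
(def blocked? (not (realized? result))) ; true: the read is still parked

(.write out 42)                         ; "data arrives"
(def first-byte @result)                ; the read unblocks and returns 42
```

Every thread stuck like this is one less thread available for serving other requests.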

One way around this is to let Aleph/Netty accumulate all the body ByteBufs before calling the user handler with the Ring request map. Netty won't consume extra threads for this, though of course, memory will be consumed.

Aggregation currently happens automatically if you set the max-request-body-size option, but that's an implementation detail. This issue proposes making it explicit.

Proposal

Add an option like aggregate-data?, wait-for-body-to-finish?, or whatever, which enables aggregation. For HTTP/1, it would use Netty's HttpObjectAggregator; for HTTP/2, it would use the custom aggregator already employed for the body-size limit.

While we shouldn't always force a size limit, we must still apply some limit, or a trivial DoS attack becomes possible: a client sends an infinite stream and forces the server to consume all its memory.

We can apply a size limit, a time limit, or both. If the aggregation option is set, at least one of the limit options must be set as well. We can stick with max-request-body-size for the size limit, but we should probably add a new option for the time limit. Maybe max-request-time or max-execution-time. We already have a bunch of timeouts on the client side, like request-timeout and read-timeout; maybe we should call it read-timeout for consistency.
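A sketch of what the server options might look like under this proposal. Note that :aggregate-data? and a server-side :read-timeout are hypothetical names from this issue, not existing Aleph options; only :max-request-body-size exists today (and currently enables aggregation as a side effect).

```clojure
(require '[aleph.http :as http])

(defn handler [req]
  ;; With aggregation enabled, :body would be fully buffered before
  ;; this function is called, so slurp can't block waiting for data.
  {:status 200
   :headers {"content-type" "text/plain"}
   :body (str "got " (count (slurp (:body req))) " bytes")})

;; Hypothetical option names, per this proposal:
(http/start-server handler
  {:port 8080
   :aggregate-data?       true     ; proposed: buffer the whole body first
   :max-request-body-size 1048576  ; exists today: 1 MiB size cap
   :read-timeout          10000})  ; proposed: ms limit to receive the body
```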

@DerGuteMoritz
Collaborator

> Maybe max-request-time or max-execution-time. We already have a bunch of timeouts on the client side, like request-timeout and read-timeout, maybe we should call it read-timeout for consistency.

👍 for naming it read-timeout for the stated reason. Also, the suggested max-request-time and especially max-execution-time could be misinterpreted to also cover the handler's execution.

Another limit that could be relevant in this context is a minimum transmission rate to counter slowloris-style attacks. See e.g. what Apache's mod_reqtimeout offers.

@KingMob
Collaborator Author

KingMob commented Oct 9, 2023

Good point. We may also want to add response timeouts, and I think max-execution-time would be a good name for that. Currently, Aleph doesn't have many timeouts on the server side. Other peers in the chain (load balancers, reverse proxies, etc.) will often have timeouts that apply, but Aleph should still have its own.
