(Formerly, #5899 tracked throttling for both the read and the write path. We moved the write-path throttle here to reduce scope and be able to close #5899.)
Motivation
Walingest is compute-intensive and affects shared resources (TX to S3, global rate limits to S3, etc.). A single tenant shouldn't be able to exhaust them.
We should still have back-pressure as a defense-in-depth mechanism, but that's separate from throttling.
DoD
Pageserver artificially caps the per-tenant throughput on the write path (=ingest).
I.e., to all upstream Neon components, this cap will appear to be the maximum ingest throughput attainable per tenant per pageserver.
Like with #5899, the limit will be chosen such that a TBD (small single-digit) number of tenants can run at the limit. Discovery of the limit values is done through gradual rollout, conservative experimentation, and informed by benchmarks.
There is enough observability to clearly disambiguate slowness induced by throttling from slowness caused by an otherwise-slow pageserver. This disambiguation must be at per-tenant (better: per-timeline) granularity.
The limits are on-by-default and cannot be permanently overridden on a per-tenant basis.
I.e., the implementation need not be suited for productization as a "performance tier" or "QoS" feature.
TBD: specify how exactly the backpressure is propagated to SKs and Computes. The current "max lag" is insufficient; it's a hard limit.
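A per-tenant throughput cap like the one described above is commonly implemented as a token bucket. The sketch below is illustrative only: the struct name, units, and limit values are assumptions for this example, not the pageserver's actual API.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-tenant (per-shard) ingest throttle.
/// Capacity is the burst allowance; refill_per_sec is the sustained cap.
struct IngestThrottle {
    capacity: f64,        // bytes of burst
    refill_per_sec: f64,  // bytes per second, sustained
    tokens: f64,
    last_refill: Instant,
}

impl IngestThrottle {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self {
            capacity,
            refill_per_sec,
            tokens: capacity,
            last_refill: Instant::now(),
        }
    }

    /// Deducts `bytes` from the budget and returns how long the caller
    /// must wait before proceeding (Duration::ZERO if within budget).
    fn acquire(&mut self, bytes: f64) -> Duration {
        // Refill proportionally to elapsed time, capped at capacity.
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;

        if self.tokens >= bytes {
            self.tokens -= bytes;
            Duration::ZERO
        } else {
            // Not enough budget: report the time until the deficit refills.
            let deficit = bytes - self.tokens;
            self.tokens = 0.0;
            Duration::from_secs_f64(deficit / self.refill_per_sec)
        }
    }
}
```

A caller receiving a non-zero duration would sleep before applying the WAL batch; accumulating that slept time in a per-tenant (or per-timeline) metric is one way to satisfy the observability requirement of distinguishing throttle-induced slowness from a genuinely slow pageserver.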
Interactions
Sharding: with sharding, the throttling happens per shard instead of per tenant. Exactly like in #5899.
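Keying the throttle state by shard rather than by tenant could look like the following sketch. The identifier types and field names are hypothetical; the real pageserver has its own tenant/shard identity types.

```rust
use std::collections::HashMap;

// Hypothetical identifiers for illustration only.
type TenantId = u64;
type ShardNumber = u8;

/// Each shard gets an independent budget, so a tenant split into N shards
/// (typically placed on different pageservers) is capped per shard, not
/// in aggregate across shards.
struct WriteThrottles {
    limit_bytes_per_sec: u64,
    budgets: HashMap<(TenantId, ShardNumber), u64>,
}

impl WriteThrottles {
    /// Looks up (or lazily creates) the budget for one shard.
    fn budget_for(&mut self, tenant: TenantId, shard: ShardNumber) -> &mut u64 {
        let limit = self.limit_bytes_per_sec;
        self.budgets.entry((tenant, shard)).or_insert(limit)
    }
}
```

Spending from one shard's budget leaves sibling shards of the same tenant untouched, matching the per-shard semantics described for #5899.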
High-Level Plan
Write Path