Consumer fetch request priority #5213
In the "better" hardware scenario this makes sense to me. I wonder about the multi-region case though: imagine 3 regions and a stream+consumer in the first. Clients would need knowledge of their distance to the consumer, and that knowledge would then need to be coordinated among them to figure out who is priority 1, 2 or 3; all clients connected to the 2nd-nearest cluster share the same priority. But what if my RTT to the cluster went high? It would need to be dynamic based on my RTT to the cluster AND the cluster's RTT to the consumer somehow. I don't think it would be elegant to hard-configure those priorities... did you want to hard-code those priorities or something else?
In the use case this would be required for, they are OK with the priority being hard-coded, because they know which data center (cluster) the client application doing the fetch is located in, and which data center(s) the stream(s) they are fetching from are located in. Sure, the actual RTT between two data centers could fluctuate a bit over time, but they don't care about that because they can estimate what that RTT normally is and use it to determine 'how far' a data center is. What's important is that the local fetchers use the highest priority and the 'remote' fetchers use a lower priority; they do not want to adjust the priority dynamically according to the current RTT, the priorities are set 'administratively'. For example, there's an LA data center and a NY data center. (All the) clients in LA fetch with priority 1 from the stream located in LA and with priority 0 from the stream located in NY, and vice-versa: (all the) clients in NY would fetch from the NY stream with priority 1 and from the LA stream with priority 0.
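The administratively set priorities described above could be as simple as a static lookup keyed on where the client and the stream live (a minimal sketch; the table, function name, and data-center labels are hypothetical):

```python
# Hypothetical, administratively-set priority table for the LA/NY example:
# the fetch priority depends only on which data center the client is in
# and which data center hosts the stream it fetches from.
PRIORITY = {
    ("LA", "LA"): 1, ("LA", "NY"): 0,
    ("NY", "NY"): 1, ("NY", "LA"): 0,
}

def fetch_priority(client_dc: str, stream_dc: str) -> int:
    """Priority a client in client_dc uses when fetching from stream_dc."""
    return PRIORITY[(client_dc, stream_dc)]

assert fetch_priority("LA", "LA") == 1  # local fetch: highest priority
assert fetch_priority("NY", "LA") == 0  # remote fetch: lower priority
```

Since the priorities are static, RTT fluctuations never change them, which is exactly the behavior this use case wants.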
It sounds a bit too manual and human-driven to me tbh as a general solution.
I mean, the immediate comparison here is queue groups; imagine how much queue groups would suck if they were all manually configured?
In this case would push queue groups make sense?
Yeah, that's a comment I had earlier elsewhere, but it would be nice to get a pull equivalent that isn't just completely up to users to manually build out; of course it's good to support the manual case for the different-hardware-type scenario. Does the shared info across the import include the client RTT? That might translate exactly to priority (higher RTT, lower priority) but it doesn't consider the cluster RTT.
It has it, yes, but that is for the client connection only, not the full RTT to reach the consumer leader.
It also has the cluster RTT I think, so maybe those two combined..
Worth some exploration for sure. If the fetch has a priority, take that; else, if the consumer is set to auto-prioritise, use that combo. We know that if users have to configure these things, the only valid assumption is that it's done wrong :)
Curious though, is this really necessary? Do we service the waiting pulls in random order? I thought we did that to avoid a bad client doing a huge pull and then just letting ack timers expire, which could DOS a consumer. Assuming we service the waiting pulls in random order, this is achievable by adjusting the rate of pulls and the size of batches: pull fewer messages less frequently, and over time you get a fraction of the total messages.
We do FIFO, but allocate 1 per request even if the batch is > 1.
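That allocation scheme can be modeled roughly as follows (a sketch only; all names are hypothetical and the actual server code is different): messages are handed out one at a time to waiting pull requests in FIFO order, and a request with remaining batch capacity goes to the back of the line rather than being filled in one go.

```python
from collections import deque

def dispatch(waiting, messages):
    """Hand out messages one at a time, cycling through waiting pull
    requests in FIFO order, even when a request asked for a batch > 1."""
    delivered = {req_id: [] for req_id, _ in waiting}
    pending = deque(waiting)  # (req_id, remaining_batch) in arrival order
    for msg in messages:
        if not pending:
            break
        req_id, remaining = pending.popleft()
        delivered[req_id].append(msg)
        if remaining > 1:
            pending.append((req_id, remaining - 1))  # back of the line
    return delivered

# Two waiters: A asked for a batch of 3, B for a batch of 1.
out = dispatch([("A", 3), ("B", 1)], ["m1", "m2", "m3", "m4"])
assert out == {"A": ["m1", "m3", "m4"], "B": ["m2"]}
```

This is what keeps one large batch request from monopolizing the consumer while other pulls wait.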
The use case is: you have a bunch of clients in each data center pulling messages (work requests) from that data center's work-queue stream. But if there are a lot of messages to process in the stream, they want to allow clients located in other data centers, which are not busy and are just waiting for new messages to land in their own data center's work queue, to "steal" work from this data center. Hence clients pull from the work-queue stream in their own data center with a high priority, and as long as they can keep up with the ingress of messages they are the ones mostly getting those messages. But if they are busy and messages are coming into the stream faster than they can pull them, then the clients from the other data centers, which are pulling with a lower priority, would start getting some of those messages delivered to them. (This idea was presented to them as a possible way to implement that use case and they agreed it would work for them, but if we come up with a better way to do what they want, even better. This idea seemed simple and relatively easy to do, and backwards compatible, as client applications that know nothing about this priority would just end up using priority 0.)
Maybe I'm wrong here, but for that mechanism to work correctly, the high-priority clients have to be very conservative in sending pull requests. If they always have a pending pull request, which is the most common use case, the lower priorities will never get messages anyway. The only way it will work reliably is if it's used with a fetch of 1 message, and the fetch is sent only after the previous one finished. This is a pretty inefficient way to interact with pull consumers, and it also causes zig-zags in network traffic. The use case @jnmoyne mentions maybe needs more of a watermark solution, where a specific pull request can set a watermark of pending messages above which it will be served?
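The watermark idea could look something like this (a sketch of the suggestion above, not an existing server feature; the field name "watermark" is hypothetical): a pull request is only eligible to be served once the consumer's backlog of undelivered messages exceeds its watermark, and a request with watermark 0 behaves like today's pulls.

```python
def eligible(request: dict, pending_count: int) -> bool:
    """A pull is served only once the number of pending (undelivered)
    messages exceeds its watermark; watermark 0 means always eligible."""
    return pending_count > request.get("watermark", 0)

reqs = [
    {"id": "local",  "watermark": 0},    # always eligible
    {"id": "remote", "watermark": 100},  # only when backlog > 100
]
assert [r["id"] for r in reqs if eligible(r, 50)] == ["local"]
assert [r["id"] for r in reqs if eligible(r, 250)] == ["local", "remote"]
```

Unlike strict priorities, this would let remote clients keep a pending pull open at all times and still only receive work once a real backlog builds up.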
In this particular use case there are indeed lots (hundreds, up to thousands) of client applications in each data center, each one wanting to fetch just one message, which takes a few seconds to process, and then go back to fetching one message. They are not trying to spread the distribution of messages proportionally to the priority level such that lower-priority fetchers still get some messages; rather, the high-priority fetchers that are waiting for new messages should always get the messages when they come in. The lower-priority fetchers should always be 'starved' as long as there are some high-priority fetch requests waiting, and should only get a new message if at that time there is no higher-priority fetcher waiting.
I understand the need, but I really do not like the idea that such an important feature works only if there is a very specific setup done by the users.
I like the proposal: having clients indicate themselves whether they have prio 1 to receive the data, and whether they are willing to have lower prio-2 clients take up the work if prio 1 is saturated.

That said, I'm thinking the priority alone might be a bit too manual, and I would agree with @Jarema there. The question I would ask myself as an end-user of this is: "If I have some clients with prio 1 and some with prio 2, when can I expect the prio-2 clients to receive work?" Currently that sounds like it would be specific to the setup, and tied to the implementation in the server as well.

I'm thinking a slight addition to the priorities might make the setup a bit less specific, as well as give a clearer answer to "when can lower-prio clients expect to receive work". What if we were able to indicate both a prio and some kind of "timeout"? Then the server would always hold on to the highest priority that last pulled, as well as a timeout. When the timeout is reached, the priority can be lowered to see if other clients are available.
Proposed change
Introduce support for a priority value (an integer) that can optionally be set on a consumer fetch request, which the server then uses to determine which requests to serve messages to.
For example, if three requests with priority 1 and three requests with priority 0 are pending, the priority-1 requests will be fulfilled before the priority-0 ones (assuming no new priority-1 requests interleave).
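The intended fulfillment order can be sketched as a stable sort over the waiting requests, so FIFO order is preserved within each priority level (a model of the semantics only, not the proposed implementation; field names are hypothetical):

```python
# Waiting pull requests in arrival order; "prio" is the proposed optional
# priority field, defaulting to 0 for clients that don't set it.
waiting = [
    {"id": "r1", "prio": 1}, {"id": "r2", "prio": 0}, {"id": "r3", "prio": 1},
    {"id": "r4", "prio": 0}, {"id": "r5", "prio": 1}, {"id": "r6", "prio": 0},
]

# Python's sort is stable, so arrival (FIFO) order survives within a level.
serve_order = sorted(waiting, key=lambda r: -r["prio"])
assert [r["id"] for r in serve_order] == ["r1", "r3", "r5", "r2", "r4", "r6"]
```

All three priority-1 requests drain first, in arrival order, matching the example above.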
Use case
There are two general use cases:
Contribution
cc @jnmoyne