Performance issues with consumers that have multiple filter subjects #4888
We can take a look, but in general that many filtered subjects might warrant a design review. Wildcards can be used to dramatically lessen the number of subjects needed.
We are using a wildcard solution right now, but it leads to each consumer receiving many more messages than needed. The clients then look at the subject of the messages they get and drop all messages they are not interested in. This works, but it creates more network traffic and load on the clients than necessary. We would like to move the filtering to the server. The API allows us to do that, but the performance issues on the server side don't. Maybe I'm missing something obvious, but I don't see how we can implement this with fewer filter subjects. We have a large number of UUIDs. For each UUID there is a small number of messages:
Client A needs the messages for a set of UUIDs, let's call it Set_A. This set has approx. 100 members. Client B needs a different set of messages, Set_B. The sets are potentially overlapping though. If they were distinct, we could use a scheme like
But that is not the case: client B is also interested in some, but not all, of the UUIDs that client A is interested in. Putting the same messages multiple times on the stream (with different subjects) would blow up the stream. It would also mean that we could no longer make a change to one message and have all interested clients receive an update.

So what we are doing right now is to have each client install a single consumer with a wildcard subject that matches all messages. What we are aiming for is to have each client install a single consumer that filters for the UUIDs that it needs to know about. We already tried to have each client install a (single filter subject) consumer per UUID. That meant we ended up with about 100 consumers per client, which performed even worse.

Note that the stream info shown above is from a small test setup. The profiles I have attached are also taken from that test setup. I can already see the NATS server falling behind in the logs from that small test setup. Our actual use case in production has more consumers (approx. 3,600) and more messages (approx. 30,000) on that stream. And we want to be able to support even larger setups.
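For what it's worth, the per-client scheme described above boils down to deriving one consumer's filter-subject list from that client's UUID set. A minimal, stdlib-only Go sketch (the UUID values and the `filterSubjects` helper are invented for illustration; the subject layout is the one from this issue):

```go
package main

import "fmt"

// filterSubjects is a hypothetical helper (not part of nats.go): it turns
// a client's set of UUIDs into NATS filter subjects using the subject
// layout from this issue, "config.beam-instance.<uuid>.>".
func filterSubjects(uuids []string) []string {
	subjects := make([]string, 0, len(uuids))
	for _, id := range uuids {
		subjects = append(subjects, fmt.Sprintf("config.beam-instance.%s.>", id))
	}
	return subjects
}

func main() {
	// Overlapping sets: both clients are interested in "uuid-2".
	setA := []string{"uuid-1", "uuid-2"}
	setB := []string{"uuid-2", "uuid-3"}

	// Each client would install ONE consumer whose filter subjects are this
	// list; overlap is fine because each message exists only once on the stream.
	fmt.Println(filterSubjects(setA)) // [config.beam-instance.uuid-1.> config.beam-instance.uuid-2.>]
	fmt.Println(filterSubjects(setB)) // [config.beam-instance.uuid-2.> config.beam-instance.uuid-3.>]
}
```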
Designing subject spaces can be challenging. The best results we have seen come from working closely with our partners and customers on the initial design of the system to achieve their goals. They bring the domain expertise and desired outcomes, and the Synadia team brings their expertise in distributed systems and the NATS.io tech stack.
We have created a benchmark that makes it easy to reproduce the problem reported here:
@svenfoo Thanks! Will take a look into it.
We have a very similar problem to the one described above. Each of our servers is interested in 100-350 subjects (the subjects are random on the consumer side, and the producer doesn't know where to send them). In total there are about 20,000 different subjects. CPU usage is pretty high with one KV, and the plan was to use about 10 different key-value streams. The payload is relatively small, I think the median is around 2 KiB.
@mvrhov why are you using consumers with KV? KV gets use direct get mechanisms that avoid consumers altogether. These are being used with millions of subjects.
@svenfoo Can we switch to email? derek@synadia.com. Thanks. |
The consumer needs to be notified of the specific KV changes that happen, so we are filtering on the key changes. And as consumers can crash, the consumer that takes over should be able to get the last value. (Well, we actually store the last two per key as they might be needed for debugging purposes.)
Are you using KV watchers? |
@mvrhov can you expand on "CPU" is pretty high with one KV and on how your use case uses the subjects, like how many KV watchers you are creating, what's the update rate? |
We go under the hood and do a watch on the JetStream directly, as we need to update the list of keys (via
We did use the Watch before and the CPU usage was even worse. We created only one KV with about 15,000 keys. The update rate for those keys is once every 3-15 minutes. In total we wanted to have 4 KVs, 3 PubSubs, and 2 Work Queues. Most of those are very, very spiky in nature. The number of keys/items in each of those is approx. 15,000, and it would rise by approx. 5,000 items per year. IMHO these are pretty small numbers and this shouldn't cause such high CPU usage. We don't want to scale this, as with gRPC and Redis everything works. We'd like to change the architecture, but not at the expense of using more resources because of that. With watch we had 41k subscribers for one KV, and with filtered subjects we have 40 subscribers for one KV. Edit: It's way better with filtered subjects. I had planned to test this further in the next 10 days, but I stumbled across this issue, and it looks just like what we observed.
Preferred method is KV watchers. What client language and version? |
With KV watchers we used pkg.go.dev/github.com/nats-io/nats.go@v1.28.0 and NATS 2.10.1. Now with filtering it's pkg.go.dev/github.com/nats-io/nats.go@v1.21.0 and NATS 2.10.4. In all tests the Go version is always 1.21.x.
Please upgrade to 2.10.7 (2.10.8 will be released today and we encourage all users to upgrade to that once it lands). |
Also, the latest nats.go is 1.31.0.
Will be done automatically over the next few days as we are doing a release soon. |
Ah, it was a typo; of course we use 1.31 and not 1.21. I copy-pasted the string and changed only one number.
This change introduces a new LoadNextMsgMulti into the store layer. It is passed a sublist that is maintained by the consumer. The store layer matches potential messages across any positive match in the sublist. Resolves: #4888 Signed-off-by: Derek Collison <derek@nats.io>
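Semantically, a consumer with multiple filter subjects delivers a message if any one of its filters matches the message's subject. Below is a minimal, stdlib-only Go sketch of that any-of matching with NATS-style wildcards (`*` matches one token, a trailing `>` matches one or more); note that the server's sublist does this with a trie rather than the linear scan shown here, which is only meant to illustrate the semantics:

```go
package main

import (
	"fmt"
	"strings"
)

// matchSubject reports whether a literal subject matches one NATS filter,
// where "*" matches exactly one token and a trailing ">" matches one or
// more remaining tokens.
func matchSubject(subject, filter string) bool {
	st := strings.Split(subject, ".")
	ft := strings.Split(filter, ".")
	for i, f := range ft {
		if f == ">" {
			return len(st) > i // ">" must cover at least one token
		}
		if i >= len(st) || (f != "*" && f != st[i]) {
			return false
		}
	}
	return len(st) == len(ft)
}

// matchAny is the consumer-side semantics: deliver if ANY filter matches.
// (The server-side sublist avoids this linear scan.)
func matchAny(subject string, filters []string) bool {
	for _, f := range filters {
		if matchSubject(subject, f) {
			return true
		}
	}
	return false
}

func main() {
	filters := []string{
		"config.beam-instance.aaa.>",
		"config.beam-instance.bbb.>",
	}
	fmt.Println(matchAny("config.beam-instance.aaa.settings", filters)) // true
	fmt.Println(matchAny("config.beam-instance.ccc.settings", filters)) // false
}
```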
For whatever it is worth, I am also planning to use a JS consumer with filter subjects to "watch" many KV subkeys, rather than use multiple KV watchers (though I'll benchmark both when I implement it). There will be many users in total, but fewer (on a relative basis) connected at any given time, so it doesn't make sense to watch the majority of users who are not connected (but whose keys will be changing based on things that others do). And I can't think of any other way this could be modeled than by user ID, so as to enable a single wildcard KV watch. I figure it should be more performant to create a single JS consumer with filterSubjects than a new KV watch connection for each user.
Observed behavior
We are experiencing performance issues on the NATS server when creating consumers with many (on the order of 100) filter subjects.
To give you an idea of the setup, here's the stream information:
If we create the 36 consumers on that stream with a single filter subject such as
config.beam-instance.*.>
so that each consumer gets all the messages, performance is okay. As you can see above, the stream only has a single message per subject, and the consumers are using DeliverLastPerSubjectPolicy. So all in all 8,307 messages are sent to each of the 36 consumers. I've taken a profile that covers the consumer subscription and the delivery of all messages:

Now we would like to avoid sending all messages to all consumers. Each consumer actually only needs a subset of the messages, on average about 1,000 messages. So we changed the consumer subscription so that each consumer subscribes using around 100 filter subjects. Each filter subject has a wildcard; they look like this, for example:
config.beam-instance.44742b24-9b67-11ee-800a-b445062a7e37.>
So instead of going for all UUIDs by using a `*`, we select the messages for the roughly one hundred UUIDs that the consumer is actually interested in.

We expected that this would improve performance and reduce the load on the NATS server that is leading this stream. Instead we measured that performance degraded, and we see that the NATS server is struggling to keep up. According to the server logs it is falling behind while it is handling the subscriptions:
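To make the expected win concrete, here is a small, stdlib-only Go simulation (UUID names are invented; matching is simplified to prefix checks, which is enough for filters ending in `.>`, so the broad filter is written `config.beam-instance.>` rather than `config.beam-instance.*.>`) comparing how many messages a consumer receives with a broad wildcard versus per-UUID filters:

```go
package main

import (
	"fmt"
	"strings"
)

// delivered counts how many of the stream's subjects a consumer with the
// given filter subjects would receive. Matching is simplified: a filter
// "foo.>" is treated as the prefix "foo.".
func delivered(stream, filters []string) int {
	n := 0
	for _, subj := range stream {
		for _, f := range filters {
			if strings.HasSuffix(f, ".>") &&
				strings.HasPrefix(subj, strings.TrimSuffix(f, ">")) {
				n++
				break
			}
		}
	}
	return n
}

func main() {
	// One message per UUID, as with a DeliverLastPerSubject consumer.
	var stream []string
	for i := 0; i < 1000; i++ {
		stream = append(stream, fmt.Sprintf("config.beam-instance.uuid-%04d.current", i))
	}

	// A consumer that is only interested in 100 of the 1000 UUIDs.
	var filters []string
	for i := 0; i < 100; i++ {
		filters = append(filters, fmt.Sprintf("config.beam-instance.uuid-%04d.>", i))
	}

	fmt.Println(delivered(stream, []string{"config.beam-instance.>"})) // 1000: broad filter delivers everything
	fmt.Println(delivered(stream, filters))                            // 100: per-UUID filters deliver only the subset
}
```

The point of the issue is that, even though the second scheme moves far fewer messages, the server-side cost of evaluating ~100 filters per consumer dominated before the fix referenced above.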
I've also taken a profile of this situation, and it shows a very different picture:
As you can see the server now uses a lot more CPU cycles and a good deal of the time is spent in the function
tokenizeSubjectIntoSlice()
. This is what I am hoping to be able to address with the changes proposed in #4886.

Expected behavior
We expected to see a performance win by doing more fine-grained subscriptions.
Server and client version
Server v2.10.7 using the 2.10.7-alpine image
Go client v1.31.0
Host environment
Linux 6.4
Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
Steps to reproduce
No response