kgo.FetchMaxPartitionBytes: increasing this option causes high memory usage #701

Closed · sbuliarca opened this issue Mar 29, 2024 · 3 comments

sbuliarca (Contributor) commented Mar 29, 2024

We noticed that increasing the value of kgo.FetchMaxPartitionBytes to even 10MB causes our apps to use much more memory than normal. Throughput is also affected.
You can see the difference from these metrics:

  • Running without any options: (screenshot: app-with-no-options)
  • Running with FetchMaxPartitionBytes set to 10MB: (screenshot: app-with-FetchMaxPartitionBytes10MB)

From the metrics you can see that with no options the app reaches a mean of 231MB, while with FetchMaxPartitionBytes set to 10MB it reaches 1.91GB.

I also created a benchmark that demonstrates the same behavior; the code used is similar to what we have in our apps. It should also be useful for a fix: https://github.com/twmb/franz-go/compare/master...sbuliarca:benchmark-fetch-max-partition-bytes?body=&expand=1&title=Added+benchmark+for+using+kgo.FetchMaxPartitionBytes
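
For context, a minimal sketch of the kind of consumer setup under discussion -- the broker address, topic, and group names here are placeholders, not taken from the issue:

```go
package main

import (
	"context"
	"fmt"

	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	cl, err := kgo.NewClient(
		kgo.SeedBrokers("localhost:9092"),  // placeholder broker
		kgo.ConsumeTopics("events"),        // placeholder topic
		kgo.ConsumerGroup("example-group"), // placeholder group
		kgo.FetchMaxPartitionBytes(10<<20), // the option in question, raised to 10MB per partition
	)
	if err != nil {
		panic(err)
	}
	defer cl.Close()

	for {
		fetches := cl.PollFetches(context.Background())
		if errs := fetches.Errors(); len(errs) > 0 {
			panic(fmt.Sprint(errs))
		}
		fetches.EachRecord(func(r *kgo.Record) {
			_ = r.Value // process the record
		})
	}
}
```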

twmb (Owner) commented Apr 3, 2024

By default, producing messages uses snappy compression. The consuming-side configuration is based on batch size -- whatever is in the batches (compressed data) is not inspected by the broker. Is the memory on the consumer side because you're fetching, say, 10M that then gets decompressed to a large payload?

I do see your benchmark uses rand.Read, which likely produces data that is not compressible. However, the benchmark isn't demonstrating memory usage per op; it's demonstrating how fast your network connection is, such that it can consume so much data so quickly (i.e., if this client were implemented very poorly such that polling weren't fast, then fewer polls could happen per iteration and it would show lower memory usage).
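
As a hedged illustration of the compression point: snappy is the client's default produce compression, and it can be made explicit as below (broker and topic names are placeholders). Since the broker only sees compressed batch bytes, a 10MB fetch can decompress into a much larger in-memory payload on the consumer.

```go
package main

import "github.com/twmb/franz-go/pkg/kgo"

// newProducer sketches an explicitly snappy-compressed producer; snappy is
// already the client default, so this only makes the behavior visible.
func newProducer() (*kgo.Client, error) {
	return kgo.NewClient(
		kgo.SeedBrokers("localhost:9092"),                     // placeholder broker
		kgo.DefaultProduceTopic("events"),                     // placeholder topic
		kgo.ProducerBatchCompression(kgo.SnappyCompression()), // batches are compressed before the broker sees them
	)
}
```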

sbuliarca (Contributor, Author) commented

> Is the memory on the consumer side because you're fetching, say, 10M that then gets decompressed to a large payload?

Indeed, but isn't the same decompression happening also when kgo.FetchMaxPartitionBytes is not used?

> However, the benchmark isn't demonstrating memory usage per op; it's demonstrating how fast your network connection is, such that it can consume so much data so quickly

The benchmark was meant to show the difference between running the code with and without the option, on the same machine. I added more stats to the benchmark in this commit: sbuliarca@1cb32cc. Comparing the runs (taking one representative row from each set of 15 runs, which mostly look the same):

  • without FetchMaxPartitionBytes:
    BenchmarkPollRecords 1 3380654666 ns/op 59720 compressedbytes/op 97.00 recs/op 994865 uncompressedbytes/op 132888208 B/op 13815 allocs/op
  • with FetchMaxPartitionBytes:
    BenchmarkPollRecords 1 3600582375 ns/op 59720 compressedbytes/op 97.00 recs/op 994865 uncompressedbytes/op 655721024 B/op 61965 allocs/op

As you can see, in both cases it reads mostly the same amount of compressed bytes (it seems the rand.Read data compresses to about 6% after all), and the number of records/op is the same; the only things that differ are the bytes and allocs per op:

  • 132888208 B/op vs 655721024 B/op
  • 13815 allocs/op vs 61965 allocs/op

Following the code, I don't know if there's much you can do in the lib, as I see the value goes straight into the fetch request to Kafka: https://github.com/twmb/franz-go/blob/master/pkg/kgo/source.go#L551
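
A hedged sketch (not the linked benchmark itself) of how custom per-op metrics such as recs/op and uncompressedbytes/op can be reported with testing.B.ReportMetric; the newTestClient helper and all names here are hypothetical:

```go
package bench

import (
	"context"
	"testing"

	"github.com/twmb/franz-go/pkg/kgo"
)

// newTestClient is a hypothetical helper that returns a consumer pointed at
// a topic pre-filled with test records.
func newTestClient(b *testing.B) *kgo.Client {
	cl, err := kgo.NewClient(
		kgo.SeedBrokers("localhost:9092"), // placeholder broker
		kgo.ConsumeTopics("bench-topic"),  // placeholder topic
	)
	if err != nil {
		b.Fatal(err)
	}
	return cl
}

// BenchmarkPollRecords illustrates only the metric-reporting mechanism, not
// the actual benchmark linked above.
func BenchmarkPollRecords(b *testing.B) {
	cl := newTestClient(b)
	defer cl.Close()

	var records, uncompressed float64
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fetches := cl.PollFetches(context.Background())
		fetches.EachRecord(func(r *kgo.Record) {
			records++
			uncompressed += float64(len(r.Value))
		})
	}
	b.ReportMetric(records/float64(b.N), "recs/op")
	b.ReportMetric(uncompressed/float64(b.N), "uncompressedbytes/op")
}
```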

twmb (Owner) commented May 23, 2024

I agree, I don't think there's much I can do in the library itself. Due to potential message compression -- and the four or so different knobs that control how much you consume at once -- getting memory usage down may require some tuning and playing with settings. There are some docs about adjusting the number of concurrent fetches, the fetch size, and the per-partition fetch size -- let me know if those don't suffice. I'm going to close this for now, though, since I lean toward there being nothing to be done here.
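
For reference, a sketch of the consume-sizing knobs referred to above; the specific values are illustrative only, not recommendations from this thread:

```go
package main

import "github.com/twmb/franz-go/pkg/kgo"

// newTunedConsumer shows the options that bound how much data is pulled in
// at once; they can be tuned together to trade throughput against memory.
func newTunedConsumer() (*kgo.Client, error) {
	return kgo.NewClient(
		kgo.SeedBrokers("localhost:9092"), // placeholder broker
		kgo.ConsumeTopics("events"),       // placeholder topic
		kgo.MaxConcurrentFetches(3),       // cap concurrent fetch requests
		kgo.FetchMaxBytes(50<<20),         // cap bytes per fetch response
		kgo.FetchMaxPartitionBytes(1<<20), // cap bytes per partition per fetch
	)
}
```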
