
nsqd: Use Klaus Post's compression libraries #1484

Open
philpearl opened this issue Apr 15, 2024 · 5 comments

@philpearl

We would quite like to use compression with NSQ to save on data transfer costs, but the CPU impact is higher than we'd like. Our experiments have shown that Klaus Post's compression libraries perform much better than the standard library Deflate and Google's Snappy, with the sweet spot appearing to be level 3 flate compressing our traffic to about 25% of its original size, but only incurring a CPU cost equivalent to Snappy.

Would there be any interest in taking a PR that makes this change?

@philpearl philpearl changed the title Use Klaus Post's compression libraries nsqd: Use Klaus Post's compression libraries Apr 15, 2024
@mreiferson
Member

At a glance, I don't see any fundamental problem with improving performance by swapping out the dependency. Should we also expose the other compression algorithms, too?

@adamroyjones

I think that's an excellent idea. zstd in particular is appealing.

It's especially appealing for NSQ, where a common pattern seems to be a topic whose messages share a single, well-defined schema: variations on a theme. Dictionaries (as generated by zstd --train) could be very useful.

@philpearl
Author

Fabulous. I'll put together a PR for the dependency swap.

@philpearl
Author

Hmm, I think our original testing of Snappy must have been flawed. The Klaus Post version of Snappy seems to be slower than the Google version, and the Klaus Post Deflate doesn't match Snappy's speed at any level.

This is what I'm getting when replacing Snappy and Deflate in both nsqd and the Go NSQ client library:

name                  old time/op    new time/op    delta
Compress/snappy-16       272µs ± 2%     310µs ± 7%  +13.96%  (p=0.000 n=10+10)
Compress/deflate3-16     746µs ± 1%     612µs ± 2%  -17.91%  (p=0.000 n=10+10)
Compress/deflate5-16    1.06ms ± 1%    0.66ms ± 1%  -37.60%  (p=0.000 n=10+9)
Compress/deflate6-16    1.28ms ± 2%    0.73ms ± 5%  -43.46%  (p=0.000 n=9+10)
Compress/deflate9-16    1.47ms ± 4%    1.72ms ± 9%  +16.33%  (p=0.000 n=10+9)

There's also an added complication: the Klaus Post Deflate compresses slightly worse at most levels.

=== RUN   TestCompareDeflate
    protocol_v2_test.go:2056: deflate level 1: compress to 19.304255% - 98.069603% of Go deflate
    protocol_v2_test.go:2056: deflate level 2: compress to 18.701276% - 104.756670% of Go deflate
    protocol_v2_test.go:2056: deflate level 3: compress to 18.185710% - 103.829701% of Go deflate
    protocol_v2_test.go:2056: deflate level 4: compress to 16.835251% - 105.500279% of Go deflate
    protocol_v2_test.go:2056: deflate level 5: compress to 16.005709% - 103.914756% of Go deflate
    protocol_v2_test.go:2056: deflate level 6: compress to 15.584694% - 103.055326% of Go deflate
    protocol_v2_test.go:2056: deflate level 7: compress to 15.575774% - 103.398863% of Go deflate
    protocol_v2_test.go:2056: deflate level 8: compress to 15.233253% - 101.558040% of Go deflate
    protocol_v2_test.go:2056: deflate level 9: compress to 14.929979% - 99.571684% of Go deflate
--- PASS: TestCompareDeflate (0.02s)

I still think it's worth replacing the Deflate library, but the motivation is much less than I previously believed. WDYT?

@mreiferson
Member

Meh, doesn't seem worth it? It sounds like we're saying "just use snappy"?

We should land all the benchmark code improvements (I've pushed a few more up to your PR), and nsqio/go-nsq#362, though.
