Expose additional compression levels #4728
Comments
Hey @ebattenberg, did you run tests on an SSD or HDD? Also, it would be great if you shared the benchmark results over here to get some interest from the community.
I'm happy to provide more info on the rough benchmarks. I thought my original post was getting long enough, so I just provided the raw zstd benchmarks, which are independent of any restic repo, thinking that those very noticeable differences would be enough to motivate additional compression levels. I also ran my hacked-up fork using the 4 compression levels on a variety of restic backends. Here are some more details on those results.
I'll share the specific results via a Google spreadsheet if that's cool: Restic Compression Levels Bench
Comments:
I'm sure the two additional compression levels would be useful for some subset of restic users given the variety of user preferences, use cases, and system configurations out there. I'd be glad to share a cleaned-up version of my fork or take a stab at a proper PR if anyone is interested.
I should add that I became aware of these additional zstd compression levels when I was testing out kopia. Kopia provides a compression benchmarking tool for all of the different compression algorithms it supports, and its compression docs make the simple point that """ In this way, providing additional compression levels allows users to configure higher compression within the "free" throughput range for a wider variety of CPU/backend combinations.
Output of restic version
restic 0.16.4 compiled with go1.21.6 on linux/amd64
What should restic do differently? Which functionality do you think we should add?
It would be great if restic exposed the additional compression levels offered by the klauspost/compress library.
What are you trying to do? What problem would this solve?
The current 'auto' and 'max' levels map to the SpeedDefault and SpeedBestCompression levels, respectively, in the klauspost/compress library, and while these are great to have, the 'max' setting occupies a fairly extreme point on the compression:speed tradeoff curve. It's useful for remote repos when upstream bandwidth is limited (<10 Mbps), but can significantly slow down transfers over faster connections. The klauspost/compress library offers an intermediate mode (SpeedBetterCompression) that might be more useful to people who want a bit more compression than SpeedDefault but who don't want to go all the way to the orders-of-magnitude-slower SpeedBestCompression. Additionally, SpeedFastest might be a desirable configuration for people who are doing local disk backups and care most about speed but still want some compression.
I've benchmarked the various zstd compression levels (1-22) on my (dinky) Celeron N5095 and noticed:
- Level 1 (SpeedFastest) runs at around 500-2000 MB/s.
- SpeedDefault ('auto') runs at around 300-1500 MB/s and achieves approximately the same compression ratios as Level 1.
- SpeedBetterCompression runs at around 50-100 MB/s and achieves 10-30% better compression ratios than Level 1.
- SpeedBestCompression ('max') runs at around 0.8-3 MB/s and achieves 20-70% better compression ratios on compressible files than Level 1.
These are all pretty significant differences, especially with respect to speed, and since people use restic with a variety of backends on a variety of hardware, it seems like more control over the compression levels would be useful.
I tested the two extra modes ("better" and "fastest") inside restic by adding the appropriate enums and switch cases in internal/repository/repository.go and they seem to produce backup speeds and repo sizes proportional to what would be expected given my zstd benchmarks above.
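For anyone curious what "adding the appropriate enums and switch cases" might look like, here is a minimal standalone sketch. It is not the code from my fork or from restic itself: the type `CompressionMode`, the constants, and both function names are hypothetical, and the level is returned as a plain string naming the klauspost/compress constant so the snippet runs without any dependency. Real code would presumably return the library's encoder level value instead.

```go
package main

import (
	"fmt"
)

// CompressionMode is a hypothetical stand-in for the repository's
// compression setting.
type CompressionMode int

const (
	CompressionAuto CompressionMode = iota
	CompressionOff
	CompressionMax
	CompressionFastest // proposed addition
	CompressionBetter  // proposed addition
)

// parseCompressionMode turns a user-facing option string into a mode.
func parseCompressionMode(s string) (CompressionMode, error) {
	switch s {
	case "auto":
		return CompressionAuto, nil
	case "off":
		return CompressionOff, nil
	case "max":
		return CompressionMax, nil
	case "fastest":
		return CompressionFastest, nil
	case "better":
		return CompressionBetter, nil
	default:
		return CompressionAuto, fmt.Errorf("invalid compression mode %q", s)
	}
}

// encoderLevelName reports which zstd encoder level a mode would select.
// The second return value is false when compression is disabled.
// Strings are used here only as placeholders for the library constants.
func encoderLevelName(m CompressionMode) (string, bool) {
	switch m {
	case CompressionOff:
		return "", false
	case CompressionFastest:
		return "SpeedFastest", true
	case CompressionBetter:
		return "SpeedBetterCompression", true
	case CompressionMax:
		return "SpeedBestCompression", true
	default: // CompressionAuto
		return "SpeedDefault", true
	}
}

func main() {
	for _, s := range []string{"auto", "fastest", "better", "max", "off"} {
		m, err := parseCompressionMode(s)
		if err != nil {
			fmt.Println(err)
			continue
		}
		name, enabled := encoderLevelName(m)
		fmt.Printf("%-8s -> level=%q compress=%v\n", s, name, enabled)
	}
}
```

The existing 'auto' and 'max' mappings stay as they are described above; the two new cases are the only behavioral change in this sketch.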
Would exposing these additional compression levels be useful to the restic community? Does doing so jibe with the general philosophy of the project? I'm sure this was considered at some point when compression was initially added, but I had trouble finding relevant discussions.
This is the first time I've touched golang or restic code. If there's interest, I'm happy to take a stab at a proper PR, but I wanted to get a sense for whether this was worth pursuing first. I imagine this is something an experienced restic contributor could knock out in 20 minutes (which is why I think there might be philosophical issues with the proposed changes or something else I'm not considering).
Did restic help you today? Did it make you happy in any way?
I'm pretty new to restic, but I'm currently pretty pumped about it and I especially like using it with resticprofile to automate and organize all of my backups. It also did my taxes for me today and I even saw it help an old lady cross the street.