checksum perf in rocksdb #11716

Closed
patelprateek opened this issue Aug 18, 2023 · 4 comments

Comments

@patelprateek

During profiling of a RocksDB instance in production, I observed checksums consuming non-negligible CPU.
I use kCRC32c.
Are the alternative xxHash variants more performant?

Since our workloads are divided into write-only and read-only instances, is it possible to enable checksums only on write instances and turn them off for read instances?
Can a read-only instance's DB data get corrupted as well? Is there a way to do a quick data integrity check when opening the DB and then turn checksums off for all queries?

@jsteemann
Contributor

We moved from kCRC32c to kxxHash64 a while ago as it provided better performance on the hardware we run it on.
Results may vary depending on hardware though.
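For reference, a minimal sketch of how the block checksum type can be selected, assuming the block-based table format and a RocksDB release whose `ChecksumType` enum includes `kxxHash64`. The helper name `MakeOptionsWithXxHash64` is made up for illustration; treat this as a starting point to benchmark, not a recommendation.

```cpp
#include <rocksdb/options.h>
#include <rocksdb/table.h>

// Hypothetical helper: builds Options whose block-based tables use
// xxHash64 block checksums instead of kCRC32c.
rocksdb::Options MakeOptionsWithXxHash64() {
  rocksdb::BlockBasedTableOptions table_options;
  // Checksum algorithm used for blocks in newly written SST files.
  table_options.checksum = rocksdb::kxxHash64;

  rocksdb::Options options;
  options.create_if_missing = true;
  options.table_factory.reset(
      rocksdb::NewBlockBasedTableFactory(table_options));
  return options;
}
```

The setting applies to files written after the change; files that already exist are still read with the checksum type they were written with, so a mixed state after switching is expected until compaction rewrites them.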

You can turn off checksum comparisons in the read-only instance by setting verify_checksums = false in the rocksdb::ReadOptions instance you use for point lookups or iterator range scans. This turns off checksum verification for these operations.
Checksumming may still be useful on a read-only instance, because even if you are not writing, the filesystem or the hardware may have issues that can ultimately lead to corruption and checksum errors.
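A rough sketch of both ideas from this exchange: reads with `verify_checksums = false`, and a one-time integrity pass when opening the database. The function names `GetWithoutChecksum` and `OpenReadOnlyAndVerify` are hypothetical. `DB::VerifyChecksum()` is not mentioned in the thread; it is brought in here as one possible way to do an up-front check, and since it has to read the table files it may not be "quick" on large databases. Exact signatures can differ between RocksDB releases.

```cpp
#include <string>

#include <rocksdb/db.h>
#include <rocksdb/options.h>

// Hypothetical helper: point lookup that skips block checksum verification.
rocksdb::Status GetWithoutChecksum(rocksdb::DB* db, const rocksdb::Slice& key,
                                   std::string* value) {
  rocksdb::ReadOptions read_options;
  read_options.verify_checksums = false;  // do not verify blocks on this read
  return db->Get(read_options, key, value);
}

// Hypothetical helper: opens the DB read-only, then verifies the checksums
// of the table files once before any queries are served.
rocksdb::Status OpenReadOnlyAndVerify(const std::string& path,
                                      rocksdb::DB** db) {
  rocksdb::Options options;
  rocksdb::Status s = rocksdb::DB::OpenForReadOnly(options, path, db);
  if (!s.ok()) {
    return s;
  }
  s = (*db)->VerifyChecksum();
  if (!s.ok()) {
    // Verification failed: close the handle so the caller never sees a
    // half-validated database.
    delete *db;
    *db = nullptr;
  }
  return s;
}
```

A check at open time only covers the state of the files at that moment; corruption introduced later by the filesystem or hardware would go unnoticed with verification disabled on reads.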

@patelprateek
Author

@jsteemann: thanks for the info. Can you please elaborate a bit on the hardware-specific performance? What hardware works best for kxxHash64 vs kxxHash?

@jsteemann
Contributor

@patelprateek : we did the tests about a year back, so I don't exactly remember which CPUs we used. For sure it was some x86_64 CPU, probably some AMD Ryzen 5950X, but I can't say for sure anymore.
Results may also depend on your compiler settings, especially optimization level and target architecture. So I won't say that xxHash64 will always be better than other algorithms. But for our specific use case it turned out to be better.
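Because the outcome depends on CPU and compiler flags, measuring on the target machine is the safest way to decide. Below is a small, self-contained timing harness as a sketch. `Crc32cSoftware` is a portable table-driven CRC32C written for this example; it is not RocksDB's implementation, which can use hardware instructions where available, so its numbers say nothing about RocksDB itself. `MeasureThroughputMiBps` is likewise a hypothetical helper: pass it any function with the same signature (for example a wrapper around an xxHash call) to compare candidates under your own build settings.

```cpp
#include <array>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the lookup table for CRC32C (Castagnoli), reflected polynomial
// 0x82F63B78.
inline std::array<uint32_t, 256> MakeCrc32cTable() {
  std::array<uint32_t, 256> table{};
  for (uint32_t i = 0; i < 256; ++i) {
    uint32_t crc = i;
    for (int bit = 0; bit < 8; ++bit) {
      crc = (crc & 1u) ? (crc >> 1) ^ 0x82F63B78u : (crc >> 1);
    }
    table[i] = crc;
  }
  return table;
}

// Portable byte-at-a-time CRC32C; a stand-in for the checksum under test.
inline uint32_t Crc32cSoftware(const void* data, std::size_t len) {
  static const std::array<uint32_t, 256> table = MakeCrc32cTable();
  const auto* p = static_cast<const unsigned char*>(data);
  uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; ++i) {
    crc = table[(crc ^ p[i]) & 0xFFu] ^ (crc >> 8);
  }
  return crc ^ 0xFFFFFFFFu;
}

// Runs `fn` over a block of `block_size` bytes `iterations` times and
// returns the observed throughput in MiB/s.
template <typename ChecksumFn>
double MeasureThroughputMiBps(ChecksumFn fn, std::size_t block_size,
                              int iterations) {
  std::vector<unsigned char> block(block_size);
  for (std::size_t i = 0; i < block_size; ++i) {
    block[i] = static_cast<unsigned char>(i * 131u + 7u);
  }
  volatile uint32_t sink = 0;  // keeps the compiler from dropping the calls
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {
    if (block_size > 0) {
      block[0] = static_cast<unsigned char>(i);  // vary the input slightly
    }
    sink = sink ^ fn(block.data(), block.size());
  }
  const auto stop = std::chrono::steady_clock::now();
  const double seconds = std::chrono::duration<double>(stop - start).count();
  const double mib = static_cast<double>(block_size) * iterations /
                     (1024.0 * 1024.0);
  return seconds > 0.0 ? mib / seconds : 0.0;
}
```

A call such as `MeasureThroughputMiBps(Crc32cSoftware, 4096, 100000)` times the function on 4 KiB inputs. Build with the same optimization level and target architecture flags as the production binary, since those are exactly the variables mentioned above.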

@patelprateek
Author

Makes sense.
Also, I found an interesting discussion regarding CRC32c here: Cyan4973/xxHash#62,
which reports almost 3x higher throughput.
