[server][da-vinci] Bumped RocksDB dep and adopt multiget async io by default #950

Merged
merged 4 commits into from May 9, 2024

Conversation

gaojieliu
Contributor

This PR bumps the RocksDB dependency and exposes a config that enables async IO for multi-get. The config is enabled by default.
rocksdb.read.async.io.enabled: default true
In theory, with this config and a POSIX filesystem, the RocksDB multiget API should be noticeably faster, based on the benchmarking in: https://rocksdb.org/blog/2022/10/07/asynchronous-io-in-rocksdb.html
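
As a rough illustration of the "default true" behavior described above, here is a minimal sketch of how such a flag could be resolved. Only the key name `rocksdb.read.async.io.enabled` comes from this PR. The class `AsyncIoConfigSketch`, the helper `isReadAsyncIoEnabled`, and the use of `java.util.Properties` are assumptions made for illustration and are not the actual Venice config plumbing.

```java
import java.util.Properties;

public class AsyncIoConfigSketch {
    // Config key taken from the PR description.
    static final String READ_ASYNC_IO_ENABLED = "rocksdb.read.async.io.enabled";

    // Hypothetical helper: returns the configured value, or true when the key is absent.
    static boolean isReadAsyncIoEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(READ_ASYNC_IO_ENABLED, "true"));
    }

    public static void main(String[] args) {
        Properties defaults = new Properties(); // key not set, so the default applies
        Properties optOut = new Properties();
        optOut.setProperty(READ_ASYNC_IO_ENABLED, "false"); // explicit opt-out

        System.out.println("default: " + isReadAsyncIoEnabled(defaults));
        System.out.println("opt-out: " + isReadAsyncIoEnabled(optOut));
    }
}
```

An operator who sees regressions could, under this assumption, set the key to `false` to fall back to the synchronous read behavior without changing code.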

So far, this optimization applies only to the chunk lookup for large values/RMD. If the lookup latency for large values in the read path shows it to be more performant, we can apply it in more areas:

  1. DaVinci with DISK mode.
  2. The ingestion code path, by looking up entries in batches for AA/WC use cases.
  3. The regular read path. So far, one multi-get call can only be issued against a single RocksDB database, so we might need to reorganize the RocksDB databases if this API proves truly helpful.
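
To make the constraint in item 3 concrete: if a batch of keys spans several RocksDB databases, the keys would first have to be grouped per database so that each group can go through one multi-get call. The sketch below shows only that grouping step. The class `MultiGetGroupingSketch`, the hash-based routing in `databaseFor`, and the string keys are all hypothetical and are not how Venice actually maps keys to databases.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiGetGroupingSketch {
    // Hypothetical routing rule: pick a database index from the key's hash code.
    static int databaseFor(String key, int databaseCount) {
        return Math.floorMod(key.hashCode(), databaseCount);
    }

    // Groups keys by target database, preserving request order within each group.
    // Each resulting list could then be passed to one multi-get call on that database.
    static Map<Integer, List<String>> groupByDatabase(List<String> keys, int databaseCount) {
        Map<Integer, List<String>> batches = new LinkedHashMap<>();
        for (String key : keys) {
            batches.computeIfAbsent(databaseFor(key, databaseCount), id -> new ArrayList<>())
                   .add(key);
        }
        return batches;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> batches = groupByDatabase(List.of("a", "b", "c"), 2);
        batches.forEach((db, keys) -> System.out.println("database " + db + " <- " + keys));
    }
}
```

The more databases a request fans out to, the smaller each batch becomes, which is why a reorganization of the databases might be needed before multi-get pays off on the regular read path.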

How was this PR tested?

CI

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

majisourav99
majisourav99 previously approved these changes May 7, 2024
Contributor

@majisourav99 majisourav99 left a comment

LGTM! Do we need to add any new tests, or are we relying on the existing WC chunking tests for this?

@gaojieliu
Contributor Author

LGTM! Do we need to add any new tests, or are we relying on the existing WC chunking tests for this?

We already have both unit tests and integration tests that cover this change, so I think that is enough.

@gaojieliu gaojieliu merged commit 02591a1 into linkedin:main May 9, 2024
32 checks passed