
cloud_storage: Increase disk tput limit a bit #18451

Merged: 1 commit merged into redpanda-data:dev on May 15, 2024

Conversation

Lazin (Contributor) commented May 13, 2024

When setting read-path tput limit, increase disk limit by 6%

Fixes #14139

After the Seastar update, the throttling test started to fail because the disk limit is now stricter. This PR relaxes the disk limit slightly so that the network can still be throttled in the test. It should also improve observability: Seastar does not expose any disk-throttling metrics, but we do have network-throttling metrics. With the disk limit set slightly higher, we will always hit throttling in our network layer, which has better metrics. There should be no impact on anything else.
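As a rough sketch of the mechanism (the fraction constant appears in the diff below; the function name and the way the constant is applied are assumptions based on the commit message): the disk limit is derived from the effective throughput limit and then bumped by a fixed fraction, so the per-shard network limit always trips first.

```cpp
#include <algorithm>
#include <cstddef>

// Denominator of the overshoot fraction added by this PR.
// 1/16 == 6.25%, the "6%" from the commit message.
constexpr std::size_t tput_overshoot_frac = 16;

// Hypothetical sketch: give the disk limit a small overshoot above
// the effective limit, so throttling engages first in the network
// layer, where metrics are exposed.
std::size_t disk_throughput_limit(
  std::size_t hard_limit, std::size_t device_throughput) {
    std::size_t tp = std::min(hard_limit, device_throughput);
    return tp + tp / tput_overshoot_frac;
}
```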

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

  • none

Commit: When setting read-path tput limit, increase disk limit by 6%
@@ -144,8 +144,9 @@ get_throughput_limit(std::optional<size_t> device_throughput) {
 }

 auto tp = std::min(hard_limit, device_throughput.value());
+constexpr auto tput_overshoot_frac = 16;
Contributor commented:
Is there anything special about this number? Or was it empirically determined?

If empirically determined, how did you determine how effective this PR is (in case we need to adjust it further down the line)?

Lazin (Contributor, Author) replied:

It's a simple back-of-the-envelope calculation (2GiB/s with a 50% allowance for TS gives us around 60MiB/s per node). We only need to limit the disk as a last resort; the code relies on the network limit, which is set per shard.
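For reference, a hedged reconstruction of that arithmetic (the per-node divisor is inferred, not stated in the thread):

```cpp
#include <cstdio>

int main() {
    // Envelope: 2 GiB/s of device throughput with a 50% allowance
    // for tiered storage, spread across an assumed 16 nodes.
    constexpr double ts_budget_mib = 2.0 * 1024 * 0.5; // 1024 MiB/s for TS
    constexpr int nodes = 16;                          // inferred divisor
    std::printf("~%.0f MiB/s per node\n", ts_budget_mib / nodes); // ~64

    // The overshoot constant itself: the disk limit exceeds the
    // computed limit by 1/16 = 6.25%, the "6%" in the commit title.
    constexpr int tput_overshoot_frac = 16;
    std::printf("overshoot = %.2f%%\n", 100.0 / tput_overshoot_frac);
}
```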

Member replied:

> It's a simple back-of-the-envelope calculation (2GiB/s with a 50% allowance for TS gives us around 60MiB/s per node). We only need to limit the disk as a last resort; the code relies on the network limit, which is set per shard.

add to commit message?

Lazin (Contributor, Author) commented May 14, 2024

/ci-repeat 10
tests/rptest/tests/e2e_shadow_indexing_test.py::EndToEndThrottlingTest

Lazin (Contributor, Author) commented May 14, 2024

/cdt
tests/rptest/tests/e2e_shadow_indexing_test.py::EndToEndThrottlingTest

1 similar comment


Lazin merged commit f3ea797 into redpanda-data:dev on May 15, 2024. 19 checks passed.