Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to limit the speed of tiered storage #520

Open
funky-eyes opened this issue Mar 21, 2024 · 7 comments
Open

How to limit the speed of tiered storage #520

funky-eyes opened this issue Mar 21, 2024 · 7 comments

Comments

@funky-eyes
Copy link

What can we help you with?

When reading and writing in real time, whenever the action of uploading s3 is triggered, it will lead to a very large disk io consumption, cpu instantly rises, resulting in the impact of the production of the business operation, I tried to adjust remote.log.manager.thread.pool.size=cpu*2 but the effect is still not very ideal
image
image

Where would you expect to find this information?

@funky-eyes
Copy link
Author

I want to do speed limit in S3MultiPartOutputStream#write, I wonder if the community can agree to this solution, I can mention pr to support it

@ivanyu
Copy link
Contributor

ivanyu commented Mar 21, 2024

It seems you will need KIP-956, which hasn't been accepted yet.

@funky-eyes
Copy link
Author

It seems you will need KIP-956, which hasn't been accepted yet.

@ivanyu Yes, I need this feature, but I found that I can actually achieve rate limiting in the implementation code of copysegment. So why don't we implement this feature ourselves? Just like how the community provides compression for copysegment.

@ivanyu
Copy link
Contributor

ivanyu commented Apr 18, 2024

Yeah, I see your point. We're looking into your PRs and other ways to implement this

@funky-eyes
Copy link
Author

@ivanyu
apache/kafka#15625
I've noticed this standard RLMQuotaManager, but it seems that we still need to adapt and use it because it appears to be just a standard definition.

@jeqo
Copy link
Contributor

jeqo commented May 21, 2024

@funky-eyes I have been looking into how quotas are implemented and it's not going to help us that much on the upload path as the granularity is at the segment level. It will not help on how fast a segment is actually read and uploaded. For the fetch part it will be more effective I think.

I've looked into some native support on the SDKs for this, and apart from some readLimit config from AWS SDK v1, I haven't found anything similar in V2 or other vendor SDKs.

I've took a look at #521 and I like the idea of using bucket4j (first time I've seen it; thanks for bringing it to our attention). However I'd like to see if we can have the rate limiting a bit higher in the stack so it can benefit other SDKs, not only S3.
I have started a PR to evaluate this approach: #548 -- would be great if you can have a look and check if brings similar benefits as your PR. Thanks!

@funky-eyes
Copy link
Author

@funky-eyes I have been looking into how quotas are implemented and it's not going to help us that much on the upload path as the granularity is at the segment level. It will not help on how fast a segment is actually read and uploaded. For the fetch part it will be more effective I think.

I've looked into some native support on the SDKs for this, and apart from some readLimit config from AWS SDK v1, I haven't found anything similar in V2 or other vendor SDKs.

I've took a look at #521 and I like the idea of using bucket4j (first time I've seen it; thanks for bringing it to our attention). However I'd like to see if we can have the rate limiting a bit higher in the stack so it can benefit other SDKs, not only S3. I have started a PR to evaluate this approach: #548 -- would be great if you can have a look and check if brings similar benefits as your PR. Thanks!

I apologize for only seeing your message now. First of all, thank you for reviewing my requirement and PR. I'm also pleased to see that you used bucket4j for rate limiting in #548. I took a preliminary look at your PR, and you've wrapped the InputStream with RateLimitedInputStream, controlling the read rate to solve the problem, while I controlled the write rate to address the issue. However, both have similar effects, and I prefer your implementation. Initially, I also thought about how to make it more generic, but considering that my project is about to go live, I submitted a rudimentary version to the community. I will test your PR next, and I hope the community can merge this #548 feature before the official version is released

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants