[FEATURE] support use skip list to store shuffleBuffer in memory #1708

Open
3 tasks done
xianjingfeng opened this issue May 13, 2024 · 2 comments · May be fixed by #1763

Comments

@xianjingfeng
Member

xianjingfeng commented May 13, 2024

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the feature

Currently, we use a linked list to store the shuffleBuffer in memory. If we assign a lot of memory (e.g. 1 TB) to the shuffle server, reading data from memory performs poorly, because every request has to search for lastBlockId starting from the head of the list.

Other benefits of using skip list:

  1. It would fix [Bug] ShuffleReadClientImpl occurs StackOverflowError. #926
  2. Data would no longer need to be sorted when it is flushed to disk. See [Improvement][AQE] Sort MapId before the data are flushed #137
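
To illustrate the lookup cost described above, here is a minimal sketch comparing the two approaches. It is not Uniffle's actual code: the class name, the method names, and the choice of the block ID as the map key are all assumptions made for illustration. The only library call relied on is `ConcurrentSkipListMap.tailMap(fromKey, inclusive)` from the JDK. Note that a skip list iterates in key order rather than arrival order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch; names are illustrative and not taken from Uniffle.
class SkipListBufferSketch {

    // Linked-list style lookup: walk from the head until lastBlockId is found,
    // then collect every block after it. Each request costs O(n).
    static List<Long> blocksAfterLinked(LinkedList<Long> blockIds, long lastBlockId) {
        List<Long> result = new ArrayList<>();
        boolean found = false;
        for (long id : blockIds) {
            if (found) {
                result.add(id);
            } else if (id == lastBlockId) {
                found = true;
            }
        }
        return result;
    }

    // Skip-list style lookup: tailMap(key, false) locates the first key strictly
    // greater than lastBlockId in expected O(log n), with no scan from the head.
    // Assumption: blocks are keyed by block ID, so iteration is in key order.
    static List<Long> blocksAfterSkipList(ConcurrentSkipListMap<Long, String> buffer,
                                          long lastBlockId) {
        return new ArrayList<>(buffer.tailMap(lastBlockId, false).keySet());
    }

    public static void main(String[] args) {
        LinkedList<Long> linked = new LinkedList<>(Arrays.asList(10L, 20L, 30L, 40L));

        ConcurrentSkipListMap<Long, String> skipList = new ConcurrentSkipListMap<>();
        // Insertion order does not matter; the map keeps keys sorted.
        skipList.put(30L, "block-30");
        skipList.put(10L, "block-10");
        skipList.put(40L, "block-40");
        skipList.put(20L, "block-20");

        System.out.println(blocksAfterLinked(linked, 20L));     // prints [30, 40]
        System.out.println(blocksAfterSkipList(skipList, 20L)); // prints [30, 40]
    }
}
```

Because the map keeps its keys sorted, iterating it already yields blocks in key order, which is why a separate sort before flushing might become unnecessary if the key matches the desired flush order.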

Motivation

No response

Describe the solution

No response

Additional context

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

@jerqi
Contributor

jerqi commented May 17, 2024

We need to guarantee the order of data, otherwise we will lose the data.

@xianjingfeng
Member Author

> We need to guarantee the order of data, otherwise we will lose the data.

Got it. This feature will not support slow start, so I'm going to make it an optional feature.
