[VL] Enable local sort-based shuffle #5811
Thanks for opening a pull request! Could you open an issue for this pull request on GitHub Issues? https://github.com/apache/incubator-gluten/issues Then could you also rename the commit message and pull request title in the following format?
See also: |
Run Gluten Clickhouse CI
eb0e683 to 334c182
0331da5 to 58284b0
The UT failure on commit 56580aa is caused by a metrics change that makes the AQE suites fail: https://github.com/apache/incubator-gluten/actions/runs/9185004959
LGTM. Thanks!
This is an experimental feature. In the TPC-H benchmark, we observed a significant performance drop and higher memory pressure compared with hash-based shuffle when using the default number of shuffle partitions (2x to 4x the number of vcores). Sort-based shuffle is enabled when the number of shuffle partitions is greater than spark.gluten.sql.columnar.shuffle.sort.threshold (default value 100,000). It is recommended not to enable local sort-based shuffle until the performance issue is fixed.
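
The threshold rule described above can be illustrated with a small sketch. This is not Gluten's implementation: the function name uses_sort_based_shuffle and the plain dict standing in for the Spark configuration are hypothetical, and only the configuration key, the default of 100,000, and the strict "greater than" comparison come from the text.

```python
from typing import Mapping, Optional

# Key and default value as stated in the documentation text above.
THRESHOLD_KEY = "spark.gluten.sql.columnar.shuffle.sort.threshold"
DEFAULT_SORT_THRESHOLD = 100_000


def uses_sort_based_shuffle(num_partitions: int,
                            conf: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: True when sort-based shuffle would be selected.

    `conf` is a plain mapping standing in for Spark configuration entries;
    it is an assumption made for illustration only.
    """
    conf = conf or {}
    threshold = int(conf.get(THRESHOLD_KEY, DEFAULT_SORT_THRESHOLD))
    # The text says "greater than", so a partition count equal to the
    # threshold still uses hash-based shuffle.
    return num_partitions > threshold


if __name__ == "__main__":
    # A typical default partition count stays on hash-based shuffle.
    print(uses_sort_based_shuffle(200))
    # Lowering the threshold opts the same job into sort-based shuffle.
    print(uses_sort_based_shuffle(200, {THRESHOLD_KEY: "100"}))
```

Under this reading, a job with the usual few hundred shuffle partitions keeps using hash-based shuffle unless the threshold is lowered explicitly, which matches the recommendation to leave the feature off for now.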