[SPARK-48329][SQL] Enable spark.sql.sources.v2.bucketing.pushPartValues.enabled by default #46673
The AS-IS PR title seems to be partial because this PR changes two configurations.
[SPARK-48329][SQL] SPJ: Default spark.sql.sources.v2.bucketing.pushPartValues.enabled to true
Also, please resolve the conflicts and revert the test coverage change, @szehon-ho.
Thanks @dongjoon-hyun, sorry, the PR was not ready. I was trying to integrate the changes from @superdiaodiao, who I saw also made a PR for the same change, so we can be co-authors. I reverted the additional config change and test case, and will check the test result.
I will close my PR and you can continue, let's co-author this time.
Thanks! @dongjoon-hyun @sunchao can you take another look?
+1, LGTM. Thank you, @szehon-ho and @superdiaodiao
Merged to master for Apache Spark 4.0.0.
Thank you~~~
BTW, @szehon-ho and @superdiaodiao, the umbrella JIRA was already closed for Apache Spark 3.3.0. I moved SPARK-48329 to a subtask of SPARK-44111 for now. For other new tasks, please create a new umbrella JIRA and use it.
OK
@dongjoon-hyun do you have any guidance on how we might create a new Spark 4.0+ tracking JIRA to link all the new SPJ items? I think it would be nice to have all of them in one list.
Feel free to create a new umbrella JIRA issue with a proper, meaningful title instead of duplicating the old JIRA issue title. Then link it to SPARK-44111, @szehon-ho. That's enough.
What changes were proposed in this pull request?
This PR aims to enable spark.sql.sources.v2.bucketing.pushPartValues.enabled by default for Apache Spark 4.0.0 while keeping spark.sql.sources.v2.bucketing.enabled as false.

Why are the changes needed?
spark.sql.sources.v2.bucketing.pushPartValues.enabled was added in Apache Spark 3.4.0 and has been used as one of the DataSource V2 bucketing features. This PR will help DataSource V2 bucketing users use this feature more easily.
Note that this change is technically a no-op for default users because spark.sql.sources.v2.bucketing.enabled is still false.

Does this PR introduce any user-facing change?
No

How was this patch tested?
Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?
No
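
For readers who want to see how the two settings in the description relate, here is a minimal sketch, assuming a local Spark session on the classpath. The object name and app name are hypothetical; only the two configuration keys come from this PR. Because spark.sql.sources.v2.bucketing.enabled stays false by default, a user still has to opt in to it explicitly; the pushPartValues setting is set explicitly here so the sketch behaves the same on releases before the default changes.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, for illustration only.
object SpjConfigSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("spj-config-sketch") // hypothetical app name
      // Master switch for DataSource V2 bucketing; per the PR description
      // it remains false by default, so it must be enabled explicitly.
      .config("spark.sql.sources.v2.bucketing.enabled", "true")
      .getOrCreate()

    // The PR turns this on by default for 4.0.0; setting it explicitly
    // keeps the behavior the same on 3.4.x and 3.5.x.
    spark.conf.set("spark.sql.sources.v2.bucketing.pushPartValues.enabled", "true")

    // Print the effective values to confirm what the session will use.
    println(spark.conf.get("spark.sql.sources.v2.bucketing.enabled"))
    println(spark.conf.get("spark.sql.sources.v2.bucketing.pushPartValues.enabled"))

    spark.stop()
  }
}
```

With the master switch left at its default of false, changing the pushPartValues default has no effect, which is why the description calls the change a no-op for default users.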