Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-48286] Fix analysis and creation of column with exists default expression #46594

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

urosstan-db
Copy link
Contributor

@urosstan-db urosstan-db commented May 15, 2024

What changes were proposed in this pull request?

Pass correct parameter list to org.apache.spark.sql.catalyst.util.ResolveDefaultColumns#analyze when it is invoked from org.apache.spark.sql.connector.catalog.CatalogV2Util#structFieldToV2Column.

org.apache.spark.sql.catalyst.util.ResolveDefaultColumns#analyze method accepts 3 parameter

  1. Field to analyze
  2. Statement type - String
  3. Metadata key - CURRENT_DEFAULT or EXISTS_DEFAULT

Method org.apache.spark.sql.connector.catalog.CatalogV2Util#structFieldToV2Column
pass fieldToAnalyze and EXISTS_DEFAULT as second parameter, so it is not metadata key, instead of that, it is statement type, so different expression is analyzed.

Pull requests where original change was introduced
#40049 - Initial commit
#44876 - Refactor that did not touch the issue
#44935 - Another refactor that did not touch the issue

Why are the changes needed?

It is needed to pass correct value to Column object

Does this PR introduce any user-facing change?

Yes, this is a bug fix, existence default value has now proper expression, but before this change, existence default value was actually current default value of column.

How was this patch tested?

No special test added, because change is simple.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label May 15, 2024
@@ -525,7 +525,7 @@ private[sql] object CatalogV2Util {
}

if (isDefaultColumn) {
val e = analyze(f, EXISTS_DEFAULT_COLUMN_METADATA_KEY)
val e = analyze(f, statementType = "Column conversion", EXISTS_DEFAULT_COLUMN_METADATA_KEY)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Statement type is used only for debugging/logging, the main purpose of it is to be present in exception message being thrown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure whether columns method of Table is called only on creation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At any rate, this looks like the right fix!

@urosstan-db urosstan-db changed the title [SPARK-48286] Fix analysis of default expression in struct field to column conversion method [SPARK-48286] Fix analysis and creation of column with exists default expression - Struct field to column conversion method May 15, 2024
@urosstan-db urosstan-db changed the title [SPARK-48286] Fix analysis and creation of column with exists default expression - Struct field to column conversion method [SPARK-48286] Fix analysis and creation of column with exists default expression May 15, 2024
@cloud-fan
Copy link
Contributor

@urosstan-db can you re-trigger the test?

@urosstan-db
Copy link
Contributor Author

@cloud-fan Sure, sorry, I did not see this failed

@cloud-fan
Copy link
Contributor

Can you retry the github action job? Seems flaky

@cloud-fan
Copy link
Contributor

can we add a test? Besically any default column value that is unfoldable, like current_date(), can trigger this bug

@urosstan-db
Copy link
Contributor Author

can we add a test? Besically any default column value that is unfoldable, like current_date(), can trigger this bug

It is really strange, but I seen this issue on describe table also.

@urosstan-db
Copy link
Contributor Author

can we add a test? Besically any default column value that is unfoldable, like current_date(), can trigger this bug

Also, alter table add column default should trigger this bug?

@dtenedor
Copy link
Contributor

can we add a test? Basically any default column value that is unfoldable, like current_date(), can trigger this bug

Also, alter table add column default should trigger this bug?

For example, ALTER TABLE t ADD COLUMN c DEFAULT CURRENT_DATE()

Copy link
Contributor

@dtenedor dtenedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix looks good, thanks for fixing it! As Wenchen suggests we should add a test as well.

@@ -525,7 +525,7 @@ private[sql] object CatalogV2Util {
}

if (isDefaultColumn) {
val e = analyze(f, EXISTS_DEFAULT_COLUMN_METADATA_KEY)
val e = analyze(f, statementType = "Column conversion", EXISTS_DEFAULT_COLUMN_METADATA_KEY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val e = analyze(f, statementType = "Column conversion", EXISTS_DEFAULT_COLUMN_METADATA_KEY)
val e = analyze(
field = f,
statementType = "Column conversion",
metadataKey = EXISTS_DEFAULT_COLUMN_METADATA_KEY)

@@ -525,7 +525,7 @@ private[sql] object CatalogV2Util {
}

if (isDefaultColumn) {
val e = analyze(f, EXISTS_DEFAULT_COLUMN_METADATA_KEY)
val e = analyze(f, statementType = "Column conversion", EXISTS_DEFAULT_COLUMN_METADATA_KEY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At any rate, this looks like the right fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants