New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure putSnapshot path honoring case insensitive contract #85
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the issue related to CTAS? It failed in the regular write path.
apps/spark/src/test/java/com/linkedin/openhouse/catalog/e2e/CatalogOperationTest.java
Outdated
Show resolved
Hide resolved
The ticket you gave has the log line which located in this class. |
apps/spark/src/main/java/com/linkedin/openhouse/jobs/spark/Operations.java
Outdated
Show resolved
Hide resolved
...s-test-fixtures/src/main/java/com/linkedin/openhouse/tablestest/HouseTablesH2Repository.java
Show resolved
Hide resolved
services/tables/src/main/java/com/linkedin/openhouse/tables/services/TablesServiceImpl.java
Show resolved
Hide resolved
...tables/src/main/java/com/linkedin/openhouse/tables/services/IcebergSnapshotsServiceImpl.java
Show resolved
Hide resolved
apps/spark/src/main/java/com/linkedin/openhouse/jobs/util/SparkCatalogUtils.java
Outdated
Show resolved
Hide resolved
apps/spark/src/test/java/com/linkedin/openhouse/catalog/e2e/CatalogOperationTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these changes! unit tests are also strong to tackle case Sensitive issues. Few recommendations
services/tables/src/main/java/com/linkedin/openhouse/tables/utils/TableUUIDGenerator.java
Outdated
Show resolved
Hide resolved
services/tables/src/main/java/com/linkedin/openhouse/tables/utils/TableUUIDGenerator.java
Show resolved
Hide resolved
services/tables/src/main/java/com/linkedin/openhouse/tables/utils/TableUUIDGenerator.java
Show resolved
Hide resolved
...s/spark/openhouse-spark-runtime/src/main/java/com/linkedin/openhouse/spark/CatalogUtils.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments
...s/spark/openhouse-spark-runtime/src/main/java/com/linkedin/openhouse/spark/CatalogUtils.java
Outdated
Show resolved
Hide resolved
...s/spark/openhouse-spark-runtime/src/main/java/com/linkedin/openhouse/spark/CatalogUtils.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing comments.
Summary
This is a bug fixes for put-snapshot code path, where UUID-extraction process is using
tableId
anddatabaseId
from the request itself. Those ids, if directly obtained from top-level request body, can lost casing if the requests were from platform like Spark SQL. Since the underlying storage (e.g. HDFS ) is case sensitive in its path URL, we will need to ensure the original casing info when issue the first commit as part of CTAS is preserved in the process of UUID-extraction of the second commit.The other parts in this PR is to ensure, when
put
is happening, thetableDto
is not always built from scratch when there's existing object discovered byfindById
method previously. This is done by switchorElse
method toorElseGet
, in which the latter will only call the supplier lazily when the calling object is absent. This leads to a wasteful implementation as well as confusion on the code stack.This PR also include:
text-fixtures
module has its repository horning case insensitive contract so that this behavior can be tested through embedded instances.Changes
For all the boxes checked, please include additional details of the changes made in this pull request.
Testing Done
For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request.
Additional Information
For all the boxes checked, include additional details of the changes made in this pull request.