New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#3347] improvement(jdbc-doris): support creating Doris table with partition #3350
Conversation
@mchades please help to review this. |
...ris/src/main/java/com/datastrato/gravitino/catalog/doris/operation/DorisTableOperations.java
Outdated
Show resolved
Hide resolved
Set<String> columnNames = | ||
Arrays.stream(columns).map(JdbcColumn::name).collect(Collectors.toSet()); | ||
|
||
if (partitioning[0] instanceof Transforms.RangeTransform) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhoukangcn
Could you help to review this PR if you have time? I see no problem here.
I found that the previous implementation of multi-column range partitioning may conflict with the current API design of Gravitino, so we will not support multi-column range partitioning in Doris for the time being. |
...ris/src/main/java/com/datastrato/gravitino/catalog/doris/operation/DorisTableOperations.java
Show resolved
Hide resolved
"The partition field must be one of the columns"); | ||
|
||
String partitionColumn = BACK_QUOTE + rangePartition.fieldName()[0] + BACK_QUOTE; | ||
// TODO we currently do not support pre-assign partition when creating range partitioning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pre-assign partition has been available since merge #3189.
Would you like to implement it in this PR or a separate one? If not in this PR, kindly open a new issue for tracking this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for informing, I will do it in another PR, issue #3368 is created.
Distributions.hash(DEFAULT_BUCKET_SIZE, NamedReference.field("col_1")); | ||
Index[] indexes = new Index[] {}; | ||
|
||
// create table with range partition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to add a read/write test for the partitioned table in com.datastrato.gravitino.catalog.doris.integration.test.CatalogDorisIT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean adding integration tests for creating and loading partitioned table in CatalogDorisIT
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And add read/write tests like HiveCatalogIT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we do not yet support the operations of pre-assign and add partitions for Doris, the created partitioned table cannot be written into directly.
In addition, we would have to bypass Gravitino for tests that read and write data, e.g. using spark or JDBC, which doesn't seem to be necessary for Gravitino in my opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because partition operations are not supported at the moment, it is impossible to test the read and write of partitioned tables. I think this makes sense.
In addition, we would have to bypass Gravitino for tests that read and write data, e.g. using spark or JDBC, which doesn't seem to be necessary for Gravitino in my opinion.
We need to ensure that tables created through Gravitino can be read and written smoothly by other engines, so it is very necessary to test reading and writing partitioned tables when the Doris catalog supports partition operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yuqi1129 @zhoukangcn Overall LGTM, do you have further comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for your contributions!
…rtition (#3350) ### What changes were proposed in this pull request? support creating Doris table with partition ### Why are the changes needed? Fix: #3347 ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? IT --------- Co-authored-by: zhanghan18 <zhanghan18@xiaomi.com>
What changes were proposed in this pull request?
support creating Doris table with partition
Why are the changes needed?
Fix: #3347
Does this PR introduce any user-facing change?
no
How was this patch tested?
IT