Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#3347] improvement(jdbc-doris): support creating Doris table with partition #3350

Merged
merged 5 commits into from May 14, 2024

Conversation

xiaozcy
Copy link
Contributor

@xiaozcy xiaozcy commented May 11, 2024

What changes were proposed in this pull request?

support creating Doris table with partition

Why are the changes needed?

Fix: #3347

Does this PR introduce any user-facing change?

no

How was this patch tested?

IT

@xiaozcy xiaozcy changed the title improvement(jdbc-doris): support creating Doris table with partition [#3347] improvement(jdbc-doris): support creating Doris table with partition May 11, 2024
@jerryshao jerryshao requested a review from mchades May 11, 2024 12:18
@jerryshao
Copy link
Collaborator

@mchades please help to review this.

Set<String> columnNames =
Arrays.stream(columns).map(JdbcColumn::name).collect(Collectors.toSet());

if (partitioning[0] instanceof Transforms.RangeTransform) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhoukangcn
Could you help to review this PR if you have time? I see no problem here.

@xiaozcy xiaozcy closed this May 12, 2024
@xiaozcy xiaozcy reopened this May 12, 2024
@xiaozcy
Copy link
Contributor Author

xiaozcy commented May 12, 2024

I found that the previous implementation of multi-column range partitioning may conflict with the current API design of Gravitino, so we will not support multi-column range partitioning in Doris for the time being.

"The partition field must be one of the columns");

String partitionColumn = BACK_QUOTE + rangePartition.fieldName()[0] + BACK_QUOTE;
// TODO we currently do not support pre-assign partition when creating range partitioning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-assign partition has been available since merge #3189.

Would you like to implement it in this PR or a separate one? If not in this PR, kindly open a new issue for tracking this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for informing, I will do it in another PR, issue #3368 is created.

@xiaozcy xiaozcy closed this May 13, 2024
@xiaozcy xiaozcy reopened this May 13, 2024
Distributions.hash(DEFAULT_BUCKET_SIZE, NamedReference.field("col_1"));
Index[] indexes = new Index[] {};

// create table with range partition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to add a read/write test for the partitioned table in com.datastrato.gravitino.catalog.doris.integration.test.CatalogDorisIT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean adding integration tests for creating and loading partitioned table in CatalogDorisIT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And add read/write tests like HiveCatalogIT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we do not yet support the operations of pre-assign and add partitions for Doris, the created partitioned table cannot be written into directly.
In addition, we would have to bypass Gravitino for tests that read and write data, e.g. using spark or JDBC, which doesn't seem to be necessary for Gravitino in my opinion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because partition operations are not supported at the moment, it is impossible to test the read and write of partitioned tables. I think this makes sense.

In addition, we would have to bypass Gravitino for tests that read and write data, e.g. using spark or JDBC, which doesn't seem to be necessary for Gravitino in my opinion.

We need to ensure that tables created through Gravitino can be read and written smoothly by other engines, so it is very necessary to test reading and writing partitioned tables when the Doris catalog supports partition operations.

Copy link
Contributor

@mchades mchades left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuqi1129 @zhoukangcn Overall LGTM, do you have further comments?

Copy link
Contributor

@mchades mchades left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for your contributions!

@mchades mchades merged commit ff6f620 into datastrato:main May 14, 2024
34 of 47 checks passed
github-actions bot pushed a commit that referenced this pull request May 14, 2024
…rtition (#3350)

### What changes were proposed in this pull request?

support creating Doris table with partition

### Why are the changes needed?

Fix: #3347 

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

IT

---------

Co-authored-by: zhanghan18 <zhanghan18@xiaomi.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask] Support creating Doris table with partition
4 participants