Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#2543] feat(spark-connector): support row-level operations to iceberg Table #3243

Merged
merged 13 commits into from May 13, 2024

Conversation

caican00
Copy link
Contributor

@caican00 caican00 commented May 1, 2024

What changes were proposed in this pull request?

  • refactor table implementation, make SparkIcebergTable extend Iceberg SparkTable, and SparkHiveTable extend Kyuubi HiveTable.

  • support row-level operations to iceberg Table

1. update tableName set c1=v1, c2=v2, ...

2. merge into targetTable t
   using sourceTable s
   on s.key=t.key
   when matched then ...
   when not matched then ...

3. delete from table where xxx

Why are the changes needed?

  1. For spark-connector in Iceberg, it explicitly uses SparkTable to identify whether it is an Iceberg table, so the SparkIcebergTable must extend SparkTable.

  2. support row-level operations to iceberg Table.

Fix: #2543

Does this PR introduce any user-facing change?

Yes, support update ... , merge into ..., delete from ...

How was this patch tested?

New ITs.

@caican00 caican00 changed the title [#2543] feat(spark-connector): support row-level operations to iceber… [#2543] feat(spark-connector): support row-level operations to iceberg Table May 1, 2024
@caican00 caican00 force-pushed the refactor-table-and-support-row-level branch 2 times, most recently from d66ff10 to 4d334aa Compare May 2, 2024 00:20
@caican00 caican00 force-pushed the refactor-table-and-support-row-level branch from 4d334aa to ae71559 Compare May 5, 2024 15:57
@caican00
Copy link
Contributor Author

caican00 commented May 6, 2024

Hi @FANNG1 could you help review this pr? thank you.

@caican00
Copy link
Contributor Author

caican00 commented May 7, 2024

all comments have been addressed. cc @FANNG1

@caican00 caican00 requested a review from FANNG1 May 7, 2024 10:04
@caican00
Copy link
Contributor Author

all comments have been addressed except one in discussion. cc @FANNG1

@FANNG1
Copy link
Contributor

FANNG1 commented May 11, 2024

@jerryshao @qqqttt123 do you have time to review? Overall LGTM except minor comment.

@caican00
Copy link
Contributor Author

caican00 commented May 11, 2024

@FANNG1 all comments have been addressed.

@qqqttt123
Copy link
Contributor

If this is a user-facing change, you should add the documents.

…can00/gravitino into refactor-table-and-support-row-level
@caican00
Copy link
Contributor Author

If this is a user-facing change, you should add the documents.

done.

@caican00
Copy link
Contributor Author

@FANNG1 could you help review again?

@FANNG1
Copy link
Contributor

FANNG1 commented May 13, 2024

@caican00 could you fix the conflict?

@FANNG1
Copy link
Contributor

FANNG1 commented May 13, 2024

@qqqttt123 do you have other comments?

@caican00
Copy link
Contributor Author

@caican00 could you fix the conflict?

done

@caican00
Copy link
Contributor Author

Hi @FANNG1 @qqqttt123 are there any else comments to fix?

@FANNG1
Copy link
Contributor

FANNG1 commented May 13, 2024

Hi @FANNG1 @qqqttt123 are there any else comments to fix?

LGTM,will merge the PR if no other comments this afternoon

@FANNG1 FANNG1 merged commit f207466 into datastrato:main May 13, 2024
22 checks passed
github-actions bot pushed a commit that referenced this pull request May 13, 2024
…g Table (#3243)

### What changes were proposed in this pull request?

- refactor table implementation, make `SparkIcebergTable` extend Iceberg
`SparkTable`, and `SparkHiveTable` extend Kyuubi `HiveTable`.

- support row-level operations to iceberg Table

```
1. update tableName set c1=v1, c2=v2, ...

2. merge into targetTable t
   using sourceTable s
   on s.key=t.key
   when matched then ...
   when not matched then ...

3. delete from table where xxx
```

### Why are the changes needed?

1. For spark-connector in Iceberg, it explicitly uses `SparkTable` to
identify whether it is an Iceberg table, so the `SparkIcebergTable` must
extend `SparkTable`.

2. support row-level operations to iceberg Table.

Fix: #2543

### Does this PR introduce any user-facing change?
Yes, support update ... , merge into ..., delete from ...

### How was this patch tested?
New ITs.
@FANNG1
Copy link
Contributor

FANNG1 commented May 13, 2024

@caican00 great process for supporting the Iceberg catalog, thanks for your work.

Copy link
Contributor

@qqqttt123 qqqttt123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

caican00 added a commit to caican00/gravitino that referenced this pull request May 14, 2024
… to iceberg Table (datastrato#3243)

### What changes were proposed in this pull request?

- refactor table implementation, make `SparkIcebergTable` extend Iceberg
`SparkTable`, and `SparkHiveTable` extend Kyuubi `HiveTable`.

- support row-level operations to iceberg Table

```
1. update tableName set c1=v1, c2=v2, ...

2. merge into targetTable t
   using sourceTable s
   on s.key=t.key
   when matched then ...
   when not matched then ...

3. delete from table where xxx
```

### Why are the changes needed?

1. For spark-connector in Iceberg, it explicitly uses `SparkTable` to
identify whether it is an Iceberg table, so the `SparkIcebergTable` must
extend `SparkTable`.

2. support row-level operations to iceberg Table.

Fix: datastrato#2543

### Does this PR introduce any user-facing change?
Yes, support update ... , merge into ..., delete from ...

### How was this patch tested?
New ITs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask] support row-level operations to iceberg Table
3 participants