[#3337] improvement(hadoop-catalog): Support user impersonation for Hadoop catalog. #3352

Open
wants to merge 9 commits into
base: main

yuqi1129
Contributor

What changes were proposed in this pull request?

Add user impersonation for the Hadoop catalog.

Why are the changes needed?

We need authentication for the encrypted HDFS cluster.

Fix: #3337

Does this PR introduce any user-facing change?

N/A.

How was this patch tested?

Unit tests (to be added).

@yuqi1129 yuqi1129 marked this pull request as draft May 11, 2024 14:00
@yuqi1129 yuqi1129 marked this pull request as ready for review May 15, 2024 12:20
@yuqi1129 yuqi1129 self-assigned this May 20, 2024
@jerryshao
Collaborator

Is it ready for review?

@yuqi1129
Contributor Author

> Is it ready for review?

I'm afraid we need to add some tests against a real HDFS cluster, not just a mini cluster. If that is not a blocker, I think it's ready for review.

@jerryshao
Collaborator

We could have a separate PR for the e2e tests; using mock tests here to cover the logic should be enough.
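
A minimal sketch of what such a mock-based test could look like, covering the impersonation logic without a real HDFS or Kerberos setup. Everything here is an assumption for illustration: `Impersonator`, `RecordingImpersonator`, and `FilesetOps` are hypothetical names, not classes from this PR, Gravitino, or Hadoop. In real code the `doAs` seam would presumably wrap Hadoop's proxy-user mechanism; the test double only records which user the operation ran as.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class ImpersonationMockSketch {

  /** Hypothetical seam: runs an action on behalf of a user. Not a Gravitino or Hadoop API. */
  interface Impersonator {
    <T> T doAs(String user, Callable<T> action) throws Exception;
  }

  /** Test double: records the impersonated users instead of contacting Kerberos or HDFS. */
  static class RecordingImpersonator implements Impersonator {
    final List<String> users = new ArrayList<>();

    @Override
    public <T> T doAs(String user, Callable<T> action) throws Exception {
      users.add(user);
      return action.call();
    }
  }

  /** Hypothetical catalog operation; a real one would call the Hadoop FileSystem API. */
  static class FilesetOps {
    private final Impersonator impersonator;

    FilesetOps(Impersonator impersonator) {
      this.impersonator = impersonator;
    }

    String createFileset(String requestUser, String path) throws Exception {
      if (requestUser == null || requestUser.isEmpty()) {
        throw new IllegalArgumentException("requestUser must not be empty");
      }
      // The work is delegated through the seam so that it runs as the requesting user.
      return impersonator.doAs(requestUser, () -> "created " + path + " as " + requestUser);
    }
  }

  public static void main(String[] args) throws Exception {
    RecordingImpersonator fake = new RecordingImpersonator();
    FilesetOps ops = new FilesetOps(fake);

    String result = ops.createFileset("alice", "/tmp/fileset1");

    // The check a unit test would make: the operation ran as the requesting user.
    if (!fake.users.equals(List.of("alice"))) {
      throw new AssertionError("expected impersonation of alice, got " + fake.users);
    }
    System.out.println(result);
  }
}
```

Putting the impersonation call behind a small interface is one way to keep the unit test independent of the cluster; the e2e PR could then exercise the real implementation.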

@yuqi1129
Contributor Author

yuqi1129 commented May 21, 2024

The following point needs clarification:

  • After introducing a common module that handles Kerberos authentication, the Hive catalog will depend on hadoop3 (see the image below). Is that acceptable? I think it's a bit risky to use a different release version for the Hive catalog.
    [image]
  • If the first option is not desirable, I believe we may not be able to share code between the modules that use Kerberos authentication. Can we accept duplicating the Kerberos logic here?

@jerryshao
What's your opinion? I'm looking forward to your kind reply, thank you.

@jerryshao
Collaborator

jerryshao commented May 21, 2024

I would choose option 2 as the fallback. If there is a better solution that avoids code duplication without changing the Hadoop version, that would be preferable.

@yuqi1129
Contributor Author

@qqqttt123 @jerryshao
The code has been updated; please help review it when you have time. Thanks.

@yuqi1129
Contributor Author

@jerryshao
Please help review this PR; the Hadoop e2e test depends on it.

@yuqi1129 yuqi1129 closed this May 23, 2024
@yuqi1129 yuqi1129 reopened this May 23, 2024
@yuqi1129 yuqi1129 closed this May 24, 2024
@yuqi1129 yuqi1129 reopened this May 24, 2024
Development

Successfully merging this pull request may close these issues.

[Improvement] Authentication between the Hadoop catalog and Gravitino.
5 participants