ImagePullJob Support ECR (STS tokens) Repositories #866

Open
vikas027 opened this issue Dec 25, 2021 · 8 comments · May be fixed by #1383
Labels
help wanted Extra attention is needed kind/feature-request kind/good-idea Good Idea

Comments

@vikas027
vikas027 commented Dec 25, 2021

What would you like to be added:
I am trying to use an ECR image in an ImagePullJob, which fails with this error:

kruise-daemon-czjtm daemon W1225 08:08:02.435660       1 imagepuller_worker.go:328] Worker failed to pull image 111111111111.dkr.ecr.ap-southeast-2.amazonaws.com/traefik:v2.5.3, cost 15.097071799s, err: Error response from daemon: Head "https://111111111111.dkr.ecr.us-east-1.amazonaws.com/v2/traefik/manifests/v2.5.3": no basic auth credentials

My EKS nodes have IAM roles configured to pull images from the ECR repositories. It looks like the Go library (or libraries) is unable to use AWS STS tokens.

Why is this needed:
The workaround is to have a Kubernetes Docker secret with the token and then keep renewing it, but it would be better to use the IAM roles and STS tokens (just like native EKS workloads).
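For readers trying the secret-based workaround, here is a minimal sketch of its core step: turning an ECR authorization token into the payload of a `kubernetes.io/dockerconfigjson` secret. It assumes the token was already fetched (for example with `aws ecr get-authorization-token`) and relies on the fact that ECR tokens are the base64 encoding of `AWS:<password>`. The function name `dockerconfig_from_ecr_token` is hypothetical and not part of Kruise or any AWS SDK.

```python
import base64
import json


def dockerconfig_from_ecr_token(registry: str, authorization_token: str) -> str:
    """Return the .dockerconfigjson payload for one registry (hypothetical helper)."""
    # An ECR authorization token decodes to "AWS:<password>".
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    username, _, password = decoded.partition(":")
    if username != "AWS" or not password:
        raise ValueError("unexpected ECR authorization token format")

    # The "auth" field is base64("<username>:<password>"), as Docker expects.
    auth = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    config = {
        "auths": {
            registry: {"username": username, "password": password, "auth": auth}
        }
    }
    return json.dumps(config)
```

Because ECR tokens expire after 12 hours, whatever creates this secret has to run again on a schedule, which is the renewal burden described above.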

@FillZpp
Member

FillZpp commented Dec 27, 2021

IMHO, Kruise should not care about cloud providers. It just works on pure Kubernetes API.

The workaround is to have a kubernetes docker secret with the token and then keep on renewing it

I think EKS should have the ability to manage and renew the secret for its users.

@vikas027
Author

I think EKS should have the ability to manage and renew the secret for its users.

Workloads on EKS do not need secrets by default; they just work as long as the nodes have the proper IAM policies. EKS already takes care of this: whenever we use a Docker image (like 111111111111.dkr.ecr.ap-southeast-2.amazonaws.com/traefik:v2.5.3), nodes can pull it without worrying about repository/registry authentication. My idea was to use the same or similar logic in Kruise too. It is just a good customer experience, IMHO.
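As a rough illustration of what "similar logic" would need first, the sketch below recognises an ECR image reference and extracts the account ID and region from its hostname. This is an assumption about how such detection could look, not how Kruise or the kubelet actually does it. The names `parse_ecr_image` and `EcrRegistry` are hypothetical, and the pattern only covers the standard `amazonaws.com` hostname form seen in this issue.

```python
import re
from typing import NamedTuple, Optional

# Matches hosts like 111111111111.dkr.ecr.ap-southeast-2.amazonaws.com.
# Other hostname forms (for example China or FIPS endpoints) are not handled.
_ECR_HOST = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com$"
)


class EcrRegistry(NamedTuple):
    account_id: str
    region: str


def parse_ecr_image(image: str) -> Optional[EcrRegistry]:
    """Return the ECR account and region for an image, or None if it is not ECR."""
    # The registry host is everything before the first "/".
    host = image.split("/", 1)[0]
    match = _ECR_HOST.match(host)
    if match is None:
        return None
    return EcrRegistry(match.group("account"), match.group("region"))
```

A puller could call this before each pull and request registry credentials from AWS only when the result is not `None`.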

As I said earlier, I already have created a workaround which is to populate and renew the secret at regular intervals.

@FillZpp
Member

FillZpp commented Dec 27, 2021

nodes can pull the docker image without worrying about the repository/registry authentication

I'm not sure how it works... Maybe they have modified the code of the EKS kubelet so that it can pull images from ECR with IAM roles configured somewhere. Is there any documentation for this?

@FillZpp FillZpp added this to Proposal in Roadmap via automation Dec 27, 2021
@abatilo
abatilo commented Mar 4, 2023

Hi all. Is there something that I can do to help push this along?

@furykerry furykerry assigned veophi and unassigned jian-he Mar 6, 2023
@furykerry
Member

Hi all. Is there something that I can do to help push this along?

Can you help us identify how EKS injects tokens into the pod? For example, can you check the created pod YAML and see whether pull secrets are injected automatically by some webhook?

@abatilo
abatilo commented Mar 6, 2023

@furykerry I can definitely help with that!

The native way that EKS supports pod identity is with a feature that they call IAM roles for service accounts. The way this works is that you add an annotation to a kind: ServiceAccount, then use that kind: ServiceAccount on the pod spec.

The EKS control plane then implements the webhook injection with this aws/amazon-eks-pod-identity project, which will inject a kind: Secret with the kind: ServiceAccount token, and will inject environment variables into the pod spec.

The AWS SDKs for every programming language look to see if certain environment variables exist, and if they do, they use the service account token and the identity of the pod to fetch AWS credentials, which can then be used to call the AWS API.
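The environment-variable lookup described above can be sketched roughly as follows. This is a simplified illustration, not the SDKs' actual code. It assumes the two variables the webhook sets are `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`. The names `load_web_identity` and `WebIdentityConfig` are hypothetical.

```python
import os
from typing import Mapping, NamedTuple, Optional


class WebIdentityConfig(NamedTuple):
    role_arn: str
    token: str


def load_web_identity(
    env: Mapping[str, str] = os.environ,
) -> Optional[WebIdentityConfig]:
    """Return the role ARN and service account token, or None if not configured."""
    role_arn = env.get("AWS_ROLE_ARN")
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if not role_arn or not token_file:
        # Not running with IAM roles for service accounts.
        return None

    # The token file holds the projected service account token.
    with open(token_file, encoding="utf-8") as f:
        token = f.read().strip()
    return WebIdentityConfig(role_arn, token)
```

An SDK would then exchange the role ARN and token for temporary credentials through the STS `AssumeRoleWithWebIdentity` call. That step is left out here because it needs network access.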

Please let me know if any part of that doesn't make sense!

@abatilo
abatilo commented Mar 7, 2023

To anyone who comes across this topic, here's the Terraform that I wrote to automate fetching an ECR token that OpenKruise can use for things like an ImagePullJob:

https://gist.github.com/abatilo/d9234c85c420d08688fd353c591fef6d

@furykerry
Member

@zmberg zmberg added help wanted Extra attention is needed kind/good-idea Good Idea labels Jun 1, 2023
@zmberg zmberg changed the title [feature request] - Support ECR (STS tokens) Repositories [GLCC] - Support ECR (STS tokens) Repositories Jun 5, 2023
@zmberg zmberg moved this from Proposal to To do in Roadmap Jun 8, 2023
@zmberg zmberg changed the title [GLCC] - Support ECR (STS tokens) Repositories Support ECR (STS tokens) Repositories Jun 19, 2023
@zmberg zmberg changed the title Support ECR (STS tokens) Repositories ImagePullJob Support ECR (STS tokens) Repositories Jul 13, 2023
@Kuromesi Kuromesi linked a pull request Aug 23, 2023 that will close this issue
@zmberg zmberg modified the milestones: 1.8, 1.9 Mar 25, 2024
Projects
Roadmap
  
To do
7 participants