Improve git credential handling in Kubernetes #1577

sgaist · 2022-11-17T16:02:51Z

The git credentials are now stored in a secret following recommended best practice.

The secret will have the same lifespan as the builder pod.

sgaist · 2022-11-17T16:05:19Z

One note: we might also want to consider making the pod owner of the secret which would automatically trigger its deletion along side the pod.

However, if for some reason the pod fails to be created the secret would remain so there's a need to clean it up manually in that case.

binderhub/build.py

consideRatio

A chart managed secret / a dynamically created secret?

Is the git credentials the same for all build pods? If it is, we can just have a Helm chart template create a k8s Secret with a key in it representing the credentials.

Single / Multiple k8s Secrets

Have it been considered to use just one single secret that has several keys, one per build pod? Could be better, could be worse. It would feel good to know we have reflected on it a bit.

consideRatio

binderhub/binderhub/build.py

Line 79 in eb0d9ee

git_credentials = Unicode(

I think the way to go is a much smaller change, where the pod we dynamically create just references a Helm chart managed k8s Secret. If this is to create a new k8s Secret template, or to reference an existing k8s Secret, is not something I've explored yet.

The k8s Secret needs to be located in the same namespace as the build pods, but a Helm chart can specify a custom namespace if needed in a k8s resource's metadata.

sgaist · 2022-11-18T10:09:07Z

@consideRatio Thanks for your suggestions. To answer your questions/remarks:

Currently the credentials are static by provider but my implementation was done so on purpose: it takes into account #1169 (Allow dynamic repository credentials for authenticated Binderhub instances) that I am interested in as well.

###A chart managed secret / a dynamically created secret?

If we take the static credential use case, having one single chart managed secret is a good idea. The git_credentials property could be nuked in favor of setting the key matching the repo provider.

If we take the #1169 use case into account, we would need to ensure that we cleanup the content of the secret to avoid keeping tokens (and a growing secret) for more time than necessary (i.e. user session length + builder lifetime).

From the top of my head, this could be done in the following way:

Start build
Update secret with token using pod name as key
Run build
Update secret and remove the token

The thing we need to determine now is how to differentiate a static credential VS a user provided one.

###Single / Multiple k8s Secrets

In the static case, I think one single secret would make sense.

For the #1169 use case, it's a different story. We would likely need to benchmark things a bit. Kubernetes secrets cannot be larger than 1MIB but one should also not create too many little Secrets in a given namespace as it could also exhausts memory. The documentation also suggest to use resource quotas in order to limit their number.

consideRatio · 2022-11-18T12:44:46Z

Ah okay!

This fix becomes more complicated if planning for #1169 at the same time, to support runtime decided git credentials, but we can go for it if you want. I'd appreciate if another maintainer could opinie about the questions below though, as I'm not comfortable approve/merging this without further agreement otherwise.

Question 1: Is there agreement to add support for runtime decided git credentials?

Doing it imples managing k8s Secrets via the binderhub software itself, as compared to the BinderHub Helm chart.

Question 2: Should we support only runtime decided git credentials, or a mix?

If using binderhub software managed k8s Secrets by default, it means that we take some performance hit as pod must wait for the k8s Secret creation, which follows the pod creation in sequence, and burden the k8s api-server a bit by additional requests. It may be miniscule though. So, if we switch to this where its not used, this is a compromise even though it may possibly be very small. At the same time, if we start supporting both options with a single Helm chart managed k8s Secret and runtime created k8s Secrets, we increase the complexity of configuration/documentation and the complexity of the code base as a whole.

Implementation criterias for runtime decided git credentials

Use ownerReferences
This seems like the only robust way to ensure that we cleanup secondary resources without needing to track another type of k8s resource. There is an implementation for this in kubespawner already, there may be room for improvements in how its implemented but a key implementation detail is that the k8s Secret should be created after the Pod.
Use of typical labels for created resources
Any resources created should have a few labels to indicate who created it. Helm charts creating resources should have labels like these, where its not obvious what we should set when its resources created by software, which in turn was installed by a Helm chart. I'm not 100% on what values we provide, or if we should set the helm.sh/chart label as its indirectly created only, but we should at least provide some labels indicating this origin I think.

consideRatio · 2022-11-18T12:53:50Z

My own response to the questions above:

Q1: I think allowing for runtime decided git credentials is acceptable to support, that the complexity added is sufficiently motivated, but I'm not fully confident about this.
Q2: I think I'd rather see a single flexible approach that then must be found acceptable for everyone, even for those not using this, as that would reduce the complexity of the system as a whole. I'm worried otherwise that this strategy is used by very few, and we need to ask "oh are you using ... or not?" and giving special treatments to this when testing and trying to come to the botton of a reported issue etc.

@minrk you worked to provide the kubespawner implementation dynamically creating k8s Secret resources for internal_ssl, what do you think?

sgaist · 2022-11-18T20:37:55Z

On the implementation side, the secret can be created before the pod and the ownership applied through patching. I did a small test locally. It can be done as soon as the pod enters the pending state.

I haven't benchmarked it but it would likely reduce the performance hit of having the pod waiting on the secret presence to be properly started.

The downside is that if the pod creation fails the secret needs to be manually cleaned up.

consideRatio · 2022-11-18T23:43:02Z

The downside is that if the pod creation fails the secret needs to be manually cleaned up.

Yepp, a quite significant downside i think. I'd rather make sure we dont emit credentials we fail to clean up than optimize performance to this degree - based on guessed impact scale.

The git credentials are now stored in a temporary secret following recommended best practice. The secret will have the same lifespan as the builder pod.

This makes the code simpler and lets k8s encode the data

The final implementation comes from the following design decision: Only create the secret once the pod is ready so we reduce the surface were sensitive data is managed and avoid its management if the pod fails for some reason.

sgaist · 2022-12-16T09:48:09Z

@consideRatio I was wondering one thing: should the secret creation code be put in its own function ?

manics · 2022-12-17T20:18:48Z

Note the git credential is also needed by BinderHub before the build pod is created, since BinderHub checks the registry for an existing image so the Git SHA is required:

binderhub/binderhub/builder.py

Lines 269 to 297 in 8351995

    
           try: 
        
               provider = self.get_provider(provider_prefix, spec=spec) 
        
           except Exception as e: 
        
               app_log.exception("Failed to get provider for %s", key) 
        
               await self.fail(str(e)) 
        
               return 
        
           if provider.is_banned(): 
        
               await self.emit( 
        
                   { 
        
                       "phase": "failed", 
        
                       "message": f"Sorry, {spec} has been temporarily disabled from launching. Please contact admins for more info!", 
        
                   } 
        
               ) 
        
               return 
        
           repo_url = self.repo_url = provider.get_repo_url() 
        
           # labels to apply to build/launch metrics 
        
           self.repo_metric_labels = { 
        
               "provider": provider.name, 
        
               "repo": repo_url, 
        
           } 
        
           try: 
        
               ref = await provider.get_resolved_ref() 
        
           except Exception as e: 
        
               await self.fail(f"Error resolving ref for {key}: {e}") 
        
               return

This may complicated things if implementing dynamic per-user/repo Git credentials. I've hit a similar issue whilst trying to disentangle another property:
#1594

If we go down the route of dynamic docker config it would make sense to keep these two aligned, so if you have any thoughts on that issue please share them!

manics · 2023-02-11T18:45:47Z

I've been thinking about this because I want to get temporary registry credentials into the build image to push to ECR (push tokens are time-limited so a static secret won't work #1623).

Using secrets is definitely best practice, but on balance I think managing dynamic secrets overcomplicates things for BinderHub due to the clean-up issue- it's a shame you can't atomically create a secret+pod. ownerReferences may help, but there's a circular dependency between the pod and secret and therefore a race condition.

In the worst case the implementation in https://github.com/jupyterhub/kubespawner/blob/0f22abf096459b5424325fa93ff4b30bef568391/kubespawner/spawner.py#L2714-L2732 will lead to N orphaned secrets where N = number of users, whereas in Binderhub the secrets could be per-user-per-repo so there's an unlimited number.

If we want to pursue this I think a compromise of using a Helm chart managed secret for global credentials, and plain env-vars for dynamic credentials, is reasonable. It requires additional logic in the build class to choose between mounting the global secret vs ignoring it and setting an environment variable, but we'll need additional logic to support dynamic credentials anyway. The small risk of using env vars can be mitigated by ensuring any dynamic credentials are time limited- they most likely will be anyway, e.g. OAuth tokens expire, as do ECR registry credentials. You could probably make the case that a dynamic expiring env var is more secure than a one year old static secret 😃

Finally if designed correctly this will be an implementation detail so there's nothing stopping us from switching in future.

sgaist · 2023-02-17T09:34:46Z

Related to your post from past December (that I was sure I answered to so sorry for the delay):
Where you writing in the context of pushing the image back to the same repository as the code comes from ? Otherwise, it's one unique token that could be stored as a static secret managed by helm.

For your new setup, it sounds a bit like the forge use case. In between, one related thing I have discovered that could be of interest: the Secrets Store CSI driver. Note that I haven't tested it yet.
The goal of this project is to move the secrets into a dedicated strong storage and the pods use a dedicated provider to retrieve them.
There's an AWS provider however I currently don't know if it would be usable for your use case.
While interesting, it adds a new layer of indirection which needs to be evaluated first.

As for the orphaned secrets, I am wondering whether having a cleanup CronJob would help in that case ? Note that the current implementation should properly clean everything in case of failure.

sgaist force-pushed the improve_git_credential_handling branch from 186c740 to eb0d9ee Compare November 17, 2022 16:46

consideRatio reviewed Nov 17, 2022

View reviewed changes

binderhub/build.py Outdated Show resolved Hide resolved

consideRatio reviewed Nov 17, 2022

View reviewed changes

consideRatio requested changes Nov 17, 2022

View reviewed changes

consideRatio added the maintenance Under the hood improvements and fixes label Nov 18, 2022

consideRatio mentioned this pull request Nov 18, 2022

new OVH cluster jupyterhub/mybinder.org-deploy#2414

Merged

9 tasks

sgaist added 3 commits December 5, 2022 17:20

refactor: improve git credential handling in Kubernetes

55bbdb6

The git credentials are now stored in a temporary secret following recommended best practice. The secret will have the same lifespan as the builder pod.

refactor: using stringData field to setup secret

b5eef26

This makes the code simpler and lets k8s encode the data

refactor: make the pod owner of the secret once created

274e525

sgaist force-pushed the improve_git_credential_handling branch from d62951e to fc24709 Compare December 5, 2022 16:21

refactor: move the secret creation after the pod is created

a81abe3

The final implementation comes from the following design decision: Only create the secret once the pod is ready so we reduce the surface were sensitive data is managed and avoid its management if the pod fails for some reason.

sgaist force-pushed the improve_git_credential_handling branch from fc24709 to a81abe3 Compare December 5, 2022 20:14

manics mentioned this pull request Dec 17, 2022

buildDockerConfig is broken, binder-build-docker-config has multiple uses #1594

Open

manics mentioned this pull request Jun 27, 2023

Support dynamic push credentials obtained from registry #1724

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve git credential handling in Kubernetes #1577

Improve git credential handling in Kubernetes #1577

sgaist commented Nov 17, 2022

sgaist commented Nov 17, 2022

consideRatio left a comment •

edited

consideRatio left a comment

sgaist commented Nov 18, 2022

consideRatio commented Nov 18, 2022

consideRatio commented Nov 18, 2022

sgaist commented Nov 18, 2022

consideRatio commented Nov 18, 2022 •

edited

sgaist commented Dec 16, 2022

manics commented Dec 17, 2022

manics commented Feb 11, 2023 •

edited

sgaist commented Feb 17, 2023

Improve git credential handling in Kubernetes #1577

Are you sure you want to change the base?

Improve git credential handling in Kubernetes #1577

Conversation

sgaist commented Nov 17, 2022

sgaist commented Nov 17, 2022

consideRatio left a comment • edited

Choose a reason for hiding this comment

A chart managed secret / a dynamically created secret?

Single / Multiple k8s Secrets

consideRatio left a comment

Choose a reason for hiding this comment

sgaist commented Nov 18, 2022

consideRatio commented Nov 18, 2022

Question 1: Is there agreement to add support for runtime decided git credentials?

Question 2: Should we support only runtime decided git credentials, or a mix?

Implementation criterias for runtime decided git credentials

consideRatio commented Nov 18, 2022

sgaist commented Nov 18, 2022

consideRatio commented Nov 18, 2022 • edited

sgaist commented Dec 16, 2022

manics commented Dec 17, 2022

manics commented Feb 11, 2023 • edited

sgaist commented Feb 17, 2023

consideRatio left a comment •

edited

consideRatio commented Nov 18, 2022 •

edited

manics commented Feb 11, 2023 •

edited