Automation providing capsule-proxy kubeConfig to in-cluster GitOps automation #770

Open
erikgb opened this issue Jun 9, 2023 · 5 comments
Labels
blocked-needs-validation Issue need triage and validation

Comments

@erikgb

erikgb commented Jun 9, 2023

Describe the feature

We would like to use FluxCD (or other in-cluster GitOps tool) with Capsule to allow regular users to provision namespaces. First, we tried without the capsule-proxy, but it seems like the proxy is required for the Flux reconciling approach to work, and that's fair. Making the Flux tenant gitops-reconciler service account go through the proxy is possible, but currently requires manual cluster-admin work (ref. proxy-kubeconfig-generator in https://capsule.clastix.io/docs/guides/flux2-capsule/#the-recipe). This does not scale IMO, and I would like an automated solution to provide capsule-proxy kubeConfig.

Related issue: #608

What would the new user story look like?

As a cluster-admin I want automation to provide a capsule-proxy kubeConfig secret for in-cluster access to kube-apiserver through the proxy. This will allow me to avoid manual tasks for onboarding new tenants.

Expected behavior

Ref. https://capsule.clastix.io/docs/guides/flux2-capsule/#the-recipe, I think the UI could be as simple as annotating the tenant gitops-reconciler service account with the name of the secret the kubeConfig should be delivered to. Example:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: gitops-reconciler
  namespace: my-tenant
  annotations:
    capsule.clastix.io/kubeconfig-secret-name: gitops-reconciler-kubeconfig

This should make the automation (capsule-controller) deliver and maintain the kubeConfig secret:

apiVersion: v1
kind: Secret
metadata:
  name: gitops-reconciler-kubeconfig
  namespace: my-tenant
data:
  kubeconfig: ......
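
For illustration only, here is a minimal Python sketch of what the automation might have to produce for that Secret: a kubeconfig whose only cluster entry is the proxy endpoint, wrapped under the `kubeconfig` data key. The function names, the cluster and context names, and the proxy URL in the usage note are assumptions, not part of Capsule or capsule-proxy. The kubeconfig is serialized as JSON to stay within the standard library; JSON is also valid YAML.

```python
import base64
import json


def build_proxy_kubeconfig(proxy_url: str, ca_pem: bytes, token: str,
                           namespace: str, user: str = "gitops-reconciler") -> dict:
    """Return a kubeconfig (as a dict) that points at the proxy, not kube-apiserver."""
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{
            "name": "capsule-proxy",  # hypothetical cluster entry name
            "cluster": {
                "server": proxy_url,
                # kubeconfig expects the CA bundle base64-encoded
                "certificate-authority-data": base64.b64encode(ca_pem).decode("ascii"),
            },
        }],
        "users": [{"name": user, "user": {"token": token}}],
        "contexts": [{
            "name": "default",
            "context": {"cluster": "capsule-proxy", "user": user, "namespace": namespace},
        }],
        "current-context": "default",
    }


def kubeconfig_secret(name: str, namespace: str, kubeconfig: dict) -> dict:
    """Wrap the kubeconfig in a Secret manifest under the `kubeconfig` data key."""
    payload = json.dumps(kubeconfig).encode("utf-8")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        # Secret `data` values are base64-encoded
        "data": {"kubeconfig": base64.b64encode(payload).decode("ascii")},
    }
```

A controller could call `build_proxy_kubeconfig("https://capsule-proxy.capsule-system.svc:9001", ca_pem, token, "my-tenant")` (the URL is an assumed in-cluster address) and apply the result of `kubeconfig_secret("gitops-reconciler-kubeconfig", "my-tenant", cfg)`.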

erikgb added the blocked-needs-validation label on Jun 9, 2023
@prometherion
Member

I like the idea of automating the deployment of the kubeconfig.

Since we're talking about the capsule-proxy, BTW, I think this should be delegated to that project, since without it the FluxCD integration cannot work as expected.

@MaxFedotov @maxgio92 @bsctl @oliverbaehler WDYT?

@erikgb
Author

erikgb commented Jun 9, 2023

> Since we're talking about the capsule-proxy, BTW, I think this should be delegated to that project, since without it the FluxCD integration cannot work as expected.

That makes sense. I wasn't fully aware of capsule-proxy being a separate project. Sorry.

@erikgb
Author

erikgb commented Jun 10, 2023

After thinking a bit more about this, I think it makes more sense to focus on the secret and rely on a K8s feature, as this will probably make the machinery simpler and less error-prone. So my modified suggestion is the following.

The user creates the secret in which they want to receive the capsule-proxy kubeconfig. The secret is created as a request for a long-lived token as explained here, but with one additional annotation.

apiVersion: v1
kind: Secret
metadata:
  name: gitops-reconciler-kubeconfig
  namespace: my-tenant
  annotations:
    kubernetes.io/service-account.name: gitops-reconciler
    capsule.clastix.io/kubeconfig: generate
type: kubernetes.io/service-account-token

When the Kubernetes controller has done its job and added the token and related data (such as the CA certificate) to the secret, the Capsule controller can use the token to generate the kubeConfig and add it to the secret. The resulting secret might look like this:

kind: Secret
apiVersion: v1
metadata:
  name: gitops-reconciler-kubeconfig
  namespace: my-tenant
  annotations:
    capsule.clastix.io/kubeconfig: generate
    kubernetes.io/service-account.name: gitops-reconciler
    kubernetes.io/service-account.uid: 7dfa681c-f6a9-4b15-969a-e5e7ac834ace
data:
  ca.crt: ....
  kubeconfig: ...
  service-ca.crt: ...
  token: ...
type: kubernetes.io/service-account-token
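
The hand-off described above (Kubernetes fills in the token first, then the controller adds the kubeconfig) can be sketched as a small decision function. This is a sketch under assumptions: the function names are hypothetical, Secrets are modelled as plain dicts, and token rotation (regenerating a stale kubeconfig) is deliberately left out.

```python
import base64

GENERATE_ANNOTATION = "capsule.clastix.io/kubeconfig"  # annotation proposed in this issue
SA_TOKEN_TYPE = "kubernetes.io/service-account-token"


def needs_kubeconfig(secret: dict) -> bool:
    """True when the Secret opted in, the token is populated, and no kubeconfig exists yet."""
    annotations = secret.get("metadata", {}).get("annotations") or {}
    data = secret.get("data") or {}
    return (
        secret.get("type") == SA_TOKEN_TYPE
        and annotations.get(GENERATE_ANNOTATION) == "generate"
        and bool(data.get("token"))      # wait until Kubernetes has added the token
        and "kubeconfig" not in data     # do not regenerate an existing kubeconfig
    )


def read_token(secret: dict) -> str:
    """Decode the base64-encoded token stored in the Secret's data."""
    return base64.b64decode(secret["data"]["token"]).decode("utf-8")
```

A reconcile loop would skip the Secret while `needs_kubeconfig` is false and be triggered again by the update event in which Kubernetes writes the token.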

@maxgio92
Collaborator

maxgio92 commented Jun 10, 2023

Tracking

@prometherion I agree, we can move this issue to the capsule-proxy repository.

The proposal

Thank you @erikgb for this feedback and proposal.

I totally get your point. We had actually planned the automation that is currently missing, so that the integration can scale. So, I'm glad we're on the same page.

Kubeconfig setup

About the implementation: I like the idea of having the platform admin annotate the Tenant Owner ServiceAccount to let Capsule know that this Tenant Owner operates in-cluster and that its client needs a kubeconfig.

From Kubernetes 1.24 onward, the token Secret may be missing, since it is no longer generated automatically for service accounts. We need to decide what responsibility the Capsule Proxy admission controllers would have here: should they automatically create the long-lived token when it is missing?

I see the Secret, instead, as a storage format for that client config. So I wouldn't make the platform admin responsible for managing it.
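
To make the "create when missing" option concrete, here is a hedged Python sketch. It assumes the controller can list the Secrets in the tenant namespace (modelled as plain dicts); the function names are hypothetical. The manifest it returns follows the standard Kubernetes pattern for requesting a long-lived service account token.

```python
SA_NAME_ANNOTATION = "kubernetes.io/service-account.name"
SA_TOKEN_TYPE = "kubernetes.io/service-account-token"


def has_token_secret(secrets: list, service_account: str) -> bool:
    """Check whether any Secret is already a token Secret bound to the service account."""
    for secret in secrets:
        annotations = secret.get("metadata", {}).get("annotations") or {}
        if (secret.get("type") == SA_TOKEN_TYPE
                and annotations.get(SA_NAME_ANNOTATION) == service_account):
            return True
    return False


def token_secret_request(service_account: str, namespace: str, secret_name: str) -> dict:
    """Build the Secret that asks Kubernetes to issue a long-lived token."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {
            "name": secret_name,
            "namespace": namespace,
            "annotations": {SA_NAME_ANNOTATION: service_account},
        },
        "type": SA_TOKEN_TYPE,
    }


def ensure_token_secret(secrets: list, service_account: str,
                        namespace: str, secret_name: str):
    """Return the manifest to create, or None when a token Secret already exists."""
    if has_token_secret(secrets, service_account):
        return None
    return token_secret_request(service_account, namespace, secret_name)
```

Whether this step belongs in the admission controller, a reconciler, or with the user is exactly the open question raised here; the sketch only shows that the "missing" check itself is cheap.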

RBAC setup

In general, keep in mind that the ServiceAccount acts as a Tenant Owner robot account. Because of that, the documented RBAC is needed to let the Flux apply controllers operate with it within the specific Tenant scope. This needs to be considered when onboarding a GitOps-managed Tenant.

As the RBAC depends on the integration with Flux (e.g. the permission needed to impersonate), I think this automation should be part of an external project, like an addon.

What do you think?

@erikgb
Author

erikgb commented Jun 11, 2023

@maxgio92 I am pretty sure the machinery will be a lot simpler if we head for the secret, instead of the SA. By addressing the secret, we don't have to deal with a typical controller issue: two (or more) service accounts trying to "own" the same secret. Owner references and garbage collection will also be taken care of by something else.
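
The ownership problem mentioned above can be illustrated with a short Python sketch. It assumes the ServiceAccount-annotation variant from the original proposal and models ServiceAccounts as plain dicts; `conflicting_claims` is a hypothetical helper, not existing Capsule code.

```python
from collections import defaultdict

SECRET_NAME_ANNOTATION = "capsule.clastix.io/kubeconfig-secret-name"  # from the first proposal


def conflicting_claims(service_accounts: list) -> dict:
    """Map (namespace, secret name) to the service accounts that all claim that Secret."""
    claims = defaultdict(list)
    for sa in service_accounts:
        meta = sa.get("metadata", {})
        target = (meta.get("annotations") or {}).get(SECRET_NAME_ANNOTATION)
        if target:
            claims[(meta.get("namespace"), target)].append(meta.get("name"))
    # Only targets claimed by more than one service account are conflicts.
    return {key: sorted(names) for key, names in claims.items() if len(names) > 1}
```

With the Secret-centric variant this check is unnecessary, because each Secret names exactly one service account in its `kubernetes.io/service-account.name` annotation.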

I am not sure if I understand what RBAC you are referring to. The documentation is actually outdated for newer versions of Flux (after fluxcd/flux2#3566). No need to grant anything more than the Capsule default admin cluster role to tenant owners (including the Flux GitOps reconcile SA). I also don't see why the RBAC is relevant for this issue? Am I missing something?
