Optimize for BYO cluster: create outside process to manage the creation of our KSA-bound cloud IAM principals (sci-${provider}) #178

Open
brandonjbjelland opened this issue Aug 10, 2023 · 2 comments

Comments

@brandonjbjelland
Contributor

brandonjbjelland commented Aug 10, 2023

What's the issue?

We want to unbundle some parts of the AWS and GCP install process so that a user bringing their own cluster can identify the right entrypoint and set up the rest of their environment to get started using Substratus. The IAM principal and role bindings are a good place to start. The bucket and registry also fit in this camp. Daemonsets and other cluster-wide dependencies are a final category.

Why make this change?

Discussed here.

This change would allow us to optimize for users who bring a well-configured cluster, unlikely as that might be (node pools/groups, daemonsets, an appropriate storage driver).

As a larger point, I think this makes a case for unbundling many parts of the provider spin-up process so that those bits become reusable to folks entering at different points. The Helm installs of nvidia-device-plugin and Karpenter on AWS are already well-positioned to be broken out as independent install scripts and called by aws-up.sh.

Related: #112

@BOsterbuhr

Let's use me as one extreme example: I have an EKS cluster, S3 bucket, RDS instance, Karpenter, and nvidia-device-plugin installed. Is there any more infrastructure I need to provision, or is IAM-related configuration all that's left before I can apply the config namespace and system yaml files?

@samos123
Contributor

That's the end goal here, but note that we're not there yet. You're spot on about what you would need, except for the RDS instance: we don't use RDS and aren't planning to in the short term. You would indeed only need to create an IAM role with enough permissions and allow the K8s SA to assume that role. Brandon and I are working on implementing this proposal for both AWS and GCP: https://github.com/substratusai/substratus/blob/main/docs/proposals/operator-managed-infra.md
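
For anyone on EKS wanting to see roughly what "allow the K8s SA to assume that role" looks like, here is a minimal sketch that builds the trust policy document for an IAM role using the standard IAM-roles-for-service-accounts pattern. The account ID, OIDC issuer, namespace (`substratus`), and ServiceAccount name (`sci`) are all placeholder assumptions, not values confirmed by the project, and the permissions the role actually needs are not covered here.

```python
import json


def irsa_trust_policy(account_id, oidc_issuer, namespace, service_account):
    """Return an IAM trust policy that lets one Kubernetes ServiceAccount
    assume a role through the cluster's OIDC provider."""
    # IAM identifies the provider by the issuer URL without the scheme.
    provider = oidc_issuer
    if provider.startswith("https://"):
        provider = provider[len("https://"):]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to a single namespace/ServiceAccount pair.
                        f"{provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # All four arguments are hypothetical placeholders.
    policy = irsa_trust_policy(
        account_id="111122223333",
        oidc_issuer="https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE",
        namespace="substratus",
        service_account="sci",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON could be saved to a file and used as the role's trust (assume-role) policy document. The ServiceAccount would then typically carry an `eks.amazonaws.com/role-arn` annotation pointing at the role.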
