
Provide a method for attaching sidecars (or patching both the Deployment and Job simultaneously) #249

Open
jawnsy opened this issue Aug 20, 2023 · 3 comments

Comments

@jawnsy
Contributor

jawnsy commented Aug 20, 2023

Summary

Provide some means of attaching a sidecar to both the Deployment and Job.

Background

Google Cloud SQL supports encrypted connections and IAM authentication through the Cloud SQL Auth Proxy, which typically runs as a sidecar container.

The recommended deployment method is a sidecar container, because the proxy itself does not authenticate clients: anyone who can connect to the proxy inherits the credentials that the proxy holds, so running it as a sidecar is the safest way to ensure that only the authorized workload can connect through it.

Workarounds

  • I think this is doable with patches to both the Deployment and Job, but it is tedious because the same sidecar definition has to be written twice: once to patch the cluster Deployment, and again to patch the migration Job (see the sketch after this list)
  • If using a connection pooler (e.g. PgBouncer), then SpiceDB can connect to the pooler (with authentication) and the pooler can forward connections to the proxy (running as a sidecar)
  • We can run the proxy as an independent Deployment and use a NetworkPolicy to restrict access, but this is risky as not all CNI plugins will enforce NetworkPolicy
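
For reference, the duplication described in the first workaround would look roughly like the following strategic merge patch, applied once against the SpiceDB Deployment and once against the migration Job. This is only a sketch: the proxy flags and the instance connection name are placeholders, and how the patch gets applied (kustomize, operator support, etc.) is exactly the open question here.

    # Sidecar definition that currently has to be maintained in two patches
    # (one for the Deployment, one for the Job). Values below are placeholders.
    spec:
      template:
        spec:
          containers:
          - name: cloud-sql-proxy
            image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.1
            args:
            - "--private-ip"
            - "my-project:my-region:my-instance"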
@jawnsy
Contributor Author

jawnsy commented Aug 20, 2023

Even if you patch things, the migration job does not quite work correctly, because:

  1. The migration container expects the database to be reachable immediately, but the Cloud SQL Auth Proxy will not be ready while it is still starting up. One solution is for the migration container to retry every few seconds until it succeeds (see the sketch after this list), though Kubernetes will detect the migration container as "crashed" and restart it anyway.
  2. The proxy container will still be running after the migration container exits, so the job will not complete
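
For the first problem, a retry wrapper might look something like this. Purely illustrative: it assumes the migration image ships a shell and nc (which may not be true of the distroless SpiceDB image), and the port and migrate command are placeholders.

    # Hypothetical override of the migration container's command: poll the proxy
    # until it accepts connections, then run the migration.
    command: ["/bin/sh", "-c"]
    args:
    - |
      until nc -z 127.0.0.1 5432; do echo "waiting for cloud-sql-proxy"; sleep 2; done
      spicedb migrate head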

So, for now, perhaps the best option is to use a username/password for database authentication.

@ecordell
Contributor

ecordell commented Aug 25, 2023

This isn't an option for most kube clusters in the wild just yet, but https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#api-for-sidecar-containers I think would at least make patching the job work for this. Any chance you're on a cluster that you can enable Alpha features on?
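
For concreteness, the shape that API gives you is roughly the following (a sketch assuming Kubernetes 1.28+ with the SidecarContainers feature gate enabled; the image names and args are placeholders):

    spec:
      template:
        spec:
          initContainers:
          - name: cloud-sql-proxy
            image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.1
            restartPolicy: Always   # marks this init container as a native sidecar
          containers:
          - name: migrate
            image: authzed/spicedb   # placeholder
            args: ["migrate", "head"]
          # With a native sidecar, the Job can complete as soon as the migrate
          # container exits; the sidecar is shut down automatically.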

The proxy container will still be running after the migration container exits, so the job will not complete

This is interesting. The cloud-sql-proxy-operator (https://github.com/GoogleCloudPlatform/cloud-sql-proxy-operator) supports injecting the proxy into Jobs, but I don't see how anyone can use that feature.

A hacky way could be a timeout on the SQL proxy container: give the proxy one minute to run migrations and then exit 0, so that the migration container controls overall Job success (but if your data gets very large you might need to play with that number).
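
Spelled out, that hack might look something like this. Entirely hypothetical: it assumes a proxy image variant that includes a shell (the default image is distroless), assumes the binary lives at /cloud-sql-proxy, and the 60-second budget and instance name are just examples.

    - name: cloud-sql-proxy
      image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.1-alpine   # shell-bearing variant (assumed)
      command: ["/bin/sh", "-c"]
      # Stop the proxy after 60s and exit 0 so only the migration container
      # determines whether the Job succeeds.
      args:
      - timeout 60 /cloud-sql-proxy "my-project:my-region:my-instance" || true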

I did find this write-up: https://medium.com/teamsnap-engineering/properly-running-kubernetes-jobs-with-sidecars-ddc04685d0dc which suggests sharing the process namespace between the containers and killing the proxy process when the primary container completes. That's an option, but it seems like a lot of work to replace something that's already built in to newer versions of kube.
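
For comparison, the rough shape of that write-up's approach (again just a sketch: it needs a shell and pkill in the migration image, plus sufficient permissions to signal across containers; names, images, and commands are placeholders):

    spec:
      template:
        spec:
          shareProcessNamespace: true   # containers can see each other's processes
          containers:
          - name: migrate
            image: authzed/spicedb   # placeholder; assumes /bin/sh and pkill exist
            command: ["/bin/sh", "-c"]
            # Run the migration, then signal the proxy so the pod can finish.
            args:
            - spicedb migrate head; pkill -INT cloud-sql-proxy
          - name: cloud-sql-proxy
            image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.1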

@adamstrawson

If it helps at all, we use cloud-sql-proxy sidecars for various migrations. They rely on the quitquitquit endpoint (a convention that is becoming more common) to tell the proxy container to shut down once the migration finishes.

As an example, in our case the cloud-sql-proxy sidecar container looks like:

      - name: cloud-sql-proxy
        image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.1
        args:
        <snip>
        - "--quitquitquit"
And then on the service (which is set up with Helm):

      automigration:
        enabled: true
        customCommand: [/bin/sh, -c]
        customArgs:
        - migrate sql -e --yes; wget --post-data '{}' -O /dev/null -q http://127.0.0.1:9091/quitquitquit

(Only wget is available in this container; not ideal, but we're working with what we have available.)
