Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation/support for Google Cloud SQL Auth Proxy or Connector #348

Open
jawnsy opened this issue Dec 18, 2022 · 2 comments
Open

Documentation/support for Google Cloud SQL Auth Proxy or Connector #348

jawnsy opened this issue Dec 18, 2022 · 2 comments
Labels

Comments

@jawnsy
Copy link
Contributor

jawnsy commented Dec 18, 2022

Summary

Support the Cloud SQL Auth Proxy or Connector.

Background

The current setup instructions for Cloud SQL uses password authentication:

CREATE USER pganalyze WITH PASSWORD 'waJzOTWiQQjCZLr4' CONNECTION LIMIT 5;

Cloud SQL instances that use the (somewhat misleadingly-named) Require SSL setting do not support password authentication; instead, mutual TLS with client certificates must be used instead (from the referenced page):

Enforcing SSL/TLS encryption as described below will make mutual TLS required for all clients

The CIS Benchmark for Google Cloud Platform, version 1.3.0, also recommends enabling this flag, so anyone running compliance scans will see this as a finding.

For anyone using Require SSL, it is often convenient to use the Cloud SQL Auth Proxy, which handles generating and rotating ephemeral certificates using an IAM identity. This can run as a sidecar and is compatible with pganalyze-collector as it exists today. Additionally, enabling the Connector Enforcement feature requires connections to the database to use the proxy or connector library.

While it is still possible to use password authentication with the auth proxy, it is less convenient than using IAM authentication, and IAM permissions are necessary anyways to issue the ephemeral certificate and establish a connection in the first place.

For customers running Cloud SQL with IAM Authentication, and running the pganalyze-collector using Cloud Run or GKE with Workload Identity, it is convenient to use IAM authentication, since it means that passwords will not need periodic rotation, and it also provides a stronger form of workload identity.

Approach 1: Cloud SQL Auth Proxy

When running in Cloud Run, Google has an option to enable connections to databases using its private address. I do not have experience using this method, but I'd expect that it effectively runs a cloud sql proxy inside the application container, so I'd expect it to support the same authentication (using the IAM service account name, e.g. pganalyze-collector@project.iam

When running in GKE, users can follow the Workload Identity steps to create a service account, and grant it access to:

  • Watch and pull logs from the Pub/Sub topic used for log insights (requires Pubsub Subscriber on the subscription; attaching to the subscription frequently also requires PubSub Viewer permissions, so I granted that to the account as well))
  • Connect to the database and authenticate with IAM (requires Cloud SQL Instance User and Cloud SQL Client permissions on the project and adding the service account to database instance access list)

Rough instructions:

  1. Create a Google service account

    gcloud iam service-accounts create GSA_NAME \
      --project=GSA_PROJECT
    
  2. Create an IAM binding on the project to grant Cloud SQL permissions for the service account - particularly Cloud SQL Instance User (required for IAM authentication to the database) and Cloud SQL Client (required to use the proxy/connector to issue certificates):

    gcloud projects add-iam-policy-binding PROJECT_ID \
      --member "serviceAccount:GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com" \
      --role "ROLE_NAME"
    

    where ROLE_NAME should be roles/cloudsql.instanceUser and then roles/cloudsql.client (you will need to run the above command twice)

    Documentation: https://cloud.google.com/sdk/gcloud/reference/projects/add-iam-policy-binding

  3. Create an IAM binding on the Pub/Sub Subscription object to grant Pub/Sub permissions for the service account - particularly Pub/Sub Subscriber and Pub/Sub Viewer.

    gcloud pubsub subscriptions add-iam-policy-binding SUBSCRIPTION \
      --member GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com" \
      --role "ROLE_NAME"
    

    where ROLE_NAME should be roles/pubsub.subscriber and then roles/pubsub.viewer (sometimes Subscriber is not necessary, depending on what the code is doing, but I have not checked the source nor tried it yet; Viewer seems safe enough to grant anyways)

Documentation: https://cloud.google.com/sdk/gcloud/reference/pubsub/subscriptions/add-iam-policy-binding

  1. Create a Kubernetes service account with an appropriate annotation to use this identity (note: this assumes you already did the steps of configuring the workload identity pool and enabling workload identity on your GKE cluster):

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: pganalyze
      annotations:
        iam.gke.io/gcp-service-account: "GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com"
  2. Create a Deployment using this service account, which includes the proxy as a sidecar; these are roughly the settings we're using:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: pganalyze
    spec:
      selector:
        matchLabels:
          app: pganalyze
      revisionHistoryLimit: 0
      replicas: 1
      template:
       metadata:
         labels:
           app: pganalyze
           version: v1
       spec:
         serviceAccountName: pganalyze
         containers:
           - image: quay.io/pganalyze/collector:stable
             name: pganalyze
             env:
               - name: GCP_PUBSUB_SUBSCRIPTION
                 value: projects/PROJECT_ID/subscriptions/SUBSCRIPTION_ID
               - name: GCP_CLOUDSQL_INSTANCE_ID
                 value: project-id:us-east1:instance
               - name: PGA_API_KEY
                 value: xxx
               - name: DB_USERNAME
                 value: GSA_NAME@GSA_PROJECT.iam
               - name: DB_HOST
                 value: 127.0.0.1
               - name: DB_PORT
                 value: "5432"
               - name: DB_NAME
                 value: postgres
             resources:
               requests:
                 cpu: 250m
                 memory: 250Mi
               limits:
                 cpu: 500m
                 memory: 500Mi
           - name: cloud-sql-proxy
             image: gcr.io/cloudsql-docker/gce-proxy:1.33.1-alpine
             command:
               - "/cloud_sql_proxy"
               - "-enable_iam_login"
               - "-ip_address_types=PRIVATE"
               - "-structured_logs"
               - "-term_timeout=10s"
               - "-use_http_health_check"
               - "-health_check_port=8090"
               - "-instances=project-id:us-east1:instance=tcp:5432"
             resources:
               requests:
                 cpu: 20m
                 memory: 32Mi
               limits:
                 cpu: 100m
                 memory: 128Mi
             livenessProbe:
               httpGet:
                 path: /liveness
                 port: 8090
               initialDelaySeconds: 10
               periodSeconds: 10
               timeoutSeconds: 5
               failureThreshold: 1
             readinessProbe:
               httpGet:
                 path: /readiness
                 port: 8090
               initialDelaySeconds: 5
               periodSeconds: 10
               timeoutSeconds: 5
               successThreshold: 1
               failureThreshold: 1
             startupProbe:
               httpGet:
                 path: /startup
                 port: 8090
               initialDelaySeconds: 5
               periodSeconds: 5
               timeoutSeconds: 5
               failureThreshold: 4

Approach 2: Cloud SQL Connector

The Cloud SQL Auth Proxy is written in Go, and Google uses a library to implement its functionality, which is also published separately as the cloud-sql-go-connector library.

The connector should be relatively straightforward to integrate, and allows applications to connect directly to the database without need for a sidecar, as described in the documentation: https://github.com/GoogleCloudPlatform/cloud-sql-go-connector#postgres

In this case, pganalyze-collector should be able to use the instance ID to discover the IP address to connect to, automatically handle issuing and rotating certificates, etc. This could be done using the existing GCP_CLOUDSQL_INSTANCE_ID variable that the collector already requires.

The proxy sidecar approach may still be preferable because it simplifies collection of metrics, but the preferred approach varies by company and team. Note also that we're using Cloud SQL Auth Proxy v1, but there's a v2 in preview, in which case the flags are slightly different.

@lfittl
Copy link
Member

lfittl commented Dec 21, 2022

@jawnsy Thanks for the detailed write-up, really appreciate this! (and I'll make a note to see if we can revise the docs to cover this more directly)

One question for my understanding:

The proxy sidecar approach may still be preferable because it simplifies collection of metrics, but the preferred approach varies by company and team. Note also that we're using Cloud SQL Auth Proxy v1, but there's a v2 in preview, in which case the flags are slightly different.

You mention "it simplifies collection of metrics" - could you clarify which metrics you were referring to?

@lfittl lfittl added the docs label Dec 21, 2022
@jawnsy
Copy link
Contributor Author

jawnsy commented Dec 21, 2022

Google has two versions of the auth proxy, and their new version (currently in preview) has support for some Prometheus metrics and tracing, intended for observability of the proxy itself (otherwise, failures can only been seen in logs, which are a lot harder to work with, especially because their earlier version emitted unstructured or semi-structured logs).

At Prefect, we are currently using cloud-sql-proxy v1 because it is GA, while v2 is currently in preview. The risk probably isn't significant because the proxy/library is a relatively thin wrapper that handles refreshing certificates periodically, but nonetheless, we stick to GA services for anything in the critical path of our production service. I don't have visibility to their planned v2 GA, but I'd expect it to be within the next 6-12 months.

Approach 2 has the benefit of being sidecarless/proxyless and if you don't care about observability of the metrics or don't use Prometheus, then that would work great. The Go library exposes the same metrics so with some work in pganalyze-collector, you can expose these same metrics too, possibly alongside your own metrics, but of course this would be Google-specific and more implementation effort on your end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants