
Helm Release Job NotReady Status #1430

Open
sp71 opened this issue Sep 15, 2023 · 0 comments
Labels
bug Something is not working.

Comments


sp71 commented Sep 15, 2023

Preflight checklist

Ory Network Project

No response

Describe the bug

When bringing up Keto in Terraform using the helm_release resource with automigration enabled, the job's pod is always set to NotReady, even though the logs from the job's pod indicate that the migration was applied correctly. I verified that all of the changes were committed to the database. Any ideas why the job's pod is always stuck at NotReady? I am using the Cloud SQL Proxy as the sidecar container.
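As far as I understand, a Kubernetes Job only completes once every container in its pod has exited, so a sidecar that runs indefinitely keeps the pod Running but NotReady even after the migration container succeeds. A minimal sketch of the pattern (a hypothetical manifest for illustration, not taken from the chart; names, image tags, and args are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: keto-automigrate-example   # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        # Runs the migrations and exits once they are applied.
        - name: keto-automigrate
          image: oryd/keto:v0.11.1
          args: ["migrate", "up", "--yes"]
        # Never exits on its own: as long as this container runs, the pod
        # stays Running/NotReady and the Job is never marked Complete.
        - name: cloud-sql-proxy
          image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.6.1
          args: ["--port=5432", "my-project:my-region:my-instance"]
```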

Reproducing the bug

Steps to reproduce the behavior:

  1. Apply the Terraform configuration
  2. Observe that the Keto job's pod status is set to NotReady

Relevant log output

Job's Pod Logs

```
time=2023-09-10T12:12:40Z level=error msg=Unable to ping the database connection, retrying. audience=application error=map[message:failed to connect to `host=127.0.0.1 user=postgres database=`: dial error (dial tcp 127.0.0.1:5432: connect: connection refused)] service_name=Ory Keto service_version=v0.11.1-alpha.0
[POP] 2023/09/10 12:12:47 warn - One or more of connection details are specified in database.yml. Override them with values in URL.
time=2023-09-10T12:12:47Z level=info msg=No tracer configured - skipping tracing setup audience=application service_name=Ory Keto service_version=v0.11.1-alpha.0
Current status:
Version			Name					Status
20150100000001000000	networks				Pending
20201110175414000000	relationtuple				Pending
20201110175414000001	relationtuple				Pending
20210623162417000000	relationtuple				Pending
20210623162417000001	relationtuple				Pending
20210623162417000002	relationtuple				Pending
20210623162417000003	relationtuple				Pending
20210914134624000000	legacy-cleanup				Pending
20220217152313000000	nid_fk					Pending
20220512151000000000	indices					Pending
20220513200300000000	create-intermediary-uuid-table		Pending
20220513200400000000	create-uuid-mapping-table		Pending
20220513200400000001	uuid-mapping-remove-check		Pending
20220513200500000000	migrate-strings-to-uuids		Pending
20220513200600000000	drop-old-non-uuid-table			Pending
20220513200600000001	drop-old-non-uuid-table			Pending
20230228091200000000	add-on-delete-cascade-to-relationship	Pending
Applying migrations...
Successfully applied all migrations:
Version			Name					Status
20150100000001000000	networks				Applied
20201110175414000000	relationtuple				Applied
20201110175414000001	relationtuple				Applied
20210623162417000000	relationtuple				Applied
20210623162417000001	relationtuple				Applied
20210623162417000002	relationtuple				Applied
20210623162417000003	relationtuple				Applied
20210914134624000000	legacy-cleanup				Applied
20220217152313000000	nid_fk					Applied
20220512151000000000	indices					Applied
20220513200300000000	create-intermediary-uuid-table		Applied
20220513200400000000	create-uuid-mapping-table		Applied
20220513200400000001	uuid-mapping-remove-check		Applied
20220513200500000000	migrate-strings-to-uuids		Applied
20220513200600000000	drop-old-non-uuid-table			Applied
20220513200600000001	drop-old-non-uuid-table			Applied
20230228091200000000	add-on-delete-cascade-to-relationship	Applied
```


Relevant configuration

```hcl
resource "helm_release" "keto" {
  name       = "ory"
  repository = "https://k8s.ory.sh/helm/charts"
  chart      = "keto"

  values = [
    <<EOT
    serviceAccount:
      create: false
      name: ${module.service_account.value.id}
    job:
      serviceAccount:
        create: false
        name: ${module.service_account.value.id}
      extraContainers: |
        - name: cloud-sql-proxy
          image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.6.1
          imagePullPolicy: Always
          args:
          - "--structured-logs"
          - "--health-check"
          - "--http-address=0.0.0.0"
          - "--port=${local.sql_port}"
          - "--private-ip"
          - ${var.project_id}:${var.default_region}:${module.sql_db.name}
          securityContext:
            runAsNonRoot: true
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
          livenessProbe:
            httpGet:
              path: /liveness
              port: 9090
            initialDelaySeconds: 0
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 2
          readinessProbe:
            httpGet:
              path: /readiness
              port: 9090
            initialDelaySeconds: 0
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 2
          startupProbe:
            httpGet:
              path: /startup
              port: 9090
            periodSeconds: 1
            timeoutSeconds: 5
            failureThreshold: 20
          resources:
            requests:
              memory: 128Mi
              cpu: 50m
            limits:
              memory: 512Mi
              cpu: 250m
    keto:
      automigration:
        enabled: true
      config:
        dsn: postgres://${local.db_username}:${random_password.password.result}@127.0.0.1:${local.sql_port}
    deployment:
      extraContainers: |
        - name: cloud-sql-proxy
          image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.6.1
          imagePullPolicy: Always
          args:
          - "--structured-logs"
          - "--health-check"
          - "--http-address=0.0.0.0"
          - "--port=${local.sql_port}"
          - "--private-ip"
          - ${var.project_id}:${var.default_region}:${module.sql_db.name}
          securityContext:
            runAsNonRoot: true
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
          livenessProbe:
            httpGet:
              path: /liveness
              port: 9090
            initialDelaySeconds: 0
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 2
          readinessProbe:
            httpGet:
              path: /readiness
              port: 9090
            initialDelaySeconds: 0
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 2
          startupProbe:
            httpGet:
              path: /startup
              port: 9090
            periodSeconds: 1
            timeoutSeconds: 5
            failureThreshold: 20
          resources:
            requests:
              memory: 128Mi
              cpu: 50m
            limits:
              memory: 512Mi
              cpu: 250m
    EOT
  ]
}
```
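The usual workaround I have seen for this pattern is to let the migration step shut the proxy down once it finishes, via the proxy's admin endpoint. A sketch of the values change, assuming the `--quitquitquit` and `--admin-port` flags of Cloud SQL Proxy v2 (the wrapper command at the end is hypothetical and only works if the chart allows overriding the job's command):

```yaml
job:
  extraContainers: |
    - name: cloud-sql-proxy
      image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.6.1
      args:
        - "--quitquitquit"      # serve POST /quitquitquit on the admin port
        - "--admin-port=9091"
        - "--port=5432"
        - "my-project:my-region:my-instance"   # placeholder connection name
  # Hypothetical: after `keto migrate up` succeeds, the job's main container
  # would need to call
  #   curl -X POST http://localhost:9091/quitquitquit
  # so the proxy exits and every container in the pod terminates.
```

With the proxy gone, all containers terminate and the Job can reach Completed instead of sitting at NotReady.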

Version

v0.11.1

On which operating system are you observing this issue?

None

In which environment are you deploying?

Kubernetes with Helm

Additional Context

  • CloudSQL PostgreSQL database
  • GCP
sp71 added the bug label Sep 15, 2023
sp71 changed the title from "Helm Release Side Car Container Issues" to "Helm Release Job NotReady Status" Sep 15, 2023