Upgrade Oathkeeper helm chart 0.41 causes 503 #1165

Open
WoodyWoodsta opened this issue Apr 15, 2024 · 8 comments
Labels
bug Something is not working.

Comments

@WoodyWoodsta

Preflight checklist

Ory Network Project

N/A

Describe the bug

Upgrading the Oathkeeper chart to 0.41.0 causes Oathkeeper to restart after the health check returns 503. I haven't changed anything in the chart, and I don't use the secret (secret.enabled: false in my values).

Reproducing the bug

Upgrade to chart 0.41.0 with secret.enabled: false.
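
For reference, a minimal reproduction could look something like the following; the release name, namespace, and repo alias are illustrative, and secret.enabled=false mirrors the values below:

helm repo update
helm upgrade oathkeeper ory/oathkeeper \
  --namespace ory \
  --version 0.41.0 \
  --set secret.enabled=false \
  -f values.yaml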

Relevant log output

No response

Relevant configuration

oathkeeper:
  oathkeeper:
    config:
      log:
        level: debug
      authenticators:
        noop:
          enabled: true
        cookie_session:
          enabled: true
          config:
            check_session_url: http://ory-kratos-public:80/sessions/whoami
            preserve_path: true
            subject_from: "identity.id"
            only:
              - ory_kratos_session
        oauth2_client_credentials:
          enabled: true
          config:
            token_url: http://ory-hydra-public.ory.svc.cluster.local:4444/oauth2/token
            cache:
              enabled: true
      errors:
        handlers:
          redirect:
            enabled: true
            config:
              to: ***/login
          json:
            enabled: true
          www_authenticate:
            enabled: true
            when:
              - error:
                  - unauthorized
        fallback:
          - json
      authorizers:
        allow:
          enabled: true
        remote_json:
          enabled: true
          config:
            remote: http://ory-keto-read.ory.svc.cluster.local:80/relation-tuples/check
            payload: ""
      mutators:
        noop:
          enabled: true
      serve:
        proxy:
          trust_forwarded_headers: true
          timeout:
            write: 1m
            read: 1m
            idle: 1m
          cors:
            enabled: true
            allowed_origins:
              - ***
    managedAccessRules: false
  maester:
    enabled: false
  serviceMonitor:
    enabled: false
  deployment:
    resources:
      requests:
        cpu: 10m
        memory: 100Mi
      limits:
        memory: 1Gi

Version

0.41.0

On which operating system are you observing this issue?

None

In which environment are you deploying?

Kubernetes with Helm

Additional Context

No response

WoodyWoodsta added the bug label on Apr 15, 2024
@Demonsthere
Contributor

Hi there! Can you share or take a look at the logs of the container? We could see some more info there. The upgrade itself passed, but for tests we use a very simplistic set of configs, so we could easily have missed a corner case 😞

@WoodyWoodsta
Author

Hey :)

I'd have to re-upgrade my running cluster to get the logs, but I didn't collect or post any because there was nothing related to the 503, even with the log level on debug. If I get a chance, I'll find an appropriate time to break my environment and capture the log output.
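
That said, if the pod restarted in place rather than being replaced, the logs from before the restart might still be retrievable without breaking anything again; the namespace and deployment name here are illustrative:

kubectl -n ory logs deploy/oathkeeper --previous   # logs from the container before the last restart
kubectl -n ory get pods                            # check restart counts and pod status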

@Demonsthere
Contributor

😅 Yeah, obviously it's better not to break your env, but a failing health check should be logged as an event on the deployment/pod object; did you maybe take a look at that? Asking because right now I can't really do more than try to reproduce the upgrade with a minimal config as close to yours as possible 😞
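
For example, something along these lines should surface the probe failures (namespace and pod name are illustrative):

kubectl -n ory get events --sort-by=.lastTimestamp | grep -i probe
kubectl -n ory describe pod <oathkeeper-pod>   # look for "Readiness probe failed" entries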

@WoodyWoodsta
Author

I've searched through our monitoring history to see if I could find the Kubernetes events from around the time I attempted the upgrade, but had no luck. I'll see if I can reproduce and report back with the k8s events.

@sebboer

sebboer commented Apr 29, 2024

I encountered the same error after updating to 0.41, using the same values.yaml as before the update.

{
  "audience": "application",
  "error": {
    "message": "The requested resource could not be found",
    "stack_trace": "stack trace could not be recovered from error type *healthx.swaggerNotReadyStatus"
  },
  "http_request": {
    "headers": {
      "accept": "*/*",
      "connection": "close",
      "user-agent": "kube-probe/1.29"
    },
    "host": "127.0.0.1",
    "method": "GET",
    "path": "/health/ready",
    "query": null,
    "remote": "<cluster-ip>:57138",
    "scheme": "http"
  },
  "http_response": {
    "status_code": 503
  },
  "level": "error",
  "msg": "An error occurred while handling a request",
  "service_name": "ORY Oathkeeper",
  "service_version": "v0.40.7",
  "time": "2024-04-29T12:28:50.400938908Z"
}
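
For anyone trying to reproduce this by hand, the failing probe can be replayed against the pod directly; 4456 is Oathkeeper's default API port, so adjust if your chart overrides it (namespace and deployment name are illustrative):

kubectl -n ory port-forward deploy/oathkeeper 4456:4456 &
curl -v http://127.0.0.1:4456/health/ready   # returns 503 while the service reports not-ready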

@sebboer

sebboer commented Apr 29, 2024

Same error with 0.40.1. Updating to 0.39.1 instead produced no error.
Coming from 0.38.0.

@siredmar

Same error here. I tried the 0.42.0 helm chart, which uses oathkeeper v0.40.7 by default. I got rid of the 503 errors by using oathkeeper v0.39.4 with the 0.42.0 helm chart.
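
If it helps anyone, pinning the image in values.yaml would look roughly like this, assuming the chart's standard image fields:

image:
  repository: oryd/oathkeeper
  tag: v0.39.4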

@Demonsthere
Contributor

If this is a bug in Oathkeeper itself and not the chart or upgrade process, then maybe we should move this to the oathkeeper repo, as this could be a regression in the code itself, similar to #1161

Demonsthere transferred this issue from ory/k8s on May 16, 2024