Controller panics when stopping pipeline #292

Open
juchiast opened this issue Aug 30, 2023 · 1 comment

Arroyo is deployed on AWS EKS. When a pipeline is stopped, the controller panics with the following error:

2023-08-29T15:36:10.402273Z ERROR arroyo_server_common: panicked at 'called `Result::unwrap()` on an `Err` value: Request ID: None Body: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>9GVR3VRECTWSPBJQ</RequestId><HostId>R0W2NZnbLfvwxlQ8htuN7ijJmQjeuKDDsKOkdhqz7WL55F5iCvpcdZgaXrKsfstuQKuzS9z9m40=</HostId></Error>', arroyo-state/src/parquet.rs:131:14 panic.file="arroyo-state/src/parquet.rs" panic.line=131 panic.column=14

Output of kubectl describe deployment/arroyo-controller:

Name:               arroyo-controller
Namespace:          default
CreationTimestamp:  Tue, 29 Aug 2023 11:03:20 +0700
Labels:             app=arroyo-controller
                    app.kubernetes.io/instance=arroyo
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=arroyo
                    app.kubernetes.io/version=0.5.1
                    helm.sh/chart=arroyo-0.5.1
Annotations:        deployment.kubernetes.io/revision: 2
                    meta.helm.sh/release-name: arroyo
                    meta.helm.sh/release-namespace: default
Selector:           app=arroyo-controller
Replicas:           1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:       Recreate
MinReadySeconds:    0
Pod Template:
  Labels:           app=arroyo-controller
                    app.kubernetes.io/instance=arroyo
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=arroyo
                    app.kubernetes.io/version=0.5.1
                    helm.sh/chart=arroyo-0.5.1
  Annotations:      prometheus.io/path: /metrics
                    prometheus.io/port: 9191
                    prometheus.io/scrape: true
  Service Account:  arroyo
  Containers:
   arroyo-controller:
    Image:       ghcr.io/arroyosystems/arroyo-services:0.5.1
    Ports:       9190/TCP, 9191/TCP
    Host Ports:  0/TCP, 0/TCP
    Args:
      controller
    Requests:
      cpu:      1
      memory:   2Gi
    Liveness:   http-get http://:admin/status delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:admin/status delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      K8S_WORKER_SERVICE_ACCOUNT_NAME:  arroyo
      S3_BUCKET:                        serversstack-arroyoarroyobucket05f593c6-yadx1dmgn3rh
      S3_REGION:                        us-west-1
      DATABASE_HOST:                    arroyo-postgresql.default.svc.cluster.local
      DATABASE_PORT:                    5432
      DATABASE_NAME:                    arroyo
      DATABASE_USER:                    arroyo
      DATABASE_PASSWORD:                <set to the key 'password' in secret 'arroyo-postgresql'>  Optional: false
      CONTROLLER_ADDR:                  http://arroyo-controller:9190
      REMOTE_COMPILER_ENDPOINT:         http://arroyo-compiler:9000
      SCHEDULER:                        kubernetes
      K8S_NAMESPACE:                     (v1:metadata.namespace)
      K8S_WORKER_NAME:                  arroyo
      K8S_WORKER_LABELS:                helm.sh/chart: arroyo-0.5.1
                                        app.kubernetes.io/name: arroyo
                                        app.kubernetes.io/instance: arroyo
                                        app.kubernetes.io/version: "0.5.1"
                                        app.kubernetes.io/managed-by: Helm
      K8S_WORKER_ANNOTATIONS:           prometheus.io/path: /metrics
                                        prometheus.io/port: "6901"
                                        prometheus.io/scrape: "true"
      K8S_WORKER_IMAGE:                 ghcr.io/arroyosystems/arroyo-worker:0.5.1
      K8S_WORKER_IMAGE_PULL_POLICY:     IfNotPresent
      K8S_WORKER_RESOURCES:             limits: {}
                                        requests:
                                          cpu: 400m
                                          memory: 200Mi
      K8S_WORKER_SLOTS:
    Mounts:                             <none>
  Volumes:                              <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  arroyo-controller-794658d686 (0/0 replicas created)
NewReplicaSet:   arroyo-controller-5ff5c645d4 (1/1 replicas created)
Events:          <none>

According to the S3 request logs, the failing request was not made with the role of the arroyo service account. It was made with the EKS node group role instead:

#	requester
1	arn:aws:sts::586927300535:assumed-role/ServersStack-EKSClusterclusterNodegroupClusterNode-QFIH2CJ3RDA2/i-0dced49d58c01a152

A successful request looked like this:

#	requester
1	arn:aws:sts::586927300535:assumed-role/ServersStack-ArroyoArroyoSARole09B95647-I86TNKR6N78N/WebIdentitySession
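
The two requester values above differ in both the role name and the session name. A minimal Rust sketch for telling them apart when scanning S3 request logs might look like the following. The type and function names (`AssumedRole`, `parse_assumed_role`, `looks_like_node_role`) are hypothetical and not part of Arroyo. The "session name starts with `i-`" check is a heuristic that assumes instance-profile credentials use the EC2 instance ID as the session name, as in the failing request above.

```rust
/// Role name and session name extracted from an STS assumed-role ARN.
/// Hypothetical helper type, used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct AssumedRole<'a> {
    pub role_name: &'a str,
    pub session_name: &'a str,
}

/// Parses `arn:<partition>:sts:<region>:<account>:assumed-role/<role-name>/<session-name>`.
/// Returns `None` if the input is not an STS assumed-role ARN.
pub fn parse_assumed_role(arn: &str) -> Option<AssumedRole<'_>> {
    // An ARN has six colon-separated fields; the last one is the resource.
    let mut parts = arn.splitn(6, ':');
    if parts.next()? != "arn" {
        return None;
    }
    let _partition = parts.next()?;
    if parts.next()? != "sts" {
        return None;
    }
    let _region = parts.next()?; // empty for STS ARNs
    let _account = parts.next()?;
    let resource = parts.next()?;

    let rest = resource.strip_prefix("assumed-role/")?;
    // The session name is everything after the last slash.
    let (role_name, session_name) = rest.rsplit_once('/')?;
    Some(AssumedRole { role_name, session_name })
}

/// Heuristic (assumption): a session name that looks like an EC2 instance ID
/// suggests the request used the node's instance-profile credentials.
pub fn looks_like_node_role(role: &AssumedRole<'_>) -> bool {
    role.session_name.starts_with("i-")
}
```

Applied to the two log entries above, the first requester would be flagged as node credentials and the second would not.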

juchiast commented Aug 30, 2023

Rusoto does not use the Kubernetes service account role automatically, so the controller falls back to the node's credentials. See:
https://github.com/rusoto/rusoto/blob/master/AWS-CREDENTIALS.md
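
One possible direction for a fix, sketched here under assumptions, is to check explicitly for the environment variables that EKS injects into pods using IAM roles for service accounts (`AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`) and only fall back to the default credential chain when they are absent. The sketch below uses only the standard library. The names `CredentialSource` and `choose_credential_source` are hypothetical and not taken from Arroyo or Rusoto. In a real client, the web identity branch would presumably be backed by something like `rusoto_sts::WebIdentityProvider::from_k8s_env()`, which should be verified against the Rusoto version in use.

```rust
use std::env;

/// Which credential source the client should use.
/// Hypothetical type for illustration; not part of Arroyo or Rusoto.
#[derive(Debug, PartialEq)]
pub enum CredentialSource {
    /// Assume the service account role via the projected web identity token.
    WebIdentity { role_arn: String, token_file: String },
    /// Fall back to the default chain (env vars, profile, instance metadata).
    DefaultChain,
}

/// Chooses a credential source from an environment lookup function.
/// Taking the lookup as a parameter keeps the decision testable without
/// touching the real process environment.
pub fn choose_credential_source<F>(lookup: F) -> CredentialSource
where
    F: Fn(&str) -> Option<String>,
{
    // Treat unset and empty variables the same way.
    let non_empty = |key: &str| lookup(key).filter(|v| !v.is_empty());

    match (
        non_empty("AWS_ROLE_ARN"),
        non_empty("AWS_WEB_IDENTITY_TOKEN_FILE"),
    ) {
        (Some(role_arn), Some(token_file)) => {
            CredentialSource::WebIdentity { role_arn, token_file }
        }
        // Without both variables the web identity flow cannot be used.
        _ => CredentialSource::DefaultChain,
    }
}

/// Convenience wrapper that reads the real process environment.
pub fn choose_from_process_env() -> CredentialSource {
    choose_credential_source(|key| env::var(key).ok())
}
```

Checking the web identity variables first matters because, on EKS, the default chain can still succeed through instance metadata and silently return the node role, which matches the failing requester shown in the S3 logs.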
