
RBAC access denied (403) when fetching visualization file from AWS S3 #124

Closed · 2 tasks done
bobbeeke opened this issue Apr 24, 2024 · 15 comments
Labels: kind/bug (things not working properly) · priority/needs-triage (needs to be triaged)

@bobbeeke

Checks

  • I have searched the existing issues.
  • This issue is NOT specific to the CLI. (If so, please open an issue on the CLI repo)

deployKF Version

v0.1.4

Kubernetes Version

Client Version: v1.30.0
Server Version: v1.28.6

Description

I am trying to set up a minimal pipeline that, at one point, should fetch visualization data from S3 (AWS) and show it in the Visualization tab of the Kubeflow UI. The pipeline finishes; however, the Visualization tab shows: "There are no visualizations in this step."

When I open my browser's network inspector and click the Visualization tab, I see this 403 (Forbidden):

GET | https://<domain>/pipeline/artifacts/get?source=s3&namespace=<namespace>&bucket=<bucket>&key=artifacts/<namespace>/my-pipeline-bxfm8/2024/04/24/my-pipeline-bxfm8-1748704264/mlpipeline-ui-metadata.tgz

For some reason it fails to get permission to fetch the file from my AWS S3 bucket.
When I put the URL in my browser directly, I get the same 403 with this response:

RBAC: access denied

The file mlpipeline-ui-metadata.tgz exists in my bucket at the expected path, so my pipeline apparently writes to S3 without problems. I tried opening up my bucket IAM permissions to allow everything, but that did not help either.

I'm a bit stuck here and not sure whether this is some Istio-related restriction or an AWS bucket restriction I am missing.
I followed the deployKF docs on setting up S3 connectivity as closely as possible.

Some guidance on where to look would be appreciated.

Relevant Logs

No response

deployKF Values (Optional)

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: deploykf-app-of-apps
  namespace: argocd
  labels:
    app.kubernetes.io/name: deploykf-app-of-apps
    app.kubernetes.io/part-of: deploykf
spec:
  project: "default"
  source:
    repoURL: "https://github.com/deployKF/deployKF.git"
    targetRevision: "v0.1.4"
    path: "."
    plugin:
      name: "deploykf"
      parameters:
        - name: "source_version"
          string: "0.1.4"
        - name: "values_files"
          array:
            - "./sample-values.yaml"
        - name: "values"
          string: |
            deploykf_dependencies:
              cert_manager:
                enabled: false
                clusterIssuer:
                  enabled: false
                  issuerName: "letsencrypt-production-deploykf-gateway"                          
              istio:
                valuesOverrides:
                  istio-daemon:
                    pilot: 
                      resources:
                        requests:
                          cpu: 50m
                          memory: 100Mi
              kyverno: {}
            deploykf_core:
              deploykf_auth:
                dex:
                  staticPasswords:
                    - email: "user1@example.eu"
                      password:
                        value: "xxx"
                    - email: "user2@example.eu"
                      password:
                        value: "xxx"
                    - email: "user3@example.eu"
                      password:
                        value: "xxx"
              deploykf_istio_gateway:
                gateway:
                  hostname: kubeflow.example.eu
                  ports:
                    http: 80
                    https: 443
                gatewayService:
                  type: "ClusterIP"
              deploykf_profiles_generator:
                profileDefaults:
                  tools:
                    kubeflowPipelines:
                      objectStoreAuth:
                        existingSecret: "kubeflow-profiles-object-store-credentials"
                        existingSecretNamespace: "kubeflow"
                        existingSecretAccessKeyKey: "AWS_ACCESS_KEY_ID"
                        existingSecretSecretKeyKey: "AWS_SECRET_ACCESS_KEY"
                groups:
                  - id: example--admins
                    users:
                      - user1
                  - id: example--users
                    users:
                      - user2
                      - user3
                profiles: 
                  - name: example-test
                    members:
                      - group: example--admins
                        access:
                          role: edit
                          notebooksAccess: true
                      - group: example--users
                        access:
                          role: edit
                          notebooksAccess: true
                  - name: example-production
                    members:
                      - group: example--admins
                        access:
                          role: edit
                          notebooksAccess: true
                      - group: example--users
                        access:
                          role: view
                          notebooksAccess: true
                users:
                  - id: user1
                    email: "user1@example.eu"
                  - id: user2
                    email: "user2@example.eu"
                  - id: user3
                    email: "user3@example.eu"
            deploykf_opt:
              deploykf_minio:
                enabled: false
              deploykf_mysql:
                enabled: false
            kubeflow_tools:
              katib:
                mysql: 
                  useExternal: true
                  host: katib-mysql.kubeflow.svc.cluster.local
                  port: 3306
                  auth:
                    existingSecret: kitib-mysql-credentials
                    existingSecretUsernameKey: mysql-username-katib
                    existingSecretPasswordKey: mysql-password
              notebooks:
                spawnerFormDefaults:
                  affinityConfig:
                    value: "CPU node"
                    options:
                      - configKey: "CPU node"
                        displayName: "Deploy on dedicated CPU node"
                        affinity:
                          nodeAffinity:
                            requiredDuringSchedulingIgnoredDuringExecution:
                              nodeSelectorTerms:
                                - matchExpressions:
                                    - key: "node-role.kubernetes.io/cpu"
                                      operator: "Exists"
                                    - key: "node-role.kubernetes.io/kubeflow"
                                      operator: "Exists"
                      - configKey: "GPU node"
                        displayName: "Deploy on dedicated GPU node"
                        affinity:
                          nodeAffinity:
                            requiredDuringSchedulingIgnoredDuringExecution:
                              nodeSelectorTerms:
                                - matchExpressions:
                                    - key: "node-role.kubernetes.io/gpu"
                                      operator: "Exists"
                                    - key: "node-role.kubernetes.io/kubeflow"
                                      operator: "Exists"
                  configurations:
                    value:
                      - kubeflow-pipelines-api-token
                  cpu:
                    value: "0.5"
                    limitFactor: "none"
                  gpus: 
                    value:
                      vendor: "nvidia.com/gpu"
                      vendors:
                        - limitsKey: "nvidia.com/gpu"
                          uiName: "nvidia"
                  image: 
                    value: kubeflownotebookswg/jupyter-tensorflow-full:v1.7.0
                  memory: 
                    value: "1.0Gi"
                    limitFactor: "none"
                  tolerationGroup:
                    options:
                      - groupKey: "CPU node"
                        displayName: "Deploy on dedicated CPU node"
                        tolerations:
                          - key: "kubeflow/cpu"
                            operator: "Exists"
                            effect: "NoSchedule"
                      - groupKey: "GPU node"
                        displayName: "Deploy on dedicated GPU node"
                        tolerations:
                          - key: "nvidia.com/gpu"
                            operator: "Exists"
                            effect: "NoSchedule"
                    value: "CPU node"
                  workspaceVolume:
                    value: null
              pipelines:
                bucket:
                  name: my-bucket-example-eu
                  region: eu-central-1
                kfpV2:
                  minioFix: false
                mysql: 
                  useExternal: true
                  host: ml-pipeline-mysql.kubeflow.svc.cluster.local
                  port: 3306
                  auth:
                    existingSecret: ml-pipeline-mysql-credentials
                    existingSecretUsernameKey: mysql-username-ml-pipeline
                    existingSecretPasswordKey: mysql-password
                mysqlDatabases:
                  cacheDatabase: kfp_cache
                  metadataDatabase: kfp_metadata
                  pipelinesDatabase: kfp_pipelines
                objectStore:
                  useExternal: true
                  host: s3.amazonaws.com
                  port: ""
                  useSSL: true
                  auth:
                    existingSecret: "kubeflow-pipelines-object-store-credentials"
                    existingSecretAccessKeyKey: "AWS_ACCESS_KEY_ID"
                    existingSecretSecretKeyKey: "AWS_SECRET_ACCESS_KEY"
                profileResourceGeneration:
                  kfpApiTokenPodDefault: true
  destination:
    server: "https://kubernetes.default.svc"
    namespace: "argocd"
@bobbeeke bobbeeke added kind/bug kind - things not working properly priority/needs-triage priority - needs to be triaged labels Apr 24, 2024
@thesuperzapper (Member)

@bobbeeke can you share a very basic Pipeline that creates this error?

I want to make sure this is fixed before releasing Kubeflow Pipelines 2.1 support in #122

@bobbeeke (Author)

Sure.
Here are the notebook steps that lead to the problem in our environment (verified).

It's based on a minimal example from:
https://www.kubeflow.org/docs/components/pipelines/v1/sdk/output-viewer/#markdown-1

!pip install kfp==1.8.22

import kfp 
print(kfp.__version__)

from kfp import dsl
import kubernetes

def markdown_vis(mlpipeline_ui_metadata_path: kfp.components.OutputPath()):
  import json
  metadata = {
    'outputs' : [
    {
      'storage': 'inline',
      'source': 'Inline Markdown - Hello Metrics World',
      'type': 'markdown',
    }]
  }

  with open(mlpipeline_ui_metadata_path, 'w') as metadata_file:
    json.dump(metadata, metadata_file)
            
component_markdown_vis = kfp.components.create_component_from_func(markdown_vis)

@dsl.pipeline(  # pipeline descriptor
   name='testing out metrics and their visualizations',
   description='testing out metrics and their visualizations'
)
def baseline_metrics():  # Python function that holds the content of the pipeline

    task1 = component_markdown_vis()  # pass the component as the only pipeline step
    task1.execution_options.caching_strategy.max_cache_staleness = "P0D"  # disable result caching

kfp_client = kfp.Client()

kfp_client.create_run_from_pipeline_func(
    baseline_metrics,
    arguments={}
    )

@bobbeeke (Author)

I did not mention it in my earlier description, but I also get 403s after clicking on the Input/Output tab.

GET | https://<domain>/pipeline/artifacts/get?source=s3&namespace=<namespace>&peek=256&bucket=<bucket>&key=artifacts/<namespace>/testing-out-metrics-and-their-visualizations-lmdxx/2024/04/25/testing-out-metrics-and-their-visualizations-lmdxx-3138565765/mlpipeline-ui-metadata.tgz

GET | https://<domain>/pipeline/artifacts/get?source=s3&namespace=<namespace>&peek=256&bucket=<bucket>&key=artifacts/<namespace>/testing-out-metrics-and-their-visualizations-lmdxx/2024/04/25/testing-out-metrics-and-their-visualizations-lmdxx-3138565765/main.log

So the fetching problem is not limited to the visualization tab.

@bobbeeke (Author) commented Apr 25, 2024

I can confirm that I have narrowed the issue down to Istio.
I added the following policy in my namespace and the problem is solved.

(!) ONLY for temporary testing purposes in non-production:

## !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
## WARNING DO NOT USE, DISABLES AUTH ##
## !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-all
  namespace: my-namespace
spec:
  rules:
    - {}

Next questions:

  • Did I miss something in my setup when following the deployKF docs?
  • How can this issue be solved in a safe way? I assume this policy is not safe for production.

The idea came from:
tensorflow/tfx#3893

@thesuperzapper (Member)

@bobbeeke first, you should NOT run any AuthorizationPolicy that looks like that!!!!
It will turn off deployKF's authentication.

I just tested your specific pipeline against S3, and everything works properly, so let's try the following things to debug:

  1. Remove any custom AuthorizationPolicy resources you have.
  2. Run a sync with the latest sync_argocd_apps.sh script (this is important in case you have forgotten to prune).
  3. Verify that your two Secrets actually contain valid AWS keys (a sketch of the expected shape follows this list):
    • kubeflow-pipelines-object-store-credentials
    • kubeflow-profiles-object-store-credentials
  4. Check the logs of the ml-pipeline-ui-artifact pod in your profile namespace while trying to access something that fails.
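
For reference, a minimal sketch of what one of those Secrets should look like. The resource name and key names are taken from the values posted in this issue (the profiles secret lives in the kubeflow namespace per those values; assuming the pipelines secret does too); the credential values are placeholders:

apiVersion: v1
kind: Secret
metadata:
  name: kubeflow-pipelines-object-store-credentials
  namespace: kubeflow
type: Opaque
stringData:
  ## key names must match existingSecretAccessKeyKey / existingSecretSecretKeyKey
  AWS_ACCESS_KEY_ID: "AKIA-placeholder"
  AWS_SECRET_ACCESS_KEY: "placeholder-secret-key"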

@bobbeeke (Author)

I did what you asked but still encounter the same issue.

First I performed your steps -> issue still existed.
Then, to be sure, I uninstalled my whole setup and deployed it again from scratch -> issue still exists.

  • Checked secrets -> fine
  • Ran the pipeline again -> Same issue with 403 response
  • Collected all logging that may be of interest:

Logs from pod startup until after the pipeline run:

Namespace: example-test
Pod: ml-pipeline-ui-artifact-6957c87b97-6pfgm
Container: ml-pipeline-ui-artifact

{
  argo: {
    archiveArtifactory: 'minio',
    archiveBucketName: 'mlpipeline',
    archiveLogs: false,
    archivePrefix: 'logs'
  },
  artifacts: 'Artifacts config contains credentials, so it is omitted',
  metadata: { envoyService: { host: 'localhost', port: '9090' } },
  pipeline: { host: 'localhost', port: '3001' },
  server: {
    apiVersionPrefix: 'apis/v1beta1',
    basePath: '/pipeline',
    deployment: 'NOT_SPECIFIED',
    hideSideNav: false,
    port: 3000,
    staticDir: '/client'
  },
  viewer: {
    tensorboard: {
      podTemplateSpec: undefined,
      tfImageName: 'tensorflow/tensorflow'
    }
  },
  visualizations: { allowCustomVisualizations: false },
  gkeMetadata: { disabled: false },
  auth: {
    enabled: false,
    kubeflowUserIdHeader: 'x-goog-authenticated-user-email',
    kubeflowUserIdPrefix: 'accounts.google.com:'
  }
}
[HPM] Proxy created: /  ->  http://localhost:9090
[HPM] Proxy created: /  ->  http://127.0.0.1
[HPM] Subscribed to http-proxy events:  [ 'error', 'close' ]
[HPM] Proxy created: /  ->  http://127.0.0.1
[HPM] Subscribed to http-proxy events:  [ 'error', 'close' ]
[HPM] Proxy created: /  ->  http://localhost:3001
[HPM] Subscribed to http-proxy events:  [ 'proxyReq', 'error', 'close' ]
[HPM] Proxy created: /  ->  http://localhost:3001
[HPM] Subscribed to http-proxy events:  [ 'proxyReq', 'error', 'close' ]
(node:1) Warning: Accessing non-existent property 'cat' of module exports inside circular dependency
(Use `node --trace-warnings ...` to show where the warning was created)
(node:1) Warning: Accessing non-existent property 'cd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'chmod' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'cp' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'dirs' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'pushd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'popd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'echo' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'tempdir' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'pwd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'exec' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'ls' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'find' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'grep' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'head' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'ln' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'mkdir' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'rm' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'mv' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'sed' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'set' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'sort' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'tail' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'test' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'to' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'toEnd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'touch' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'uniq' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'which' of module exports inside circular dependency
Server listening at http://localhost:3000

Namespace: example-test
Pod: ml-pipeline-ui-artifact-6957c87b97-6pfgm
Container: istio-proxy

2024-04-26T09:02:26.166814Z     info    FLAG: --concurrency="2"
2024-04-26T09:02:26.166885Z     info    FLAG: --domain="<profile-namespace>.svc.cluster.local"
2024-04-26T09:02:26.166897Z     info    FLAG: --help="false"
2024-04-26T09:02:26.166903Z     info    FLAG: --log_as_json="false"
2024-04-26T09:02:26.166915Z     info    FLAG: --log_caller=""
2024-04-26T09:02:26.166922Z     info    FLAG: --log_output_level="default:info"
2024-04-26T09:02:26.166928Z     info    FLAG: --log_rotate=""
2024-04-26T09:02:26.166934Z     info    FLAG: --log_rotate_max_age="30"
2024-04-26T09:02:26.166947Z     info    FLAG: --log_rotate_max_backups="1000"
2024-04-26T09:02:26.166953Z     info    FLAG: --log_rotate_max_size="104857600"
2024-04-26T09:02:26.166959Z     info    FLAG: --log_stacktrace_level="default:none"
2024-04-26T09:02:26.166980Z     info    FLAG: --log_target="[stdout]"
2024-04-26T09:02:26.166987Z     info    FLAG: --meshConfig="./etc/istio/config/mesh"
2024-04-26T09:02:26.166993Z     info    FLAG: --outlierLogPath=""
2024-04-26T09:02:26.167005Z     info    FLAG: --proxyComponentLogLevel="misc:error"
2024-04-26T09:02:26.167011Z     info    FLAG: --proxyLogLevel="warning"
2024-04-26T09:02:26.167018Z     info    FLAG: --serviceCluster="istio-proxy"
2024-04-26T09:02:26.167029Z     info    FLAG: --stsPort="0"
2024-04-26T09:02:26.167036Z     info    FLAG: --templateFile=""
2024-04-26T09:02:26.167042Z     info    FLAG: --tokenManagerPlugin="GoogleTokenExchange"
2024-04-26T09:02:26.167055Z     info    FLAG: --vklog="0"
2024-04-26T09:02:26.167062Z     info    Version 1.17.3-61a081630d1bcc705e22b674e7f2fab7be3f16df-Clean
2024-04-26T09:02:26.167342Z     warn    failed running ulimit command: 
2024-04-26T09:02:26.167692Z     info    Proxy role      ips=[100.96.1.174] type=sidecar id=ml-pipeline-ui-artifact-6957c87b97-6pfgm.<profile-namespace> domain=<profile-namespace>.svc.cluster.local
2024-04-26T09:02:26.167848Z     info    Apply proxy config from env {"proxyMetadata":{"ISTIO_META_DNS_AUTO_ALLOCATE":"true","ISTIO_META_DNS_CAPTURE":"true"},"holdApplicationUntilProxyStarts":true}

2024-04-26T09:02:26.183108Z     info    Effective config: binaryPath: /usr/local/bin/envoy
concurrency: 2
configPath: ./etc/istio/proxy
controlPlaneAuthPolicy: MUTUAL_TLS
discoveryAddress: istiod.istio-system.svc:15012
drainDuration: 45s
holdApplicationUntilProxyStarts: true
proxyAdminPort: 15000
proxyMetadata:
  ISTIO_META_DNS_AUTO_ALLOCATE: "true"
  ISTIO_META_DNS_CAPTURE: "true"
serviceCluster: istio-proxy
statNameLength: 189
statusPort: 15020
terminationDrainDuration: 5s
tracing:
  zipkin:
    address: zipkin.istio-system:9411

2024-04-26T09:02:26.183148Z     info    JWT policy is third-party-jwt
2024-04-26T09:02:26.183155Z     info    using credential fetcher of JWT type in cluster.local trust domain
2024-04-26T09:02:26.193841Z     info    dns     Starting local udp DNS server on 127.0.0.1:15053
2024-04-26T09:02:26.194129Z     info    dns     Starting local tcp DNS server on 127.0.0.1:15053
2024-04-26T09:02:26.194177Z     info    Workload SDS socket not found. Starting Istio SDS Server
2024-04-26T09:02:26.194202Z     info    CA Endpoint istiod.istio-system.svc:15012, provider Citadel
2024-04-26T09:02:26.194242Z     info    Using CA istiod.istio-system.svc:15012 cert with certs: var/run/secrets/istio/root-cert.pem
2024-04-26T09:02:26.198944Z     info    Opening status port 15020
2024-04-26T09:02:26.247316Z     info    ads     All caches have been synced up in 93.327114ms, marking server ready
2024-04-26T09:02:26.248048Z     info    xdsproxy        Initializing with upstream address "istiod.istio-system.svc:15012" and cluster "Kubernetes"
2024-04-26T09:02:26.250874Z     info    sds     Starting SDS grpc server
2024-04-26T09:02:26.260112Z     info    Pilot SAN: [istiod.istio-system.svc]
2024-04-26T09:02:26.264193Z     info    starting Http service at 127.0.0.1:15004
2024-04-26T09:02:26.278003Z     info    Starting proxy agent
2024-04-26T09:02:26.278237Z     info    starting
2024-04-26T09:02:26.278298Z     info    Envoy command: [-c etc/istio/proxy/envoy-rev.json --drain-time-s 45 --drain-strategy immediate --local-address-ip-version v4 --file-flush-interval-msec 1000 --disable-hot-restart --allow-unknown-static-fields --log-format %Y-%m-%dT%T.%fZ   %l      envoy %n %g:%#  %v      thread=%t -l warning --component-log-level misc:error --concurrency 2]
2024-04-26T09:02:27.248457Z     info    xdsproxy        connected to upstream XDS server: istiod.istio-system.svc:15012
2024-04-26T09:02:27.914924Z     info    cache   generated new workload certificate      latency=1.665180244s ttl=23h59m59.085094061s
2024-04-26T09:02:27.915473Z     info    cache   Root cert has changed, start rotating root cert
2024-04-26T09:02:27.915755Z     info    ads     XDS: Incremental Pushing:0 ConnectedEndpoints:0 Version:
2024-04-26T09:02:27.916230Z     info    cache   returned workload trust anchor from cache       ttl=23h59m59.083777765s
2024-04-26T09:02:28.001666Z     info    ads     ADS: new connection for node:ml-pipeline-ui-artifact-6957c87b97-6pfgm.<profile-namespace>-1
2024-04-26T09:02:28.002704Z     info    cache   returned workload certificate from cache        ttl=23h59m58.997305435s
2024-04-26T09:02:28.003298Z     info    ads     SDS: PUSH request for node:ml-pipeline-ui-artifact-6957c87b97-6pfgm.<profile-namespace> resources:1 size:4.0kB resource:default
2024-04-26T09:02:28.005696Z     info    ads     ADS: new connection for node:ml-pipeline-ui-artifact-6957c87b97-6pfgm.<profile-namespace>-2
2024-04-26T09:02:28.006390Z     info    cache   returned workload trust anchor from cache       ttl=23h59m58.993619766s
2024-04-26T09:02:28.006901Z     info    ads     SDS: PUSH request for node:ml-pipeline-ui-artifact-6957c87b97-6pfgm.<profile-namespace> resources:1 size:1.1kB resource:ROOTCA
2024-04-26T09:02:29.019693Z     info    Readiness succeeded in 2.907731814s
2024-04-26T09:02:29.020295Z     info    Envoy proxy is ready
2024-04-26T09:33:16.621913Z     info    xdsproxy        connected to upstream XDS server: istiod.istio-system.svc:15012
2024-04-26T09:38:31.992878Z     info    xdsproxy        connected to upstream XDS server: istiod.istio-system.svc:15012

When I trigger the 403 responses by clicking on the Visualization tab, I get these logs:

Namespace: kubeflow
Pod: ml-pipeline-7c98f8cc96-jgdgk

I0426 10:37:31.479597       7 interceptor.go:29] /api.RunService/GetRun handler starting
I0426 10:37:31.511265       7 util.go:360] Getting user identity...
I0426 10:37:31.511312       7 util.go:370] User: user1@example.eu, ResourceAttributes: &ResourceAttributes{Namespace:example-test,Verb:get,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:testing-out-metrics-and-their-visualizations-znzvh,}
I0426 10:37:31.511344       7 util.go:371] Authorizing request...
I0426 10:37:31.516078       7 util.go:378] Authorized user 'user1@example.eu': &ResourceAttributes{Namespace:example-test,Verb:get,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:testing-out-metrics-and-their-visualizations-znzvh,}
I0426 10:37:31.521141       7 interceptor.go:37] /api.RunService/GetRun handler finished

Namespace: kubeflow
Pod: ml-pipeline-ui-84846c849d-9z2c2

GET /pipeline/apis/v1beta1/runs/0d7c8596-2b2d-443b-9d9d-19698989400b
Proxied request:  /apis/v1beta1/runs/0d7c8596-2b2d-443b-9d9d-19698989400b
GET /pipeline/artifacts/get?source=s3&namespace=example-test&bucket=my-bucket&key=artifacts%2Fexample-test%2Ftesting-out-metrics-and-their-visualizations-znzvh%2F2024%2F04%2F26%2Ftesting-out-metrics-and-their-visualizations-znzvh-3230888702%2Fmlpipeline-ui-metadata.tgz
[HPM] Router new target: /artifacts -> "http://ml-pipeline-ui-artifact.example-test:80"
Proxied artifact request:  /pipeline/artifacts/get?source=s3&bucket=my-bucket&key=artifacts%2Fexample-test%2Ftesting-out-metrics-and-their-visualizations-znzvh%2F2024%2F04%2F26%2Ftesting-out-metrics-and-their-visualizations-znzvh-3230888702%2Fmlpipeline-ui-metadata.tgz

And from my nginx ingress controller:

169.150.196.10 - - [26/Apr/2024:10:37:31 +0000] "GET /pipeline/apis/v1beta1/runs/0d7c8596-2b2d-443b-9d9d-19698989400b HTTP/2.0" 200 13349 "https://kubeflow.example.eu/pipeline/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:124.0) Gecko/20100101 Firefox/124.0" 1683 0.062 [deploykf-istio-gateway-deploykf-gateway-https] [] 100.96.1.253:443 13362 0.063 200 1fe92503e9f5e71c554d0318d634a958
169.150.196.10 - - [26/Apr/2024:10:37:31 +0000] "GET /pipeline/artifacts/get?source=s3&namespace=example-test&bucket=my-bucket&key=artifacts%example-test%2Ftesting-out-metrics-and-their-visualizations-znzvh%2F2024%2F04%2F26%2Ftesting-out-metrics-and-their-visualizations-znzvh-3230888702%2Fmlpipeline-ui-metadata.tgz HTTP/2.0" 403 19 "https://kubeflow.example.eu/pipeline/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:124.0) Gecko/20100101 Firefox/124.0" 1837 0.019 [deploykf-istio-gateway-deploykf-gateway-https] [] 100.96.1.253:443 19 0.019 403 d5901e61c4c05e28bdf092ceb1301b5a
169.150.196.10 - - [26/Apr/2024:10:37:31 +0000] "POST /ml_metadata.MetadataStoreService/GetEventsByExecutionIDs HTTP/2.0" 200 102 "https://kubeflow.example.eu/pipeline/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:124.0) Gecko/20100101 Firefox/124.0" 1691 0.109 [deploykf-istio-gateway-deploykf-gateway-https] [] 100.96.1.253:443 113 0.108 200 c9c1912837870f22c72f2a4f0292c662
169.150.196.10 - - [26/Apr/2024:10:37:31 +0000] "POST /ml_metadata.MetadataStoreService/GetArtifactTypes HTTP/2.0" 200 134 "https://kubeflow.example.eu/pipeline/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:124.0) Gecko/20100101 Firefox/124.0" 1682 0.139 [deploykf-istio-gateway-deploykf-gateway-https] [] 100.96.1.253:443 145 0.139 200 26a539adc4152ec301e2635525e1f167
169.150.196.10 - - [26/Apr/2024:10:37:31 +0000] "POST /ml_metadata.MetadataStoreService/GetArtifactsByID HTTP/2.0" 200 1281 "https://kubeflow.example.eu/pipeline/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:124.0) Gecko/20100101 Firefox/124.0" 1689 0.146 [deploykf-istio-gateway-deploykf-gateway-https] [] 100.96.1.253:443 1293 0.146 200 ca3642f2cfb511630cc7023e66b272f2

@bobbeeke (Author)

I'm not sure whether the issue is related to my nginx ingress setup in front of the Istio gateway, but here is the definition:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-production
    nginx.ingress.kubernetes.io/proxy-body-size: 100m
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/proxy-ssl-name: "kubeflow.example.eu"
    nginx.ingress.kubernetes.io/proxy-ssl-server-name: "on"
    nginx.ingress.kubernetes.io/proxy-ssl-secret: "deploykf-istio-gateway/deploykf-nginx-gateway-tls"
  name: deploykf-nginx-gateway
  namespace: deploykf-istio-gateway
spec:
  ingressClassName: nginx
  rules:
    - host: "kubeflow.example.eu"
      http:
        paths:
          - backend:
              service:
                name: deploykf-gateway
                port:
                  name: https
            path: /
            pathType: Prefix 
    - host: "*.kubeflow.example.eu"
      http:
        paths:
          - backend:
              service:
                name: deploykf-gateway
                port:
                  name: https
            path: /
            pathType: Prefix
  tls:
    - hosts:
      - kubeflow.example.eu
      secretName: deploykf-nginx-gateway-tls

@thesuperzapper (Member)

@bobbeeke because you said that creating the following authorization policy fixed the issue:

## !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
## WARNING DO NOT USE, DISABLES AUTH ##
## !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-all
  ## NOTE: I assume you meant `example-test` here?
  namespace: my-namespace
spec:
  rules:
    - {}

You must have made some custom AuthorizationPolicy that prevents requests from ml-pipeline-ui (in the kubeflow namespace) to ml-pipeline-ui-artifact (in the example-test namespace).

The places I would check are:

  • Check for any extra AuthorizationPolicies in the example-test namespace; there should only be 6:
    • ml-pipeline-visualizationserver
    • ns-owner-access-istio
    • ns-owner-access-istio--override
    • user-user1-example-eu-clusterrole-edit
    • user-user2-example-eu-clusterrole-edit
    • user-user3-example-eu-clusterrole-edit
  • Check for any extra AuthorizationPolicies in the istio-system namespace; there should be 0:
    • NONE

@bobbeeke (Author) commented Apr 27, 2024

"## NOTE: I assume you meant example-test here?" -> Yes, sorry

I certainly did not mess with authorization policies, except for my one-time "allow-all" test to exclude other possible causes. I also performed a full uninstall and a clean deployment.

I will dive into Istio authorization and troubleshooting myself (I have little experience with it so far) to find out what is causing this in my environment. Thanks for the pointers so far, they help a lot!

Current status:

If I add this test policy to my profile namespace I can fetch from S3:

## !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
## WARNING DO NOT USE, MESSES WITH AUTH ##
## !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: test-policy-bob
  namespace: example-test
spec:
  rules:
  - to:
    - operation:
        methods: ["GET"]
        paths: ["/pipeline/artifacts/*"]
  selector:
    matchLabels:
      app: ml-pipeline-ui-artifact

So, as far as I understand, a GET request to ml-pipeline-ui-artifact is blocked for some unknown reason.
If I, for instance, change "GET" to "POST" in the above policy, I get a 403 again.

I still lack some knowledge on this topic but will try to learn and narrow it down further. I will share my findings here.
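
For context on why flipping the method changes the result (general Istio ALLOW semantics, nothing deployKF-specific): once any ALLOW policy selects a workload, a request is denied unless it matches at least one rule of at least one ALLOW policy, and within a single rule the from, to, and when clauses must ALL match. A minimal annotated sketch (hypothetical policy, for illustration only):

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: illustration-only   ## hypothetical name
  namespace: example-test
spec:
  selector:
    matchLabels:
      app: ml-pipeline-ui-artifact
  rules:
    ## rule 1: from AND to AND when must all match for this rule to allow
    - from:
        - source:
            principals: ["cluster.local/ns/kubeflow/sa/ml-pipeline-ui"]
      to:
        - operation:
            methods: ["GET"]
      when:
        - key: request.headers[kubeflow-userid]
          values: ["user1@example.eu"]
    ## rule 2: a second rule is OR'ed with the first
    - to:
        - operation:
            paths: ["/healthz"]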

@thesuperzapper (Member)

@bobbeeke we can probably work it out if you give the full YAML of all the AuthorizationPolicies in the example-test namespace (if you sanitize them, please make sure you use the same sanitized value when the source value is the same).

@bobbeeke (Author) commented Apr 27, 2024

Sure, here they are:

---
- apiVersion: security.istio.io/v1
  kind: AuthorizationPolicy
  metadata:
    annotations:
      argocd.argoproj.io/compare-options: ""
      argocd.argoproj.io/sync-options: ""
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"security.istio.io/v1beta1","kind":"AuthorizationPolicy","metadata":{"annotations":{"argocd.argoproj.io/compare-options":"","argocd.argoproj.io/sync-options":""},"labels":{"app.kubernetes.io/instance":"kf-tools--pipelines"},"name":"ml-pipeline-visualizationserver","namespace":"example-test"},"spec":{"rules":[{"from":[{"source":{"principals":["cluster.local/ns/kubeflow/sa/ml-pipeline"]}}]}],"selector":{"matchLabels":{"app":"ml-pipeline-visualizationserver"}}}}
    creationTimestamp: "2024-04-26T09:02:37Z"
    generation: 14
    labels:
      app.kubernetes.io/instance: kf-tools--pipelines
    name: ml-pipeline-visualizationserver
    namespace: example-test
    resourceVersion: "42385275"
    uid: 3920ccd6-74f2-429d-83b6-16652b298dc6
  spec:
    rules:
    - from:
      - source:
          principals:
          - cluster.local/ns/kubeflow/sa/ml-pipeline
    selector:
      matchLabels:
        app: ml-pipeline-visualizationserver
---
- apiVersion: security.istio.io/v1
  kind: AuthorizationPolicy
  metadata:
    annotations:
      role: admin
      user: admin@example.com
    creationTimestamp: "2024-04-26T08:58:10Z"
    generation: 7
    name: ns-owner-access-istio
    namespace: example-test
    ownerReferences:
    - apiVersion: kubeflow.org/v1
      blockOwnerDeletion: true
      controller: true
      kind: Profile
      name: example-test
      uid: 9e705b85-91d9-4c56-861a-a739dae2b1c2
    resourceVersion: "42051914"
    uid: ddae1583-7faf-44a4-b69c-80ec21f887b0
  spec:
    rules:
    - when:
      - key: request.headers[kubeflow-userid]
        values:
        - admin@example.com
    - when:
      - key: source.namespace
        values:
        - example-test
    - to:
      - operation:
          paths:
          - /healthz
          - /metrics
          - /wait-for-drain
    - from:
      - source:
          principals:
          - cluster.local/ns/kubeflow/sa/notebook-controller-service-account
      to:
      - operation:
          methods:
          - GET
          paths:
          - '*/api/kernels'
---
- apiVersion: security.istio.io/v1
  kind: AuthorizationPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"security.istio.io/v1beta1","kind":"AuthorizationPolicy","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instance":"dkf-core--deploykf-profiles-generator","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"deploykf-profiles-generator","helm.sh/chart":"deploykf-profiles-generator-1.0.0"},"name":"ns-owner-access-istio--override","namespace":"example-test"},"spec":{"action":"DENY","rules":[{"from":[{"source":{"notPrincipals":["cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway","cluster.local/ns/kubeflow/sa/ml-pipeline-ui"]}}],"when":[{"key":"request.headers[kubeflow-userid]","values":["admin@example.com"]}]}]}}
    creationTimestamp: "2024-04-26T08:58:12Z"
    generation: 5
    labels:
      app.kubernetes.io/instance: dkf-core--deploykf-profiles-generator
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: deploykf-profiles-generator
      helm.sh/chart: deploykf-profiles-generator-1.0.0
    name: ns-owner-access-istio--override
    namespace: example-test
    resourceVersion: "42389285"
    uid: cb8ea6a5-ca07-4809-9583-65a85ac7af73
  spec:
    action: DENY
    rules:
    - from:
      - source:
          notPrincipals:
          - cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway
          - cluster.local/ns/kubeflow/sa/ml-pipeline-ui
      when:
      - key: request.headers[kubeflow-userid]
        values:
        - admin@example.com
---
- apiVersion: security.istio.io/v1
  kind: AuthorizationPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"security.istio.io/v1beta1","kind":"AuthorizationPolicy","metadata":{"annotations":{"role":"edit","user":"user1@example.eu"},"labels":{"app.kubernetes.io/instance":"dkf-core--deploykf-profiles-generator","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"deploykf-profiles-generator","helm.sh/chart":"deploykf-profiles-generator-1.0.0"},"name":"user-user1-example-eu-clusterrole-edit","namespace":"example-test"},"spec":{"rules":[{"from":[{"source":{"principals":["cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway","cluster.local/ns/kubeflow/sa/ml-pipeline-ui"]}}],"when":[{"key":"request.headers[kubeflow-userid]","values":["user1@example.eu"]}]}]}}
      role: edit
      user: user1@example.eu
    creationTimestamp: "2024-04-26T08:58:12Z"
    generation: 30
    labels:
      app.kubernetes.io/instance: dkf-core--deploykf-profiles-generator
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: deploykf-profiles-generator
      helm.sh/chart: deploykf-profiles-generator-1.0.0
    name: user-user1-example-eu-clusterrole-edit
    namespace: example-test
    resourceVersion: "42429864"
    uid: 88d3c64e-7b24-4b6b-a727-60d5d1a9553b
  spec:
    rules:
    - from:
      - source:
          principals:
          - cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway
          - cluster.local/ns/kubeflow/sa/ml-pipeline-ui
      when:
      - key: request.headers[kubeflow-userid]
        values:
        - user1@example.eu
---
- apiVersion: security.istio.io/v1
  kind: AuthorizationPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"security.istio.io/v1beta1","kind":"AuthorizationPolicy","metadata":{"annotations":{"role":"edit","user":"user2@example.eu"},"labels":{"app.kubernetes.io/instance":"dkf-core--deploykf-profiles-generator","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"deploykf-profiles-generator","helm.sh/chart":"deploykf-profiles-generator-1.0.0"},"name":"user-user2-example-eu-clusterrole-edit","namespace":"example-test"},"spec":{"rules":[{"from":[{"source":{"principals":["cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway","cluster.local/ns/kubeflow/sa/ml-pipeline-ui"]}}],"when":[{"key":"request.headers[kubeflow-userid]","values":["user2@example.eu"]}]}]}}
      role: edit
      user: user2@example.eu
    creationTimestamp: "2024-04-26T08:58:12Z"
    generation: 1
    labels:
      app.kubernetes.io/instance: dkf-core--deploykf-profiles-generator
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: deploykf-profiles-generator
      helm.sh/chart: deploykf-profiles-generator-1.0.0
    name: user-user2-example-eu-clusterrole-edit
    namespace: example-test
    resourceVersion: "41640621"
    uid: 2fadbe02-66cf-4de8-b145-3fea9a1e8b4b
  spec:
    rules:
    - from:
      - source:
          principals:
          - cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway
          - cluster.local/ns/kubeflow/sa/ml-pipeline-ui
      when:
      - key: request.headers[kubeflow-userid]
        values:
        - user2@example.eu
---
- apiVersion: security.istio.io/v1
  kind: AuthorizationPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"security.istio.io/v1beta1","kind":"AuthorizationPolicy","metadata":{"annotations":{"role":"edit","user":"user3@example.eu"},"labels":{"app.kubernetes.io/instance":"dkf-core--deploykf-profiles-generator","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"deploykf-profiles-generator","helm.sh/chart":"deploykf-profiles-generator-1.0.0"},"name":"user-user3-example-eu-clusterrole-edit","namespace":"example-test"},"spec":{"rules":[{"from":[{"source":{"principals":["cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway","cluster.local/ns/kubeflow/sa/ml-pipeline-ui"]}}],"when":[{"key":"request.headers[kubeflow-userid]","values":["user3@example.eu"]}]}]}}
      role: edit
      user: user3@example.eu
    creationTimestamp: "2024-04-26T08:58:12Z"
    generation: 1
    labels:
      app.kubernetes.io/instance: dkf-core--deploykf-profiles-generator
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: deploykf-profiles-generator
      helm.sh/chart: deploykf-profiles-generator-1.0.0
    name: user-user3-example-eu-clusterrole-edit
    namespace: example-test
    resourceVersion: "41640623"
    uid: c5c19f93-fcd5-468c-9f49-a99d31607ffa
  spec:
    rules:
    - from:
      - source:
          principals:
          - cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway
          - cluster.local/ns/kubeflow/sa/ml-pipeline-ui
      when:
      - key: request.headers[kubeflow-userid]
        values:
        - user3@example.eu

@thesuperzapper (Member)

@bobbeeke

  1. Which email account are you logging in as?
  2. Are there any AuthorizationPolicies in the istio-system namespace?
    • If so, please post their YAML.
  3. Does the ml-pipeline-ui-artifact-xxxx pod (in example-test) have an istio sidecar container? (See the injection-label sketch after this list.)
    • If not, delete it and see if the new pod has a sidecar.
  4. Does the ml-pipeline-ui-xxxx pod (in kubeflow) have an istio sidecar container?
  5. Does the central-dashboard-xxxx pod (in deploykf-dashboard) have an istio sidecar container?
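
For reference, automatic sidecar injection is normally controlled by a label on the workload's namespace (standard Istio behavior; Kubeflow profile namespaces are expected to carry it already, so treat this as something to verify rather than apply):

apiVersion: v1
kind: Namespace
metadata:
  name: example-test
  labels:
    istio-injection: enabled   ## standard Istio injection label; existing pods must be recreated to pick it up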

@bobbeeke (Author) commented May 13, 2024

OK, I had some time again to take a look.

1: My own existing email: user1@example.eu
2: No
3: Yes: istio-proxy
4: Yes: istio-proxy
5: Yes: istio-proxy

I also tried:

  • Upgrading Istio -> no luck, still the same issue.
  • Getting rid of my ingress-nginx proxy and accessing the cluster directly through the Istio gateway -> managed to do this, but no luck, still the same issue.

After some more trying, I think I got somewhat closer to the root of the problem, but I am still unsure whether my solution is acceptable.

The ns-owner-access-istio policy (generated) looks like this:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  annotations:
    role: admin
    user: admin@example.com
  creationTimestamp: "2024-05-08T08:33:38Z"
  generation: 2
  name: ns-owner-access-istio
  namespace: example-test
  ownerReferences:
  - apiVersion: kubeflow.org/v1
    blockOwnerDeletion: true
    controller: true
    kind: Profile
    name: example-test
    uid: 9e705b85-91d9-4c56-861a-a739dae2b1c2
  resourceVersion: "51547541"
  uid: 95451200-168b-49a9-abce-762ef747f7d2
spec:
  rules:
  - when:
    - key: request.headers[kubeflow-userid]
      values:
      - admin@example.com
  - when:
    - key: source.namespace
      values:
      - example-test
  - to:
    - operation:
        paths:
        - /healthz
        - /metrics
        - /wait-for-drain
  - from:
    - source:
        principals:
        - cluster.local/ns/kubeflow/sa/notebook-controller-service-account
    to:
    - operation:
        methods:
        - GET
        paths:
        - '*/api/kernels'

I suppose the clusterrole-edit policy for my user should now make sure that fetching from S3 works.
This generated "user-user1-example-eu-clusterrole-edit" policy looks like this:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"security.istio.io/v1beta1","kind":"AuthorizationPolicy","metadata":{"annotations":{"role":"edit","user":"user1@example.eu"},"labels":{"app.kubernetes.io/instance":"dkf-core--deploykf-profiles-generator","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"deploykf-profiles-generator","helm.sh/chart":"deploykf-profiles-generator-1.0.0"},"name":"user-user1-example-eu-clusterrole-edit","namespace":"example-test"},"spec":{"rules":[{"from":[{"source":{"principals":["cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway","cluster.local/ns/kubeflow/sa/ml-pipeline-ui"]}}],"when":[{"key":"request.headers[kubeflow-userid]","values":["user1@example.eu"]}]}]}}
    role: edit
    user: user1@example.eu
  creationTimestamp: "2024-05-08T09:43:59Z"
  generation: 7
  labels:
    app.kubernetes.io/instance: dkf-core--deploykf-profiles-generator
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: deploykf-profiles-generator
    helm.sh/chart: deploykf-profiles-generator-1.0.0
  name: user-user1-example-eu-clusterrole-edit
  namespace: example-test
  resourceVersion: "51657299"
  uid: f0605b72-4e72-4766-b8b6-1de4193e8f2e
spec:
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/deploykf-istio-gateway/sa/deploykf-gateway
        - cluster.local/ns/kubeflow/sa/ml-pipeline-ui
    when:
    - key: request.headers[kubeflow-userid]
      values:
      - user1@example.eu

But somehow it seems to be too restrictive for the S3 fetching to work in our setup.
I still have no clue why, as troubleshooting is hard.
It won't accept traffic from the source principal "cluster.local/ns/kubeflow/sa/ml-pipeline-ui", even though everything seems right there.

When I manually add an extra policy allowing traffic from ml-pipeline-ui (as restrictive as possible), my problem is solved:

## !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
## WARNING DO NOT USE, MESSES WITH AUTH ##
## !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: test-work-around-policy
  namespace: example-test
spec:
  rules:
  - when:
    - key: request.headers[kubeflow-userid]
      values:
      - user1@example.eu
    to:
    - operation:
        methods: ["GET"]
        paths: ["/pipeline/artifacts/*"]
    from:
    - source:
        ipBlocks: ["100.96.0.0/16"]                 ## covers our cluster podCIDRs
  selector:
    matchLabels:
      app: ml-pipeline-ui-artifact

I understand this policy may not be restrictive enough for best-practice production, but for us it is good enough for now to proceed with testing. Hopefully future releases will fix the issue without the need for this workaround.

@bobbeeke (Author) commented May 15, 2024

@thesuperzapper

Some extra info that might be of interest if you still want to analyze this further.

If I enable Istio debug logging on my ml-pipeline-ui-artifact pod and perform a request, I notice that the 'x-forwarded-client-cert' header is missing. I suspect that without this header Istio is not able to validate from.source.namespaces and from.source.principals, and thus denies the traffic (?).

$ istioctl proxy-config log deploy/ml-pipeline-ui-artifact --level "rbac:debug"

Log of the 403 request:

2024-05-15T05:27:10.578913Z     debug   envoy rbac external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:117        checking request: requestedServerName: , sourceIP: 100.96.1.223:38960, directRemoteIP: 100.96.1.223:38960, remoteIP: 100.96.1.236:0,localAddress: 100.96.1.248:3000, ssl: none, headers: ':authority', 'ml-pipeline-ui-artifact.example-test:80'
':path', '/pipeline/artifacts/get?source=s3&bucket=my-bucket-example-eu&key=artifacts%2Fexample-test%2Ftesting-out-metrics-and-their-visualizations-znzvh%2F2024%2F04%2F26%2Ftesting-out-metrics-and-their-visualizations-znzvh-3230888702%2Fmlpipeline-ui-metadata.tgz'
':method', 'GET'
':scheme', 'https'
'x-b3-sampled', '0'
'x-b3-parentspanid', '776f919417aac8da'
'x-b3-spanid', 'b4649b6fc9bc73ae'
'x-b3-traceid', 'e875051608ea5013776f919417aac8da'
'x-envoy-attempt-count', '1'
'kubeflow-userid', 'user1@example.eu'
'x-auth-request-email', 'user1@example.eu'
'authorization', 'Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IjYxYmM3ZTYzNTM1NDc3MGY4YzliMDIwMGI2ZmFkNGQzNDAwNGQ2MWEifQ.eyJpc3MiOiJodHRwczovL2t1YmVmbG93LnJldGNhZC5ldSytrXgiLCJzdWIiOiJDaEppYjJKaVpXVnJaVUJ5WlhSallXUXVaWFVTQld4dlkyRnMiLCJhdWQiOiJvYXV0aDItcHJveHkiLCJleHAiOjE3MTU3NTQzNjAsImlhdCI6MTcxNTc1MDc2MCwibm9uY2UiOiI2MkwxVmFlcEcxWWtaeGVZa1NjRWpIYXFLNmFJVWk2NnVqdFVwX25PRTRNIiwiYXRfaGFzaCI6IldmVDJKQUFDSm1pSTdzZEhpblhfWXciLCJlbWFpbCI6ImJvYmJlZWtlQHJldGNhZC5ldSIsImVtYWlsX3ZlcmlmaWVkIjp0cnVlLCJuYW1lIjoiYm9iYmVla2VAcmV0Y2FkLmV1In0.xSjtl-vjjNmsXcSKGjuQLFUFACwOsJiO_muMcJgNjfwQKR3kwVtH7JJBTuXPCrDhRUjx3ip1E8ibi3YcU0edTMOmgcvQMEXEv4LCqAZbNgzhODlwqC8Ej0VXtH3npe_uUegaOVt7R0S098J5kR7_UDDvwI8y5ZnJ2QTGgySZHS248Q9fsKGW6SsP8bKQ01CuaxzqB4b6AxVx3lNrL2jcWIMLL7jR7hHmeFy7PYKQ6vevwKNpviOGXJT_rZpbQrB8-qn9BnruN_-vScj7_jWlOS2CDgfEudDKzcNqjS_lMLUshlfZvXohf_AVeWFzOjr86AUKninSEpYAMp-BIHll8Q'
'x-request-id', '636c19b0-7af1-46b6-8605-adcf7b0f1a40'
'x-envoy-external-address', '100.96.1.236'
'x-forwarded-proto', 'https'
'x-forwarded-for', '100.96.1.236'
'cookie', '__Secure-_deploykf_token=0aROKWMn6svHr4kjBaxlgCiruc5oilCvlCBCJ7viokv0yqgXYkRXqmsaVrn67aD5JfvEQkJ89VZoJsYAFt39sWnuEA1sI5XeqWa14HasY2UXYqbYCEHNRz8tK4V09HA0OAOXU8PK8SBbVA1Q2gFTTfj_Hf9xTr_wVcEshbx3cxAp5z7JSOexMe-70wJQtPT0XwEOljJjwDDmL7t6Jq8_9R051ikQE6pjeqQrKeg-Eszj19wx9AGnLZBwUoW3lO16vPFnslpDgHz-lGXW4jgFI0_cBmjjhw7HLOgJQxAzf3Mk1S6ecICSWulx-pKu5eYZ7yggghjoklz3DibAYKVBaVWoyKGFP5eP_j_gmZ3b3NOw0NFknLJOsCGe5uWwcOCUVUoFcyvG13WBu_55618EYZUzkH48JYXWqHNFqW7-V-0EyS1SDCUdEg81ktXgy7gP0Oj3y8SnnKKj487RRHSU2JO0ajDKYXInBI2BgCWbShUN8V0FPYPNevmjaGiI46QaNQhXFZKlzwrekJr11te80StQU_0OyxFWU3BikznIBiMatM_4On9eg9S0I0fyviDpMOgOdtkV7ND2pRXZ-LPMVj3DBFs1eHDDVG2zlAF1PYhZspO8JUn9RCExgrmYC0j077t0DZygNiFbw3Dew0OQEGLuGEPy2L7cUM57xsKjdMdSn9iJwYvHuvLQU3s4cVCYjzNidht9TnjjaQMWfGGLXWTtnXB5SAy4p4qhDZmU96kBwTeAHfp2fWm0ZnpDwXqeAK8s8W69O9OPLrTZqwWVG_NE2HbZ075Q6Uc21_PXm0t9cuaaPLwTZM5O3SfSJR5nKCjN4j82sqgo8EM4V60qr1fKEWntMZKrYOW9pgdRLbMtF56mmnShUN-R6Nkg_FUwZa5RrTszKvmJ2w17_ubmBrP-OwUtZKjtiUmggy9esfIMNFzzdIZDkcHWQVEWXn_WarD3B97_hXXqV9zoTt0Y4KoJpjhpk3D7TF4d5-_KgkJYxKJrnmPNEKIN2cR19pi2dJNcdhxgiLgt1lSuEedYblO2IdJBmIXQS7OVz1ZLkoScr77QrUutyzJ8g40LjTHlvD60l3rmEmZd8NArmVCof_EmziinBNnPajrOkl_8Es8UIUGdme0JQi8Y-vk7N4e1t_mhJWuUpZFMznhHYN9VMuOmyi0tW5PfLunHqhPTsZnCupYJBFuZ-Jbtf2n8a6tTw-ifkjJSP0Jl9xZjPvourneD2wIccPTSS9OnQ8lCoEfwXdx5PUMAyF79pXp5OGLxlfMFCHRJhXL9OB8LUbKtGdflPbVJOaPR44LFClLjsIsJj3HAK5bjG6S4ifOkagzSLiDHvuKAChs4Icrb_Z9n1fzqtrecP6Wz5LeuIZovJgjy51PxVB9RNnnYWvh5dG6oPvE6ifkdOvvPR0yt5ESFdXKzOE8k6KBOebd9uOQQgY-xbiawRa0rOf-uUjPsLWGoUC-IpE8H5qdG_M5K9m5SXKS4zJxMHe3iWr9zpMlgOftQ02MavvqjuX6hEFAj2k26AZ1MufhyB_qYLLZrSRJPgHEEkpbrNYiqf8rd9oqKxtI-S9OCZRjAoB2amOPpATV-e2I5eBOI09LjHcr-SGPyuCw4_bz506f8_wj2igfDbwIAB7bzSdm_TTq5vjX-BKRR4FCiuqfddzBboEbhOU-CoaFNXXilRvZpYVcHJz3_AgJiWu01AQ2iK1w-sHJ4Jh7389P7woer6OB2UW9zotEGExCmDUmXxmCtyEXJqSzK-sXdULlckiS--6O8EtfP5HhwdEDGwGKltKUR3yHXWjcB7Cq97S-vSbwbzPapWIEdfXhkOsXjm4b-Cl-ZnfGEtgJGLtA2xq9H88H-9iAmObYmFYUmM-1gv_vEA3R4J6OCv1D16P7JMRF-SCcdZfhcBzi8WhjH5DbOkMsoMmxTCqgTROKcKnLVn61tIQ==|1715750760|aeku5o-L3BqStO6KvCfSza-fanAcKvccNQyUH1vrnTw='
'te', 'trailers'
'sec-gpc', '1'
'sec-fetch-site', 'same-origin'
'sec-fetch-mode', 'cors'
'sec-fetch-dest', 'empty'
'dnt', '1'
'referer', 'https://kubeflow.example.eu/pipeline/'
'accept-encoding', 'gzip, deflate, br'
'accept-language', 'en-US,en;q=0.5'
'accept', '*/*'
'user-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:125.0) Gecko/20100101 Firefox/125.0'
, dynamicMetadata:      thread=30
2024-05-15T05:27:10.578971Z     debug   envoy rbac external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:157        enforced allowed, matched policy none   thread=30
2024-05-15T05:27:10.578993Z     debug   envoy rbac external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:117        checking request: requestedServerName: , sourceIP: 100.96.1.223:38960, directRemoteIP: 100.96.1.223:38960, remoteIP: 100.96.1.236:0,localAddress: 100.96.1.248:3000, ssl: none, headers: ':authority', 'ml-pipeline-ui-artifact.example-test:80'
':path', '/pipeline/artifacts/get?source=s3&bucket=my-bucket-example-eu&key=artifacts%2Fexample-test%2Ftesting-out-metrics-and-their-visualizations-znzvh%2F2024%2F04%2F26%2Ftesting-out-metrics-and-their-visualizations-znzvh-3230888702%2Fmlpipeline-ui-metadata.tgz'
':method', 'GET'
':scheme', 'https'
'x-b3-sampled', '0'
'x-b3-parentspanid', '776f919417aac8da'
'x-b3-spanid', 'b4649b6fc9bc73ae'
'x-b3-traceid', 'e875051608ea5013776f919417aac8da'
'x-envoy-attempt-count', '1'
'kubeflow-userid', 'user1@example.eu'
'x-auth-request-email', 'user1@example.eu'
'authorization', 'Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IjYxYmM3ZTYzNTM1NDc3MGY4YzliMDIwMGI2ZmFkNGQzNDAwNGQ2MWEifQ.eyJpc3MiOiJodHRwczovL2t1YmVmbG93LnJldGNhZC5ldS9kZXgiLCJzdWIiOiJDaEppYjJKaVpXVnJaVUJ5WlhSallXUXVaWFVTQld4dlkyRnMiLCJhdWQiOiJvYXV0aDItcHJveHkidfeleHAiOjE3MTU3NTQzNjfgjthedCI6MTcxNTc1MDc2MCwibm9uY2UiOiI2MkwxVmFlcEcxWWtaeGVZa1NjRWpIYXFLNmFJVWk2NnVqdFVwX25PRTRNIiwiYXRfaGFzaCI6IldmVDJKQUFDSm1pSTdzZEhpblhfWXciLCJlbWFpbCI6ImJvYmJlZWtlQHJldGNhZC5ldSIsImVtYWlsX3ZlcmlmaWVkIjp0cnVlLCJuYW1lIjoiYm9iYmVla2VAcmV0Y2FkLmV1In0.xSjtl-vjjNmsXcSKGjuQLFUFACwOsJiO_muMcJgNjfwQKR3kwVtH7JJBTuXPCrDhRUjx3ip1E8ibi3YcU0edTMOmgcvQMEXEv4LCqAZbNgzhODlwqC8Ej0VXtH3npe_uUegaOVt7R0S098J5kR7_UDDvwI8y5ZnJ2QTGgySZHS248Q9fsKGW6SsP8bKQ01CuaxzqB4b6AxVx3lNrL2jcWIMLL7jR7hHmeFy7PYKQ6vevwKNpviOGXJT_rZpbQrB8-qn9BnruN_-vScj7_jWlOS2CDgfEudDKzcNqjS_lMLUshlfZvXohf_AVeWFzOjr86AUKninSEpYAMp-BIHll8Q'
'x-request-id', '636c19b0-7af1-46b6-8605-adcf7b0f1a40'
'x-envoy-external-address', '100.96.1.236'
'x-forwarded-proto', 'https'
'x-forwarded-for', '100.96.1.236'
'cookie', '__Secure-_deploykf_token=0aROKWMn6svHr4kjBaxlgCiruc5oilCvlCBCJ7viokv0yqgXYkRXqmsaVrn67aD5JfvEQkJ89VZoJsYAFt39sWnuEA1sI5XeqWa14HasY2UXYqbYCEHNRz8tK4V09HA0fdfrU8PK8SBbVA1Q2gFTTfj_Hf9xTr_wVcEshbx3cxAp5z7JSOexMe-70wJQtPT0XwEOljJjwDDmL7t6Jq8_9R051ikQE6pjeqQrKeg-Eszj19wx9AGnLZBwUoW3lO16vPFnslpDgHz-lGXW4jgFI0_cBmjjhw7HLOgJQxAzf3Mk1S6ecICSWulx-pKu5eYZ7yggghjoklz3DibAYKVBaVWoyKGFP5eP_j_gmZ3b3NOw0NFknLJOsCGe5uWwcOCUVUoFcyvG13WBu_55618EYZUzkH48JYXWqHNFqW7-V-0EyS1SDCUdEg81ktXgy7gP0Oj3y8SnnKKj487RRHSU2JO0ajDKYXInBI2BgCWbShUN8V0FPYPNevmjaGiI46QaNQhXFZKlzwrekJr11te80StQU_0OyxFWU3BikznIBiMatM_4On9eg9S0I0fyviDpMOgOdtkV7ND2pRXZ-LPMVj3DBFs1eHDDVG2zlAF1PYhZspO8JUn9RCExgrmYC0j077t0DZygNiFbw3Dew0OQEGLuGEPy2L7cUM57xsKjdMdSn9iJwYvHuvLQU3s4cVCYjzNidht9TnjjaQMWfGGLXWTtnXB5SAy4p4qhDZmU96kBwTeAHfp2fWm0ZnpDwXqeAK8s8W69O9OPLrTZqwWVG_NE2HbZ075Q6Uc21_PXm0t9cuaaPLwTZM5O3SfSJR5nKCjN4j82sqgo8EM4V60qr1fKEWntMZKrYOW9pgdRLbMtF56mmnShUN-R6Nkg_FUwZa5RrTszKvmJ2w17_ubmBrP-OwUtZKjtiUmggy9esfIMNFzzdIZDkcHWQVEWXn_WarD3B97_hXXqV9zoTt0Y4KoJpjhpk3D7TF4d5-_KgkJYxKJrnmPNEKIN2cR19pi2dJNcdhxgiLgt1lSuEedYblO2IdJBmIXQS7OVz1ZLkoScr77QrUutyzJ8g40LjTHlvD60l3rmEmZd8NArmVCof_EmziinBNnPajrOkl_8Es8UIUGdme0JQi8Y-vk7N4e1t_mhJWuUpZFMznhHYN9VMuOmyi0tW5PfLunHqhPTsZnCupYJBFuZ-Jbtf2n8a6tTw-ifkjJSP0Jl9xZjPvourneD2wIccPTSS9OnQ8lCoEfwXdx5PUMAyF79pXp5OGLxlfMFCHRJhXL9OB8LUbKtGdflPbVJOaPR44LFClLjsIsJj3HAK5bjG6S4ifOkagzSLiDHvuKAChs4Icrb_Z9n1fzq15WcP6Wz5LeuIZovJgjy51PxVB9RNnnYWvh5dG6oPvE6ifkdOvvPR0yt5ESFdXKzOE8k6KBOebd9uOQQgY-xbiawRa0rOf-uUjPsLWGoUC-IpE8H5qdG_M5K9m5SXKS4zJxMHe3iWr9zpMlgOftQ02MavvqjuX6hEFAj2k26AZ1MufhyB_qYLLZrSRJPgHEEkpbrNYiqf8rd9oqKxtI-S9OCZRjAoB2amOPpATV-e2I5eBOI09LjHcr-SGPyuCw4_bz506f8_wj2igfDbwIAB7bzSdm_TTq5vjX-BKRR4FCiuqfddzBboEbhOU-CoaFNXXilRvZpYVcHJz3_AgJiWu01AQ2iK1w-sHJ4Jh7389P7woer6OB2UW9zotEGExCmDUmXxmCtyEXJqSzK-sXdULlckiS--6O8EtfP5HhwdEDGwGKltKUR3yHXWjcB7Cq97S-vSbwbzPapWIEdfXhkOsXjm4b-Cl-ZnfGEtgJGLtA2xq9H88H-9iAmObYmFYUmM-1gv_vEA3R4J6OCv1D16P7JMRF-SCcdZfhcBzi8WhjH5DbOkMsoMmxTCqgTROKcKnLVn61tIQ==|1715750760|aeku5o-L3BqStO6KvCfSza-fanAcKvccNQyUH1vrnTw='
'te', 'trailers'
'sec-gpc', '1'
'sec-fetch-site', 'same-origin'
'sec-fetch-mode', 'cors'
'sec-fetch-dest', 'empty'
'dnt', '1'
'referer', 'https://kubeflow.example.eu/pipeline/'
'accept-encoding', 'gzip, deflate, br'
'accept-language', 'en-US,en;q=0.5'
'accept', '*/*'
'user-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:125.0) Gecko/20100101 Firefox/125.0'
, dynamicMetadata:      thread=30
2024-05-15T05:27:10.579012Z     debug   envoy rbac external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:161        enforced denied, matched policy none    thread=30
[2024-05-15T05:27:10.578Z] "GET /pipeline/artifacts/get?source=s3&bucket=my-bucket-example-eu&key=artifacts%2Fexample-test%2Ftesting-out-metrics-and-their-visualizations-znzvh%2F2024%2F04%2F26%2Ftesting-out-metrics-and-their-visualizations-znzvh-3230888702%2Fmlpipeline-ui-metadata.tgz HTTP/1.1" 403 - rbac_access_denied_matched_policy[none] - "-" 0 19 1 - "100.96.1.236" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:125.0) Gecko/20100101 Firefox/125.0" "636c19b0-7af1-46b6-8605-adcf7b0f1a40" "ml-pipeline-ui-artifact.example-test:80" "-" inbound|3000|| - 100.96.1.248:3000 100.96.1.236:0 - default
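
Given the "ssl: none" and the missing x-forwarded-client-cert header in the output above, the hop arriving at the artifact sidecar appears to be plaintext, so Envoy cannot derive from.source.principals or from.source.namespaces, and principal-based ALLOW rules can never match. One way to surface (or enforce) the expected mTLS behavior is a standard Istio PeerAuthentication; a hedged sketch, not a deployKF-provided manifest:

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: require-mtls   ## hypothetical name
  namespace: example-test
spec:
  mtls:
    mode: STRICT   ## reject plaintext; principal-based policies then either work or fail loudly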

@thesuperzapper (Member)

@bobbeeke did you manage to figure this out?

If so, could you share how you did it?
