
Unable to Write #738

Open
cazter opened this issue Feb 7, 2024 · 4 comments
Labels
bug Something isn't working

Comments


cazter commented Feb 7, 2024

Mountpoint for Amazon S3 version

v1.2.0-eksbuild.1

AWS Region

us-east-1

Describe the running environment

Unable to write. Currently testing with an AWS IAM role that has all S3 action permissions on the bucket used by the EKS Mountpoint S3 add-on.

/datas3_us/live/pg-manager/pg_wal/spilo/****-*****10140$ tar -czvf archive_name.tar.gz 13edd11c-7e37-4b11-b54d-c8308013957d/
13edd11c-7e37-4b11-b54d-c8308013957d/
13edd11c-7e37-4b11-b54d-c8308013957d/wal/
13edd11c-7e37-4b11-b54d-c8308013957d/wal/11/
13edd11c-7e37-4b11-b54d-c8308013957d/wal/11/basebackups_005/
13edd11c-7e37-4b11-b54d-c8308013957d/wal/11/basebackups_005/base_00000001000000000000000D_00000040/
13edd11c-7e37-4b11-b54d-c8308013957d/wal/11/basebackups_005/base_00000001000000000000000D_00000040/extended_version.txt
13edd11c-7e37-4b11-b54d-c8308013957d/wal/11/basebackups_005/base_00000001000000000000000D_00000040/tar_partitions/
13edd11c-7e37-4b11-b54d-c8308013957d/wal/11/basebackups_005/base_00000001000000000000000D_00000040/tar_partitions/part_00000000.tar.lzo

gzip: stdout: Input/output error
tar: archive_name.tar.gz: Cannot write: Broken pipe
tar: Child returned status 1
tar: Error is not recoverable: exiting now

Mountpoint options

apiVersion: v1
kind: PersistentVolume
metadata:
  name: s3-eks-us-pv
  namespace: ****
spec:
  capacity:
    storage: 1200Gi
  accessModes:
    - ReadWriteMany
  mountOptions:
    - allow-delete
    - allow-other
    - region us-east-1
    - uid=1000
    - gid=1000
  csi:
    driver: s3.csi.aws.com
    volumeHandle: eks-logging-****-volume
    volumeAttributes:
      bucketName: eks-logging-****
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: s3-eks-us-pvc
  namespace: ****
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1200Gi
  volumeName: s3-eks-us-pv
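
For reference, one way to confirm that these mount options actually reach the mount-s3 process is to inspect the mount and the process arguments on the node; a quick sketch, assuming shell access to the EKS node hosting the pod:

# Check the FUSE mount created by the CSI driver, and the flags mount-s3 was started with.
mount | grep s3
ps -ef | grep [m]ount-s3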

What happened?

Tried to zip a large directory recursively--entirely within the S3 bucket. The write fails with the errors shown above.

Relevant log output

No response

cazter added the bug label Feb 7, 2024
sauraank (Contributor) commented Feb 8, 2024

Hey! I tried to zip a directory recursively using the same command as yours and was able to do it successfully on a mounted bucket on my instance.
Can you share the logs (with the --debug flag enabled) from when your zip fails? You can follow our Logging page to learn about the options.
Also, what is the size of the directory that you are trying to zip?
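
For context, with the CSI driver the Mountpoint flags are passed through the PV's mountOptions, so enabling debug logging would look roughly like the fragment below; a sketch based on the PV shown earlier (debug and debug-crt are documented mount-s3 options):

  mountOptions:
    - allow-delete
    - allow-other
    - region us-east-1
    - uid=1000
    - gid=1000
    - debug        # verbose Mountpoint/FUSE logging
    - debug-crt    # (optional) verbose logs from the AWS CRT S3 client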

cazter (Author) commented Feb 8, 2024

> Hey! I tried to zip a directory recursively using the same command as yours and I was able to do it successfully on a mounted bucket on my instance. Can you share the logs (with --debug flag enabled) when your zip fails? You can follow our Logging page to know about the options. Also, what is the size of the directory that you are trying to zip?

> I have tried creating several tarballs in different configurations, including one similar to the one you have here, and am not able to reproduce this. My first thought was that tar is doing some filesystem operations that Mountpoint does not support (this doc has more details), but given I can't reproduce it, there might be something else going on.
>
> Some more information that would be helpful here:
>
>   • Relevant logs from the driver container (kubectl logs -l app=s3-csi-node --namespace kube-system)
>   • Relevant logs from Mountpoint (from the underlying host's syslog: journalctl -e SYSLOG_IDENTIFIER=mount-s3; Mountpoint has some additional documentation here)
>   • What OS and major version are your nodes running?
>   • Do all subsequent reads and writes fail after you see this error?

I revised the PV yaml to include - debug.

Reads work and continue to work. Writes of any kind don't work--in fact, they may never have worked. I originally tested with touch test, which appears to succeed, but in fact no file is created.
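
One quick way to confirm whether a write actually lands in the bucket is to compare the view through the mount with the bucket listing itself; a sketch using the mount path from the tar output above, with the masked bucket name left as a placeholder:

# Inside the pod, through the Mountpoint volume:
touch /datas3_us/test-write && ls -l /datas3_us/test-write
# Independently, with credentials for the bucket (bucket name is a placeholder):
aws s3 ls s3://<bucket-name>/test-write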

cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"
kubectl logs -l app=s3-csi-node --namespace kube-system -c s3-plugin
I0202 02:13:09.516061       1 node.go:204] NodeGetInfo: called with args
I0202 02:12:38.716005       1 driver.go:60] Driver version: 1.2.0, Git commit: 8a832dc5e2fcaa01c02bece33c09517b5364687a, build date: 2024-01-17T16:52:48Z, nodeID: ip-10-0-1-10.ec2.internal, mount-s3 version: 1.3.2
I0202 02:12:38.719185       1 mount_linux.go:285] 'umount /tmp/kubelet-detect-safe-umount733122656' failed with: exit status 32, output: umount: /tmp/kubelet-detect-safe-umount733122656: must be superuser to unmount.
I0202 02:12:38.719234       1 mount_linux.go:287] Detected umount with unsafe 'not mounted' behavior
I0202 02:12:38.719315       1 driver.go:80] Found AWS_WEB_IDENTITY_TOKEN_FILE, syncing token
I0202 02:12:38.719601       1 driver.go:110] Listening for connections on address: &net.UnixAddr{Name:"/csi/csi.sock", Net:"unix"}
kubectl logs -l app=s3-csi-node --namespace kube-system -c node-driver-registrar
I0202 02:11:48.769131       1 driver.go:60] Driver version: 1.2.0, Git commit: 8a832dc5e2fcaa01c02bece33c09517b5364687a, build date: 2024-01-17T16:52:48Z, nodeID: ip-10-0-21-67.ec2.internal, mount-s3 version: 1.3.2
I0202 02:11:48.772261       1 mount_linux.go:285] 'umount /tmp/kubelet-detect-safe-umount386038248' failed with: exit status 32, output: umount: /tmp/kubelet-detect-safe-umount386038248: must be superuser to unmount.
I0202 02:11:48.772276       1 mount_linux.go:287] Detected umount with unsafe 'not mounted' behavior
I0202 02:11:48.772323       1 driver.go:80] Found AWS_WEB_IDENTITY_TOKEN_FILE, syncing token
I0202 02:11:48.772527       1 driver.go:110] Listening for connections on address: &net.UnixAddr{Name:"/csi/csi.sock", Net:"unix"}
I0202 02:11:49.544914       1 node.go:204] NodeGetInfo: called with args
journalctl -e SYSLOG_IDENTIFIER=mount-s3
Feb 08 18:14:40 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [DEBUG] log: FUSE(132) ino 0x000000000000000a RELEASEDIR fh FileHandle(4), flags 0x28800, flush false, lock owner None
Feb 08 18:14:40 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [DEBUG] log: FUSE(138) ino 0x0000000000000007 RELEASEDIR fh FileHandle(1), flags 0x28800, flush false, lock owner None
Feb 08 18:14:40 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [DEBUG] log: FUSE(136) ino 0x0000000000000008 RELEASEDIR fh FileHandle(2), flags 0x28800, flush false, lock owner None
Feb 08 18:14:40 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [DEBUG] mountpoint_s3::fuse::session: starting fuse worker 4 (thread id 3174561)
Feb 08 18:14:40 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [DEBUG] read{req=112 ino=29 fh=9 offset=507904 size=131072 name=part_00000000.tar.lzo}:prefetch{range=1179648..8388608 out of 19317583}:get_object{id=54 bucket=eks-************ key=live/pg-manager/pg_wal/spilo/****-************/13edd11
Feb 08 18:14:40 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [DEBUG] read{req=112 ino=29 fh=9 offset=507904 size=131072 name=part_00000000.tar.lzo}:prefetch{range=1179648..8388608 out of 19317583}:get_object{id=54 bucket=eks-************ key=live/pg-manager/pg_wal/spilo/****-************/13edd11
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.io_size[type=read]: n=8: min=131 p10=131 p50=66047 avg=80200.38 p90=132095 p99=132095 p99.9=132095 max=132095
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_failures[op=create]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_failures[op=flush]: 2 (n=2)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_failures[op=getxattr]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_failures[op=ioctl]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_failures[op=lookup]: 2 (n=2)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_failures[op=write]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=create]: n=1: min=20 p10=20 p50=20 avg=20.00 p90=20 p99=20 p99.9=20 max=20
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=flush]: n=5: min=10 p10=10 p50=14 avg=21.40 p90=36 p99=36 p99.9=36 max=36
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=getattr]: n=2: min=16 p10=16 p50=16 avg=30280.00 p90=60671 p99=60671 p99.9=60671 max=60671
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=getxattr]: n=1: min=19 p10=19 p50=19 avg=19.00 p90=19 p99=19 p99.9=19 max=19
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=ioctl]: n=1: min=25 p10=25 p50=25 avg=25.00 p90=25 p99=25 p99.9=25 max=25
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=lookup]: n=8: min=32384 p10=32511 p50=46847 avg=53784.00 p90=102911 p99=102911 p99.9=102911 max=102911
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=mknod]: n=1: min=86016 p10=86527 p50=86527 avg=86272.00 p90=86527 p99=86527 p99.9=86527 max=86527
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=open]: n=3: min=51200 p10=51455 p50=67071 avg=64554.67 p90=75775 p99=75775 p99.9=75775 max=75775
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=opendir]: n=6: min=9 p10=9 p50=14 avg=61.00 p90=301 p99=301 p99.9=301 max=301
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=read]: n=8: min=41 p10=41 p50=169 avg=68364.88 p90=344063 p99=344063 p99.9=344063 max=344063
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=readdirplus]: n=12: min=12 p10=15 p50=18 avg=12546.50 p90=21631 p99=53247 p99.9=53247 max=53247
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=release]: n=3: min=14 p10=14 p50=20 avg=60.00 p90=146 p99=146 p99.9=146 max=146
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=releasedir]: n=6: min=4 p10=4 p50=6 avg=8.50 p90=16 p99=16 p99.9=16 max=16
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_latency_us[op=write]: n=1: min=71 p10=71 p50=71 avg=71.00 p90=71 p99=71 p99.9=71 max=71
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_unimplemented[op=create]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_unimplemented[op=getxattr]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.op_unimplemented[op=ioctl]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.readdirplus.entries: 34 (n=12)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: fuse.total_bytes[type=read]: 639107 (n=8)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: prefetch.contiguous_read_len: n=2: min=131 p10=131 p50=131 avg=320577.50 p90=643071 p99=643071 p99.9=643071 max=643071
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: prefetch.part_queue_starved_us: n=2: min=20224 p10=20351 p50=20351 avg=101536.00 p90=183295 p99=183295 p99.9=183295 max=183295
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.num_auto_default_network_io: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.num_auto_ranged_copy_network_io: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.num_auto_ranged_get_network_io: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.num_auto_ranged_put_network_io: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.num_requests_being_prepared: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.num_requests_being_processed: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.num_requests_stream_queued_waiting: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.num_requests_streaming_response: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.num_total_network_io: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.client.request_queue_size: 0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.failures[op=head_object,status=404]: 11 (n=11)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.failures[op=put_object,status=400]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.first_byte_latency_us[op=get_object]: n=3: min=17920 p10=18047 p50=181247 avg=178538.67 p90=337919 p99=337919 p99.9=337919 max=337919
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.first_byte_latency_us[op=head_object]: n=13: min=7808 p10=8191 p50=12223 avg=24598.15 p90=59135 p99=101887 p99.9=101887 max=101887
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.first_byte_latency_us[op=list_objects]: n=19: min=14400 p10=15743 p50=42751 avg=39814.74 p90=63231 p99=83455 p99.9=83455 max=83455
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.first_byte_latency_us[op=put_object]: n=1: min=56064 p10=56319 p50=56319 avg=56192.00 p90=56319 p99=56319 p99.9=56319 max=56319
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.throughput_mibs[op=get_object,size=1-16MiB]: n=2: min=6 p10=6 p50=6 avg=13.00 p90=20 p99=20 p99.9=20 max=20
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.throughput_mibs[op=get_object,size=<1MiB]: n=1: min=0 p10=0 p50=0 avg=0.00 p90=0 p99=0 p99.9=0 max=0
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.total_latency_us[op=get_object]: n=3: min=18048 p10=18175 p50=181247 avg=179946.67 p90=342015 p99=342015 p99.9=342015 max=342015
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.total_latency_us[op=head_object]: n=13: min=7808 p10=8191 p50=12223 avg=24598.15 p90=59135 p99=101887 p99.9=101887 max=101887
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.total_latency_us[op=list_objects]: n=19: min=14464 p10=15807 p50=43007 avg=39902.32 p90=63231 p99=83455 p99.9=83455 max=83455
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests.total_latency_us[op=put_object]: n=1: min=56064 p10=56319 p50=56319 avg=56192.00 p90=56319 p99=56319 p99.9=56319 max=56319
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests[op=get_object]: 3 (n=3)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests[op=head_object]: 13 (n=13)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests[op=list_objects]: 19 (n=19)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.meta_requests[op=put_object]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.failures[op=head_object,type=Default,status=404]: 11 (n=11)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.failures[op=put_object,type=CreateMultipartUpload,status=400]: 1
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.first_byte_latency_us[op=get_object,type=Default]: n=3: min=17792 p10=17919 p50=106495 avg=92736.00 p90=154623 p99=154623 p99.9=154623 max=154623
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.first_byte_latency_us[op=head_object,type=Default]: n=13: min=7488 p10=7871 p50=11967 avg=23357.54 p90=53759 p99=97791 p99.9=97791 max=97791
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.first_byte_latency_us[op=list_objects,type=Default]: n=19: min=13248 p10=14591 p50=37631 avg=35516.63 p90=56575 p99=79359 p99.9=79359 max=79359
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.first_byte_latency_us[op=put_object,type=CreateMultipartUpload]: n=1: min=55808 p10=56063 p50=56063 avg=55936.00 p90=56063 p99=56063 p99.9=56063 max=56063
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.total_latency_us[op=get_object,type=Default]: n=3: min=17920 p10=18047 p50=181247 avg=179904.00 p90=342015 p99=342015 p99.9=342015 max=342015
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.total_latency_us[op=head_object,type=Default]: n=13: min=7680 p10=8063 p50=12095 avg=24450.46 p90=58879 p99=101375 p99.9=101375 max=101375
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.total_latency_us[op=list_objects,type=Default]: n=19: min=14400 p10=15743 p50=42751 avg=39760.84 p90=63231 p99=82943 p99.9=82943 max=82943
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests.total_latency_us[op=put_object,type=CreateMultipartUpload]: n=1: min=56064 p10=56319 p50=56319 avg=56192.00 p90=56319 p99=56319 p99.9=56319 max=56319
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests[op=get_object,type=Default]: 3 (n=3)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests[op=head_object,type=Default]: 13 (n=13)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests[op=list_objects,type=Default]: 19 (n=19)
Feb 08 18:14:42 ip-10-0-52-212.ec2.internal mount-s3[3170859]: [INFO] mountpoint_s3::metrics: s3.requests[op=put_object,type=CreateMultipartUpload]: 1
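
The metric lines above include one failed PutObject of type CreateMultipartUpload with status 400. To check whether the bucket accepts multipart uploads at all, independent of the mount, something like the following could be run with the same role's credentials; a sketch with placeholder bucket and key names:

# Start, and then abort, a multipart upload directly against the bucket.
aws s3api create-multipart-upload --bucket <bucket-name> --key mpu-test
# If this succeeds, clean up using the UploadId it returns:
aws s3api abort-multipart-upload --bucket <bucket-name> --key mpu-test --upload-id <UploadId>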

cazter (Author) commented Feb 8, 2024

From the role used by the add-on driver within EKS, here is the JSON for the policy applied to the role (full permissions while troubleshooting).

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "MountpointFullBucketAccess",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::*",
                "arn:aws:s3:::*/*"
            ]
        },
        {
            "Sid": "MountpointFullObjectAccess",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::*",
                "arn:aws:s3:::*/*"
            ]
        }
    ]
}
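
For comparison, the Mountpoint documentation recommends a narrower policy scoped to the bucket, roughly of the following shape; this is a sketch of the documented policy with the bucket ARN as a placeholder, not a diagnosis of the failure above:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "MountpointFullBucketAccess",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": ["arn:aws:s3:::<bucket-name>"]
        },
        {
            "Sid": "MountpointFullObjectAccess",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:DeleteObject"
            ],
            "Resource": ["arn:aws:s3:::<bucket-name>/*"]
        }
    ]
}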

Trust policy, with some masking added.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::*********:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/81E864EC5D407DE3A80E**************"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "oidc.eks.us-east-1.amazonaws.com/id/81E864EC5D407DE3A80E**************:aud": "sts.amazonaws.com",
                    "oidc.eks.us-east-1.amazonaws.com/id/81E864EC5D407DE3A80E**************:sub": "system:serviceaccount:kube-system:s3-csi-*"
                }
            }
        }
    ]
}
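
To confirm the IRSA wiring matches this trust policy, the service account annotation can be checked on the cluster; a sketch, assuming the add-on's default service account name s3-csi-driver-sa in kube-system:

# The eks.amazonaws.com/role-arn annotation should point at the role carrying the policy above.
# Service account name is assumed here; adjust if your install customizes it.
kubectl get serviceaccount s3-csi-driver-sa -n kube-system -o yaml | grep role-arn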

dannycjones (Contributor) commented Feb 12, 2024

Thanks for collecting this information, Brent!

I can see a bit of information from those logs - notably that a write and two flush operations did fail.

To gather a bit more information, I'll follow up on the CSI Driver issue (awslabs/mountpoint-s3-csi-driver#142 (comment)) for now, since getting these logs out is useful information for that repo.
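
To pair the op_failures metrics with their underlying error messages, the warning and error lines can be pulled from the host log; a sketch:

# On the node: surface Mountpoint warnings/errors around the time of the failed write.
journalctl SYSLOG_IDENTIFIER=mount-s3 | grep -E 'WARN|ERROR'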
