s3fs fuse hangs EC2 node very frequently #2422

Open
uareddy opened this issue Feb 25, 2024 · 1 comment


uareddy commented Feb 25, 2024

Hi,
I am using s3fs to mount an S3 bucket on an AWS EC2 instance, which we use as an SFTP server.
We receive 400+ files daily, with a maximum of around 100 concurrent files.
The files range from 10 GB to 300 GB and are all in binary format.
I am using s3fs version 1.93.

The mount command we are using:

sudo s3fs $sftp_bucket_name:/incoming/ /var/sftp/incoming/ -o allow_other -o curldbg -o max_background=1000 -o max_stat_cache_size=100000 -o stat_cache_expire=900 -o multipart_size=512 -o parallel_count=30 -o multireq_max=30 -o dbglevel=info -o complement_stat -o compat_dir -o readwrite_timeout=900 -o connect_timeout=900 -o stat_cache_interval_expire=600 -o ensure_diskfree=512 -o nonempty -o endpoint=eu-west-2 -o use_sse=kmsid:$kms_key -o iam_role=$sftp_iam_role -o umask=0007,uid=$user_id

We increased the size of /tmp to 100 GB.
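
For a sense of scale, the figures quoted above can be put side by side. The following is a rough back-of-the-envelope sketch in Python, not a description of how s3fs actually stages data: it assumes a worst case in which every concurrent upload is held in full on local disk, which may not match s3fs behaviour. The function name is hypothetical, and the sizes are treated as GiB and MiB for simplicity.

```python
import math

GIB = 1024 ** 3
MIB = 1024 ** 2


def upload_footprint(file_size_gib, concurrent_files, multipart_size_mib):
    """Return (parts_per_file, worst_case_staging_gib) for a workload.

    Assumption (not confirmed for s3fs): each concurrent file is staged
    in full on local disk before or while it is uploaded.
    """
    # Number of multipart parts needed for one file of the given size.
    parts = math.ceil(file_size_gib * GIB / (multipart_size_mib * MIB))
    # Local space needed if every concurrent file were staged in full.
    staging_gib = file_size_gib * concurrent_files
    return parts, staging_gib


# Largest file size, peak concurrency and multipart_size from this report.
parts, staging = upload_footprint(300, 100, 512)
print(f"{parts} parts per file, up to {staging} GiB staged locally")
```

Under that worst-case assumption, 100 concurrent 300 GB files would need far more local space than a 100 GB /tmp, so free space on /tmp around the time of a hang may be worth recording.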

The same configuration works perfectly for some days, and then one day it suddenly stops and the EC2 node also hangs.
The log we observed at that time is:

Feb 24 05:37:46 ip-100-96-131-182 kernel: Call Trace:
Feb 24 05:37:46 ip-100-96-131-182 kernel: __schedule+0x28e/0x890
Feb 24 05:37:46 ip-100-96-131-182 kernel: schedule+0x28/0x80
Feb 24 05:37:46 ip-100-96-131-182 kernel: request_wait_answer+0x125/0x1f0 [fuse]
Feb 24 05:37:46 ip-100-96-131-182 kernel: ? finish_wait+0x80/0x80
Feb 24 05:37:46 ip-100-96-131-182 kernel: __fuse_request_send+0x7f/0x90 [fuse]
Feb 24 05:37:46 ip-100-96-131-182 kernel: fuse_simple_request+0xbd/0x190 [fuse]
Feb 24 05:37:46 ip-100-96-131-182 kernel: fuse_do_getattr+0x106/0x310 [fuse]
Feb 24 05:37:46 ip-100-96-131-182 kernel: vfs_statx+0x89/0xe0
Feb 24 05:37:46 ip-100-96-131-182 kernel: SYSC_newlstat+0x39/0x70
Feb 24 05:37:46 ip-100-96-131-182 kernel: do_syscall_64+0x67/0x110
Feb 24 05:37:46 ip-100-96-131-182 kernel: entry_SYSCALL_64_after_hwframe+0x5e/0xc3
Feb 24 05:37:46 ip-100-96-131-182 kernel: RIP: 0033:0x7f4ae0dcee05
Feb 24 05:37:46 ip-100-96-131-182 kernel: RSP: 002b:00007f4ab27f80f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000006
Feb 24 05:37:46 ip-100-96-131-182 kernel: RAX: ffffffffffffffda RBX: 00007f4ab27f9270 RCX: 00007f4ae0dcee05
Feb 24 05:37:46 ip-100-96-131-182 kernel: RDX: 00007f4ab27f8130 RSI: 00007f4ab27f8130 RDI: 00007f4ab27f8270
Feb 24 05:37:46 ip-100-96-131-182 kernel: RBP: 00007f4ab27f8200 R08: 00007f4a98096690 R09: 00007f4ab27f7f80
Feb 24 05:37:46 ip-100-96-131-182 kernel: R10: 00007f4ab27f80b0 R11: 0000000000000246 R12: 00007f4ab27f8270
Feb 24 05:37:46 ip-100-96-131-182 kernel: R13: 00007f4a9809669a R14: 00007f4a980966a2 R15: 00007f4ab27f8282
Feb 24 05:39:46 ip-100-96-131-182 kernel: INFO: task oneagentos:5895 blocked for more than 120 seconds.
Feb 24 05:39:46 ip-100-96-131-182 kernel: Not tainted 4.14.336-256.559.amzn2.x86_64 #1
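
The call trace above shows a task blocked in an lstat call inside fuse_do_getattr, waiting in request_wait_answer for a reply from the FUSE daemon. A monitoring script that calls stat on the mount directly would block in the same way, so one option is to run the probe in a child process with a timeout. The sketch below is only an illustration under that assumption; the function name and the 10 second default are hypothetical.

```python
import subprocess
import sys


def mount_responds(path, timeout_s=10.0):
    """Return True if an lstat of `path` completes within `timeout_s` seconds.

    Returns False if the probe times out, or if the lstat itself fails
    (for example because the path does not exist).
    """
    # Run the lstat in a separate interpreter so that a hung FUSE request
    # blocks the child process rather than this monitor.
    probe = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.lstat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return probe.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a child stuck in an uninterruptible FUSE wait
        # may not die on kill, so do not wait for it again here.
        probe.kill()
        return False
```

A cron job or small service could call `mount_responds("/var/sftp/incoming")` and, when it returns False, record disk usage and the state of the s3fs process before the node becomes unresponsive.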

We have another, empty mount at /var/sftp/home where we have not placed any objects.

Is there anything you would suggest changing in the mount parameters?
Please let me know if you need any further information.

Member

ggtakec commented Apr 14, 2024

@uareddy I'm sorry for my late reply.

Is this problem still occurring?
If it hangs, you may not be able to get the s3fs log at that time.
(If we can see it, it will help to solve this problem.)

It is difficult to determine the cause, but is it possible that the disk holding the cache directory (use_cache option) is full?
(If you can try it, you may be able to avoid this by periodically deleting cache files in a separate process.)
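
As a minimal sketch of such a separate cleanup process, the Python function below deletes regular files under a cache directory that have not been modified for a given time. It is not part of s3fs. The function name and the age threshold are hypothetical, and the mount command in this issue does not set use_cache, so the directory to clean is an assumption. s3fs may still be using a file that looks old, so a conservative threshold is advisable.

```python
import os
import stat
import time


def prune_cache(cache_dir, max_age_s, now=None):
    """Delete regular files under `cache_dir` whose last modification is at
    least `max_age_s` seconds old. Return the number of bytes freed."""
    now = time.time() if now is None else now
    freed = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.lstat(path)
                # Skip symlinks, sockets and anything else that is not a plain file.
                if not stat.S_ISREG(st.st_mode):
                    continue
                if now - st.st_mtime >= max_age_s:
                    os.remove(path)
                    freed += st.st_size
            except FileNotFoundError:
                # The file disappeared between listing and removal; ignore it.
                continue
    return freed
```

Run from cron or a systemd timer, a call such as `prune_cache("/path/to/cache", 6 * 3600)` would remove files untouched for six hours; the path here is a placeholder, not a value from this report.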

Also, s3fs v1.94 has been released. Could you try it?
