s3proxy for production use cases using NFS as backend #516

Open
rajivml opened this issue Apr 29, 2023 · 2 comments

Comments

rajivml commented Apr 29, 2023

Hi @gaul,

Should we anticipate any issues using S3Proxy with NFS as the backend in production for lightweight use cases (a few hundred GB)? I have burned my hands with rook-ceph on Kubernetes: it is quite complicated to operate, and it is very difficult to hide that complexity in production, especially when it runs on Kubernetes.

I have deployed S3Proxy with NFS as the backend and written a few hundred GB of data without facing any issues, but I would like an expert opinion on what kinds of issues or challenges one should be aware of when using this combination.

Thanks

gaul (Owner) commented May 19, 2023

Generally this will work but you may encounter two performance problems:

  • Large number of objects: when listing a given subdirectory, S3Proxy 2.0.0 enumerates everything in the bucket underneath it. The upcoming 2.1.0 release will fix this by including JCLOUDS-1371, but you can compile from source for now. With the fix, a listing will only enumerate the children of a subdirectory, not all its grandchildren, great-grandchildren, etc.
  • Writing large multi-part uploads: S3Proxy maintains a 1:1 correspondence between an object and a file. When uploading an MPU with many parts, it must join all these parts to create the final file. Thus S3Proxy will do 3x IOs: write all the parts, read everything back, and rewrite as one file. This can make the final CompleteMultipartUpload time out for some clients.
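
To illustrate the first point, here is a minimal sketch of the difference between enumerating a whole subtree and enumerating only the immediate children of a directory. This is not S3Proxy or jclouds code; the function names `list_recursive` and `list_children` are hypothetical, and the sketch works on a plain local directory tree only.

```python
import os


def list_recursive(root):
    """Walk the whole subtree: the cost grows with every descendant."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)


def list_children(root):
    """Look one level down only: the cost grows with the direct children.

    Subdirectories are reported with a trailing "/" instead of being
    descended into, loosely like common prefixes in a delimiter listing.
    """
    entries = []
    with os.scandir(root) as it:
        for entry in it:
            entries.append(entry.name + "/" if entry.is_dir() else entry.name)
    return sorted(entries)
```

On a directory with a few files at the top and millions of files in nested subdirectories, `list_children` touches only the top level, while `list_recursive` has to visit every file. On NFS each visited entry can mean extra round trips, so the gap tends to matter more there.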

rajivml (Author) commented May 21, 2023

Thanks a lot for your reply, @gaul.

Regarding #1, I can build it from source, not a problem.

Regarding #2, we don't have objects larger than 8 GB, and uploads of such objects are infrequent. Would it still be a problem? At least in my load test I haven't noticed any problems.
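
For a rough sense of scale, the 3x IO pattern described earlier in the thread can be put into numbers. The sketch below is a back-of-the-envelope estimate only. It assumes the read-back and the rewrite both happen during the final complete call, that they run sequentially, and that reads and writes share one sustained throughput figure; the function name `mpu_io_estimate` and the 100 MiB/s figure are hypothetical, not measurements.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def mpu_io_estimate(object_bytes, throughput_bytes_per_s):
    """Estimate IO volume for one multipart upload of object_bytes."""
    write_parts = object_bytes   # parts written as they arrive
    read_back = object_bytes     # parts read again when joining
    rewrite = object_bytes       # joined data written as one file
    # Assumption: only the join (read + rewrite) falls inside the
    # complete call, which is the step a client may time out on.
    complete_bytes = read_back + rewrite
    return {
        "total_io_bytes": write_parts + read_back + rewrite,
        "complete_io_bytes": complete_bytes,
        "complete_seconds": complete_bytes / throughput_bytes_per_s,
    }


if __name__ == "__main__":
    estimate = mpu_io_estimate(8 * GIB, 100 * MIB)
    print(estimate)
```

Under these assumptions, an 8 GiB object means 24 GiB of total IO, 16 GiB of it during the complete call. Whether that exceeds a client's timeout depends on the real throughput of the NFS server and on the client's settings, so a load test against the actual setup remains the better guide.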

One other issue that is stopping me from considering it for production is encryption at rest. I tried the transparent encryption feature but found it buggy: small uploads work fine, but uploads of files larger than 100 MB always fail, and if I disable transparent encryption everything works fine.

Another problem I noticed is that failed multipart uploads aren't cleaned up automatically, which risks filling up the NFS server. Of course this can be solved by writing a Kubernetes CronJob to clean them up, so it is not a P0, but transparent encryption with large file uploads/downloads is definitely a concern.
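
As a sketch of what such a cleanup job could run, the function below aborts multipart uploads older than a cutoff through the S3 API. It assumes a boto3-style client exposing `list_multipart_uploads` and `abort_multipart_upload`, and it assumes the S3Proxy filesystem backend honors both calls, which is not confirmed in this thread. The function name `abort_stale_uploads` and the one-day default are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def abort_stale_uploads(s3, bucket, max_age=timedelta(days=1), now=None):
    """Abort multipart uploads initiated more than max_age ago.

    s3 is expected to behave like a boto3 S3 client (assumption).
    Returns the (key, upload_id) pairs that were aborted.
    """
    now = now or datetime.now(timezone.utc)
    aborted = []
    kwargs = {"Bucket": bucket}
    while True:
        page = s3.list_multipart_uploads(**kwargs)
        for upload in page.get("Uploads", []):
            if now - upload["Initiated"] > max_age:
                s3.abort_multipart_upload(
                    Bucket=bucket,
                    Key=upload["Key"],
                    UploadId=upload["UploadId"],
                )
                aborted.append((upload["Key"], upload["UploadId"]))
        # Stop when the listing is complete or no continuation marker exists.
        if not page.get("IsTruncated") or "NextKeyMarker" not in page:
            break
        kwargs["KeyMarker"] = page["NextKeyMarker"]
        kwargs["UploadIdMarker"] = page.get("NextUploadIdMarker", "")
    return aborted
```

In a CronJob this would be called with a client created via `boto3.client("s3", endpoint_url=...)` pointing at the S3Proxy service. If the listing call turns out not to be supported by the backend, deleting stale part files directly on the NFS export would be the fallback.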
