read-write Dataset gives Socket not connected errors after a cp -r #322

Open
starpit opened this issue Nov 30, 2023 · 4 comments

starpit commented Nov 30, 2023

I am seeing issues when I try to mount an IBM Cloud COS bucket read-write. I see this error from pods that try to mount read-write Datasets:

Socket not connected

Is it possible that csi-s3 only supports ReadWriteOnce? I see that datashim sets the PVC to ReadWriteMany, which csi-s3 does not seem to support.

starpit commented Dec 1, 2023

Update

Now I'm less certain as to what is going on. (Also updated the issue title.)

It turns out that the mount is fine... until my application does a cp -r src dst where both src and dst are on the mount in question. After that, subsequent reads fail with Socket not connected.
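
A minimal Python sketch of the sequence described above, for anyone trying to reproduce it. It assumes the error surfaces as errno ENOTCONN, which is what a FUSE-backed mount typically returns once its userspace process is gone; that mapping is an assumption, not something confirmed in this thread. The function name `copy_then_read` and the directory names are hypothetical, and `mount_root` should point at the directory where the Dataset is mounted.

```python
import errno
import os
import shutil


def _raise(exc: OSError) -> None:
    # os.walk silently skips unreadable directories unless told otherwise.
    raise exc


def copy_then_read(mount_root: str) -> str:
    """Copy src to dst inside mount_root, then try to read dst back.

    Returns "ok" if the reads succeed and "disconnected" if they fail
    with ENOTCONN. Any other OSError is re-raised unchanged.
    """
    src = os.path.join(mount_root, "src")  # hypothetical directory names
    dst = os.path.join(mount_root, "dst")

    # Build a tiny source tree: src/a.txt and src/sub/b.txt.
    os.makedirs(os.path.join(src, "sub"), exist_ok=True)
    with open(os.path.join(src, "a.txt"), "w") as f:
        f.write("hello\n")
    with open(os.path.join(src, "sub", "b.txt"), "w") as f:
        f.write("world\n")

    # Rough equivalent of `cp -r src dst`, with both paths on the same mount.
    # Errors raised during the copy itself propagate to the caller.
    shutil.copytree(src, dst, dirs_exist_ok=True)

    try:
        # The report says it is the reads *after* the copy that fail.
        for dirpath, _dirnames, filenames in os.walk(dst, onerror=_raise):
            for name in filenames:
                with open(os.path.join(dirpath, name), "rb") as f:
                    f.read()
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            return "disconnected"
        raise
    return "ok"
```

Running `copy_then_read("/path/to/mounted/dataset")` inside the affected pod, and again against a local directory, would show whether the failure is specific to the mount.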

starpit changed the title from "read-write Dataset gives Socket not connected errors (csi-s3 versus ReadWriteMany)" to "read-write Dataset gives Socket not connected errors after a cp -r" on Dec 1, 2023

starpit commented Dec 1, 2023

Update 2

Yes, if I update the code to avoid the s3-to-s3 copy, it works fine. How do we get to the root cause of this bug? I spent hours scouring the csi, kubelet, etc. logs and found nothing: not a single sign of why the mount disconnects after an s3-to-s3 copy.

srikumar003 (Collaborator) commented

@starpit we are looking into this issue, but RWX has not (yet) caused an issue for csi-s3. We are reproducing the s3-to-s3 copy through the mount points and will be back with at least an understanding of what's happening.

srikumar003 (Collaborator) commented

@starpit I was unable to replicate the issue on either kind or an OpenShift cluster, with small files (tens of KB) or comparatively larger files (approximately 6 MB). I can try again when I have more information about the context in which you got these errors.
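
One way to gather that context is to repeat the copy across a range of file sizes and record the outcome for each. The sketch below is an assumption about how such a sweep could look, not the test that was actually run; `sweep_copy`, the default sizes (roughly 10 KB and 6 MB, mirroring the sizes mentioned above), and the `sweep-src-*`/`sweep-dst-*` directory names are all hypothetical.

```python
import errno
import hashlib
import os
import shutil


def _digest(path: str) -> str:
    """SHA-256 of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def sweep_copy(mount_root, sizes=(10 * 1024, 6 * 1024 * 1024), files_per_size=3):
    """For each size, write random files under mount_root, copy the
    directory within the same mount, and compare source and copy.

    Returns {size: outcome}, where outcome is "ok", "mismatch", or
    "error: <errno name>" if any step raised OSError.
    """
    results = {}
    for size in sizes:
        src = os.path.join(mount_root, f"sweep-src-{size}")
        dst = os.path.join(mount_root, f"sweep-dst-{size}")
        try:
            os.makedirs(src, exist_ok=True)
            for i in range(files_per_size):
                with open(os.path.join(src, f"f{i}.bin"), "wb") as f:
                    f.write(os.urandom(size))
            # Both src and dst live on the same mount, as in the report.
            shutil.copytree(src, dst, dirs_exist_ok=True)
            same = all(
                _digest(os.path.join(src, name)) == _digest(os.path.join(dst, name))
                for name in sorted(os.listdir(src))
            )
            results[size] = "ok" if same else "mismatch"
        except OSError as exc:
            # Record the symbolic errno (e.g. ENOTCONN) when one is available.
            results[size] = f"error: {errno.errorcode.get(exc.errno, exc.errno)}"
    return results
```

A size at which the result flips from "ok" to an error would be a useful data point to report back here, along with the file count.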

srikumar003 self-assigned this on Dec 3, 2023