How to deal with file locks properly in a distributed environment #190

Open
bennahugo opened this issue Mar 24, 2022 · 0 comments

bennahugo commented Mar 24, 2022

Just jotting down ideas for discussion for now:

From what I can tell from browsing the code, the following issues exist:

  • In the distributed case, casacore::tables relies on OS-supported flocks (sysctl), which are not guaranteed to be safe on storage shared between machines (there is no awareness of node IP or other identifying criteria). Looking at https://github.com/ratt-ru/dask-ms/blob/master/daskms/table_executor.py#L38-L54: if multiple dask processes, each backed by a thread pool and queue pointing to a database on a shared filesystem, are started on multiple nodes, there is no guarantee that the flocks will hold.
  • One possibility (since dask is being used here) is to implement something like https://github.com/pydata/xarray/blob/main/xarray/backends/locks.py, with spinning locks that block until the lock becomes available, as a wrapper around the entire table system to make it distribution-safe.
  • The same "user-style" read and write locking will need to be applied for xarray-backed datasets as far as I can tell via context management, although I'm not sure how finely the arrays are "bucketed" in the array specification for this to work.
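
To make the spinning-lock wrapper idea concrete, here is a rough sketch. It is not taken from dask-ms or xarray: `spin_acquire`, `LockedTable`, `FakeTable` and `hammer` are hypothetical names, and a `threading.Lock` stands in for a cluster-wide lock so the example runs in a single process. On a real cluster the lock would have to be something shared between nodes (for example a `dask.distributed.Lock`, assumed here to expose compatible `acquire`/`release` methods); that swap is untested.

```python
import threading
import time
from contextlib import contextmanager


@contextmanager
def spin_acquire(lock, poll_interval=0.001, timeout=None):
    """Poll `lock` without blocking until it is acquired; release on exit."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while not lock.acquire(blocking=False):
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("could not acquire table lock")
        time.sleep(poll_interval)  # back off before trying again
    try:
        yield
    finally:
        lock.release()


class FakeTable:
    """Stand-in for a table; putrow() is deliberately not atomic."""

    def __init__(self):
        self.rows = 0

    def putrow(self):
        current = self.rows
        time.sleep(0.001)  # widen the race window
        self.rows = current + 1


class LockedTable:
    """Hypothetical wrapper that serialises every call on the wrapped table."""

    def __init__(self, table, lock):
        self._table = table
        self._lock = lock

    def call(self, method, *args, **kwargs):
        with spin_acquire(self._lock):
            return getattr(self._table, method)(*args, **kwargs)


def hammer(locked, n_threads=4, n_calls=5):
    """Call putrow() concurrently from several threads through the wrapper."""
    def work():
        for _ in range(n_calls):
            locked.call("putrow")

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Wrapping every call is the coarsest option: it serialises all table access, reads included. Separate read and write locks would allow concurrent readers, at the cost of more bookkeeping.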
@bennahugo bennahugo changed the title How to deal with locks propper in a distributed environment How to deal with file locks propper in a distributed environment Mar 24, 2022
@JSKenyon JSKenyon changed the title How to deal with file locks propper in a distributed environment How to deal with file locks properly in a distributed environment Mar 30, 2022