Hello! Snakemake is quickly becoming one of my favourite tools, and I was pleasantly surprised to find out there was support for iRODS. However, I've observed two issues that I have debugged for myself, and I wanted to share them here for the benefit of other users. I will describe the two issues separately, but they are related.
Snakemake version
7.3.0 (as far as I can tell, these issues still exist on the main branch)
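Issue 1. glob_wildcards requires anti-pattern to work properly
Describe the bug
https://snakemake.readthedocs.io/en/stable/snakefiles/remote_files.html#irods
The example provided in the documentation has the following code: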
from snakemake.remote.iRODS import RemoteProvider
irods = RemoteProvider(irods_env_file='setup-data/irods_environment.json',
                       timezone="Europe/Berlin")  # all parameters are optional
# please note the comma after the variable name!
# access: irods.remote(expand('home/rods/{f}', f=files))
files, = irods.glob_wildcards('home/rods/{files}')
The last line throws irods.exception.CollectionDoesNotExist. This can be traced to the fact that the iRODS "zone" is not included in the path when running glob_wildcards. When I manually add the zone as a prefix to the path, it runs fine (i.e. files, = irods.glob_wildcards('/zone/home/rods/{files}')). However, this does not seem to be the intended usage, since the documentation states:
Please note that the zone folder is not included in the path as it will be taken from the configuration file. The path also must not start with a /.
Proposed solution
The way to fix this is to include the zone as a prefix, as the documentation suggests should happen. The glob_wildcards function is defined in snakemake/remote/__init__.py. The RemoteProvider class in snakemake/remote/iRODS.py can override this function so that it prepends the zone to the pattern before calling the parent implementation. This has resolved the issue for me.
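To illustrate the idea of the override, here is a minimal sketch that does not import Snakemake at all. BaseProvider and ZoneAwareProvider are hypothetical stand-ins for the generic provider and the iRODS RemoteProvider, and the zone attribute is an assumption about where the zone name would come from (in practice, the iRODS environment file). It only shows the pattern of prepending the zone and delegating to the parent; it is not the actual Snakemake implementation.

```python
import posixpath


class BaseProvider:
    """Hypothetical stand-in for the generic remote provider base class."""

    def glob_wildcards(self, pattern, *args, **kwargs):
        # The real method lists remote files and matches wildcards;
        # this stand-in only returns the pattern it was asked to glob.
        return pattern


class ZoneAwareProvider(BaseProvider):
    """Hypothetical provider that prepends the iRODS zone to the pattern."""

    def __init__(self, zone):
        self.zone = zone  # assumed to come from the iRODS environment file

    def glob_wildcards(self, pattern, *args, **kwargs):
        # Build "/<zone>/<pattern>" so the lookup starts at the zone root.
        pattern = posixpath.join("/", self.zone, pattern)
        return super().glob_wildcards(pattern, *args, **kwargs)


provider = ZoneAwareProvider(zone="zone")
result = provider.glob_wildcards("home/rods/{files}")
```

With this pattern the Snakefile can keep using the zone-less path from the documentation, while the lookup itself runs against the full path.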
Issue 2. Uploading functionality is broken for users without access to complete directory structure
Describe the bug
This issue lies in the RemoteObject class of snakemake/remote/iRODS.py:
def _upload(self):
    # get current local timestamp
    stat = os.stat(self.local_path)
    # create folder structure on remote
    folders = os.path.dirname(self.remote_path).split(os.sep)[1:]
    collpath = os.sep
    for folder in folders:
        collpath = os.path.join(collpath, folder)
        try:
            self._irods_session.collections.get(collpath)
        except:
            self._irods_session.collections.create(collpath)
The function splits up the directory structure with folders = os.path.dirname(self.remote_path).split(os.sep)[1:], such that /zone/home/rods becomes ["zone", "home", "rods"].
Then for each one of these folders, it tries to either "get" it or "create" it. That is, it will try to access /zone, /zone/home and /zone/home/rods. This will throw an iRODS exception CAT_NO_ACCESS_PERMISSION and fail if any of these are inaccessible to the user. In my case, I only have access to the subdirectory /zone/home/rods, which means all uploading fails for me on the first iteration of the "for loop". I imagine that having limited access to the "root" /zone directory is not uncommon.
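The traversal described above can be reproduced without an iRODS server. The following sketch mirrors the splitting and joining logic using posixpath, so it behaves the same on any operating system. The function name collection_prefixes and the file name data.txt are made up for illustration.

```python
import posixpath


def collection_prefixes(remote_path):
    """Return every collection path the loop visits for a remote file path."""
    folders = posixpath.dirname(remote_path).split("/")[1:]
    prefixes = []
    collpath = "/"
    for folder in folders:
        collpath = posixpath.join(collpath, folder)
        prefixes.append(collpath)
    return prefixes


# "data.txt" is a hypothetical file name used only for illustration.
prefixes = collection_prefixes("/zone/home/rods/data.txt")
```

Here prefixes contains /zone, /zone/home and /zone/home/rods, in that order, which is why a user without access to /zone fails on the first iteration.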
Proposed solution
To fix this for myself, I check whether each directory is accessible before attempting to get or create it.
Include the iRODS access exception:
from irods.exception import CollectionDoesNotExist, DataObjectDoesNotExist, CAT_NO_ACCESS_PERMISSION
Skip over directory parts that are not accessible:
def denied_access(self, collpath):
    # return True only if the collection cannot be accessed by the user
    try:
        self._irods_session.collections.get(collpath)
    except CAT_NO_ACCESS_PERMISSION:
        return True
    except CollectionDoesNotExist:
        # a missing collection is not an access problem; _upload creates it
        return False
    return False
def _upload(self):
    # get current local timestamp
    stat = os.stat(self.local_path)
    # create folder structure on remote, starting below /zone/home
    folders = os.path.dirname(self.remote_path).split(os.sep)[1:]
    collpath = os.sep + folders.pop(0) + os.sep + folders.pop(0)
    for folder in folders:
        collpath = os.path.join(collpath, folder)
        if not self.denied_access(collpath):
            try:
                self._irods_session.collections.get(collpath)
            except CollectionDoesNotExist:
                self._irods_session.collections.create(collpath)
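The create-or-skip behaviour can be checked without a server using an in-memory model. Everything in this sketch is hypothetical: NoAccess and Missing stand in for CAT_NO_ACCESS_PERMISSION and CollectionDoesNotExist, FakeCollections stands in for the session's collection manager, and ensure_collections is a variant of the loop that uses one try with two except clauses instead of a separate helper. It is a simulation under those assumptions, not the code from the fix.

```python
class NoAccess(Exception):
    """Hypothetical stand-in for CAT_NO_ACCESS_PERMISSION."""


class Missing(Exception):
    """Hypothetical stand-in for CollectionDoesNotExist."""


class FakeCollections:
    """In-memory model of a collection manager with restricted access."""

    def __init__(self, existing, readable_root):
        self.existing = set(existing)
        self.readable_root = readable_root
        self.created = []

    def _check_access(self, path):
        # Only paths at or below readable_root are accessible.
        root = self.readable_root
        if path != root and not path.startswith(root + "/"):
            raise NoAccess(path)

    def get(self, path):
        self._check_access(path)
        if path not in self.existing:
            raise Missing(path)
        return path

    def create(self, path):
        self._check_access(path)
        self.existing.add(path)
        self.created.append(path)


def ensure_collections(collections, remote_dir):
    """Create missing collections along remote_dir, skipping inaccessible ones."""
    collpath = ""
    for folder in remote_dir.strip("/").split("/"):
        collpath = collpath + "/" + folder
        try:
            collections.get(collpath)
        except NoAccess:
            continue  # a parent we cannot read; move on to the next level
        except Missing:
            collections.create(collpath)


collections = FakeCollections(
    existing={"/zone", "/zone/home", "/zone/home/rods"},
    readable_root="/zone/home/rods",
)
ensure_collections(collections, "/zone/home/rods/results/run1")
```

In this model the inaccessible parents /zone and /zone/home are skipped, and only the two missing collections below /zone/home/rods are created.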
* fix: iRODS functionality - issue #1510
Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
* fmt with black
* iRODS correctly handles subdirectories
allows iRODS _upload function to either create (if missing) or ignore (if user has no access) a subdirectory
Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>