The CWL specification requires workflow engines to support URIs with the file scheme, but support for other schemes such as https, ftp, and s3 is optional.
Therefore, using such optional schemes in input objects breaks portability between workflow engines.
core-wire aims to fix this problem: it downloads or uploads the files and directories in a given input object and returns a new input object with the downloaded or uploaded URIs.
$ core-wire input.yaml file:///uri/to/the/destination
It accepts YAML and JSON files for the input object.
See core-wire -h for details.
- It currently supports the file, http, https, and ftp schemes by default.
  - Limitation: The ftp scheme requires the curl command. This will be fixed in a future release.
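Before running core-wire, it can help to see which locations in an input object rely on a scheme other than file. The following is a minimal Python sketch of such a check, not part of core-wire itself; the function name non_file_locations and the param3 entry are hypothetical and exist only for illustration.

```python
from urllib.parse import urlparse


def non_file_locations(obj):
    """Collect every File/Directory `location` whose URI scheme is not `file`."""
    found = []
    if isinstance(obj, dict):
        loc = obj.get("location")
        if obj.get("class") in ("File", "Directory") and isinstance(loc, str):
            scheme = urlparse(loc).scheme
            # A location without a scheme is treated as a local path here.
            if scheme not in ("", "file"):
                found.append(loc)
        # Recurse into nested values such as secondaryFiles or listing.
        for value in obj.values():
            found.extend(non_file_locations(value))
    elif isinstance(obj, list):
        for item in obj:
            found.extend(non_file_locations(item))
    return found


# Hypothetical input object; param3 is added only for illustration.
inputs = {
    "param1": 10,
    "param2": {"class": "File", "location": "https://remote/resource/file.txt"},
    "param3": {"class": "File", "location": "file:///current-dir/inp/file.txt"},
}
print(non_file_locations(inputs))
```

Any location this reports is a candidate for staging to a file URI before handing the input object to a workflow engine.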
$ cat input.json
{
  "param1": 10,
  "param2": {
    "class": "File",
    "location": "https://remote/resource/file.txt"
  }
}
$ core-wire input.json file:///current-dir/inp
{
  "param1": 10,
  "param2": {
    "class": "File",
    "location": "file:///current-dir/inp/file.txt",
    "path": "/current-dir/inp/file.txt",
    "basename": "file.txt",
    "dirname": "/current-dir/inp",
    "nameroot": "file",
    "nameext": ".txt",
    "checksum": "sha1$47a013e660d408619d894b20806b1d5086aab03b",
    "size": 13
  }
}
$ ls inp
file.txt
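The extra fields in the output above (basename, nameroot, nameext, checksum, size) follow the CWL File object conventions. As a rough illustration of how such fields can be derived for a local file, here is a Python sketch; it is not core-wire's implementation, and the function name describe_local_file and the sample file content are assumptions made for this example.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def describe_local_file(path):
    """Build a CWL-style File object for a local file (illustrative only)."""
    resolved = Path(path).resolve()
    # nameroot + nameext == basename; nameext is empty or starts with a period.
    nameroot, nameext = os.path.splitext(resolved.name)
    # Reads the whole file at once, which is fine for a small example.
    data = resolved.read_bytes()
    return {
        "class": "File",
        "location": resolved.as_uri(),
        "path": str(resolved),
        "basename": resolved.name,
        "dirname": str(resolved.parent),
        "nameroot": nameroot,
        "nameext": nameext,
        # CWL writes checksums as "sha1$" followed by the hex digest.
        "checksum": "sha1$" + hashlib.sha1(data).hexdigest(),
        "size": len(data),
    }


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "file.txt"
    sample.write_bytes(b"hello, world\n")  # hypothetical 13-byte content
    info = describe_local_file(sample)
    print(info["basename"], info["nameroot"], info["nameext"], info["size"])
```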
Note: uploading is not yet implemented.
$ cat input.json
{
  "param1": 10,
  "param2": {
    "class": "File",
    "location": "file:///current-dir/inp/efa951dd-df01-4ce9-0008-39e7dbe25d6a/file.txt",
    "path": "/current-dir/inp/efa951dd-df01-4ce9-0008-39e7dbe25d6a/file.txt",
    "basename": "file.txt",
    "dirname": "/current-dir/inp/efa951dd-df01-4ce9-0008-39e7dbe25d6a",
    "nameroot": "file",
    "nameext": ".txt",
    "checksum": "sha1$47a013e660d408619d894b20806b1d5086aab03b",
    "size": 13
  }
}
$ core-wire --config=s3conf.json input.json s3://bucket/inp/
{
  "param1": 10,
  "param2": {
    "class": "File",
    "location": "s3://bucket/inp/e2949107-f856-2417-ce6c-1030af43f9ea/file.txt",
    "basename": "file.txt",
    "nameroot": "file",
    "nameext": ".txt",
    "checksum": "sha1$47a013e660d408619d894b20806b1d5086aab03b",
    "size": 13
  }
}
You can extend the supported schemes by specifying the commands used to download files and directories with --inline-dl-file-cmd and --inline-dl-dir-cmd.
The accepted value is as follows:
$scheme:$command
- $scheme is a URI scheme such as ssh or s3. You can also override the default schemes.
- $command is a command to download a file or a directory from a URI with the given scheme.
  - Example: curl -f <src-uri> -o <dst-path>
  - <src-uri> and <src-path> are replaced with the source URI or path of a file or a directory.
  - <dst-uri> and <dst-path> are replaced with the destination URI or path of a file or a directory.
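To make the value format and the placeholder substitution concrete, here is a small Python sketch. It only illustrates the idea and is not how core-wire is implemented: the function names parse_inline_cmd and fill_template are hypothetical, and splitting on the first colon and shell-quoting the substituted values are assumptions of this example.

```python
import re
import shlex

# The four placeholders described in the list above.
PLACEHOLDER = re.compile(r"<(src-uri|src-path|dst-uri|dst-path)>")


def parse_inline_cmd(value):
    """Split a `$scheme:$command` value at the first colon (assumed rule)."""
    scheme, sep, command = value.partition(":")
    if not sep or not scheme or not command:
        raise ValueError("expected $scheme:$command, got %r" % value)
    return scheme, command


def fill_template(command, values):
    """Replace each placeholder with its shell-quoted value in one pass."""
    return PLACEHOLDER.sub(lambda m: shlex.quote(values[m.group(1)]), command)


scheme, command = parse_inline_cmd("https:curl -f <src-uri> -o <dst-path>")
print(scheme)
print(fill_template(command, {
    "src-uri": "https://remote/resource/file.txt",
    "dst-path": "/current-dir/inp/file.txt",
}))
```

Substituting in a single regular-expression pass avoids re-expanding a placeholder that happens to appear inside an already substituted value.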
Here is a concrete example:
$ cat input.json
{
  "param1": 10,
  "param2": {
    "class": "File",
    "location": "ssh:///remote-server:/home/user/path-to/file.txt"
  }
}
$ core-wire input.json file:///current-dir/inp --inline-dl-file-cmd=ssh:"scp <src-uri> <dst-path>"
{
  "param1": 10,
  "param2": {
    "class": "File",
    "location": "file:///current-dir/inp/file.txt",
    "path": "/current-dir/inp/file.txt",
    "basename": "file.txt",
    "dirname": "/current-dir/inp",
    "nameroot": "file",
    "nameext": ".txt",
    "checksum": "sha1$47a013e660d408619d894b20806b1d5086aab03b",
    "size": 13
  }
}
The extended schemes have one limitation:
- When --inline-dl-file-cmd is specified but --inline-dl-dir-cmd is not, core-wire rejects non-literal directory objects (i.e., it only accepts directory objects with the listing field).
  - If you have to handle non-literal directory objects, specify --inline-dl-dir-cmd in addition to --inline-dl-file-cmd.
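To check in advance whether an input object contains directory objects without the listing field, something like the following Python sketch could be used. It is an illustration under assumptions, not core-wire's actual check; the function name directories_without_listing and the sample input object are hypothetical.

```python
def directories_without_listing(obj):
    """Collect Directory objects that have no `listing` field."""
    found = []
    if isinstance(obj, dict):
        if obj.get("class") == "Directory" and "listing" not in obj:
            found.append(obj)
        # Recurse so that directories nested inside a listing are checked too.
        for value in obj.values():
            found.extend(directories_without_listing(value))
    elif isinstance(obj, list):
        for item in obj:
            found.extend(directories_without_listing(item))
    return found


# Hypothetical input object with one literal and one non-literal directory.
inputs = {
    "literal": {
        "class": "Directory",
        "basename": "data",
        "listing": [
            {"class": "File", "location": "ssh:///remote-server:/home/user/a.txt"},
        ],
    },
    "non_literal": {
        "class": "Directory",
        "location": "ssh:///remote-server:/home/user/dir",
    },
}
for directory in directories_without_listing(inputs):
    print(directory["location"])
```

If this reports anything for a scheme handled only by --inline-dl-file-cmd, the text above suggests supplying --inline-dl-dir-cmd as well.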