Add specialized copying routines #9

Open
ap-- opened this issue Sep 21, 2022 · 0 comments

ap-- commented Sep 21, 2022

We could incrementally add support for faster copying.
In pado.shutil, this would mean switching to a different copy implementation depending on the (IN, OUT) pair of fsspec FileSystem classes...

Something like this, maybe better integrated:

from __future__ import annotations  # allows `str | os.PathLike` annotations before Python 3.10

import os
from pado.images import ImageProvider
from pado.io.files import urlpathlike_to_fs_and_path

def write_download_script(
    image_provider: ImageProvider, destination: str | os.PathLike
) -> str:
    """
    Return a shell script for downloading images with gsutil.

    Parameters
    ----------
    image_provider:
        the image provider whose images should be copied
    destination:
        local destination directory

    Returns
    -------
    str:
        the shell script
    """
    finished_file = os.path.join(destination, "done")
    script = [
        "#!/bin/sh",
        "FILES=$(cat << EOF",
    ]
    for urlpath in image_provider.df.urlpath:
        fs, path = urlpathlike_to_fs_and_path(urlpath)
        # only Google Cloud Storage sources are handled here
        assert "gs" in fs.protocol
        # filesystems with custom arguments or options are not supported yet
        if fs.storage_args or fs.storage_options:  # type: ignore
            raise NotImplementedError("todo")
        bucket_url = f"gs://{path}"
        script.append(bucket_url)
    script.extend([
        "EOF",
        ")",
        f"echo \"$FILES\" | gsutil -m cp -I {destination!s}",
        f"touch {finished_file}",
        "",
    ])

    return "\n".join(script)
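
The idea of switching on the (IN, OUT) filesystem pair could be sketched as a small registry. This is only an illustration: `register_copy`, `select_copy`, `generic_copy` and `gsutil_copy` are hypothetical names, not part of pado or fsspec. The routines return command strings instead of copying anything, and protocols are assumed to be plain strings (fsspec may report a tuple of aliases such as `("gcs", "gs")`, which would need normalizing first).

```python
from __future__ import annotations

from typing import Callable, Dict, Tuple

# Hypothetical signature of a copy routine: (source, destination) -> description.
CopyFunc = Callable[[str, str], str]

# Hypothetical registry keyed on (source protocol, destination protocol).
_COPY_ROUTINES: Dict[Tuple[str, str], CopyFunc] = {}


def register_copy(src_protocol: str, dst_protocol: str) -> Callable[[CopyFunc], CopyFunc]:
    """Register a specialized routine for one (IN, OUT) protocol pair."""

    def decorator(func: CopyFunc) -> CopyFunc:
        _COPY_ROUTINES[(src_protocol, dst_protocol)] = func
        return func

    return decorator


def generic_copy(src: str, dst: str) -> str:
    """Stand-in for the default, filesystem-agnostic copy."""
    return f"generic copy {src} -> {dst}"


@register_copy("gs", "file")
def gsutil_copy(src: str, dst: str) -> str:
    """Stand-in for a faster GCS-to-local copy that shells out to gsutil."""
    return f"gsutil -m cp {src} {dst}"


def select_copy(src_protocol: str, dst_protocol: str) -> CopyFunc:
    """Pick the specialized routine if one is registered, else the generic one."""
    return _COPY_ROUTINES.get((src_protocol, dst_protocol), generic_copy)
```

With this layout, support for further pairs can be added one at a time by registering another function, and unregistered pairs keep using the generic path.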