Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom subclasses now require a fsspec filesystem #197

Open
joouha opened this issue Feb 19, 2024 · 3 comments
Open

Custom subclasses now require a fsspec filesystem #197

joouha opened this issue Feb 19, 2024 · 3 comments
Labels
documentation Improvements or additions to documentation question Further information is requested

Comments

@joouha
Copy link
Contributor

joouha commented Feb 19, 2024

Previously it used to be possible to create arbitrary UPath subclasses like this:

from upath import UPath
from upath.registry import register_implementation

class MyPath(UPath):
    def exists(self):
        return True

register_implementation("my", MyPath)

print(UPath("my://path").exists())

However, UPath now checks for and requires a corresponding fsspec filesystem in the UPath._parse_storage_options method, raising a ValueError if one is not found:

Traceback (most recent call last):
  File "/home/josiah/code/universal_pathlib/./test.py", line 12, in <module>
    print(UPath("my://path").exists())
          ^^^^^^^^^^^^^^^^^^
  File "/home/josiah/code/universal_pathlib/upath/core.py", line 203, in __init__
    self._storage_options = type(self)._parse_storage_options(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/josiah/code/universal_pathlib/upath/core.py", line 290, in _parse_storage_options
    fs_cls: type[AbstractFileSystem] = get_filesystem_class(protocol)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/josiah/code/universal_pathlib/venv/lib/python3.11/site-packages/fsspec/registry.py", line 233, in get_filesystem_class
    raise ValueError(f"Protocol not known: {protocol}")
ValueError: Protocol not known: my

Would you consider handling this case and returning an empty storage options dictionary if no fsspec filesystem is found?

...

    @classmethod
    def _parse_storage_options(
        cls, urlpath: str, protocol: str, storage_options: Mapping[str, Any]
    ) -> dict[str, Any]:
        """Parse storage_options from the urlpath"""
        try:
            fs_cls: type[AbstractFileSystem] = get_filesystem_class(protocol)
        except ValueError:
            return {}
        else:
            pth_storage_options = fs_cls._get_kwargs_from_urls(urlpath)
            return {**pth_storage_options, **storage_options}
...

Thanks,
Josiah

@joouha joouha changed the title Custom subclasses now require a fsspec Custom subclasses now require a fsspec filesystem Feb 19, 2024
@ap--
Copy link
Collaborator

ap-- commented Feb 19, 2024

Yes, this is intentional, because we can't really predict what a filesystem class could extract from the uri as storage_options.

There are multiple ways to achieve what you want, including the one you suggest.

Could you describe your intention behind creating a UPath subclass for the "my" protocol, without defining a new AbstractFileSystem subclass for the "my" protocol?

@ap-- ap-- added documentation Improvements or additions to documentation question Further information is requested labels Feb 19, 2024
@joouha
Copy link
Contributor Author

joouha commented Feb 19, 2024

The protocol I'm creating a new subclass for is actually untitled: (I was just using my:path as an example). For my use case, UntitledPath objects don't have to do anything other than not exist.

When communicating with a LSP server, you use the file URI to signal which file you are referring to. Typically you use file:// URIs for this, but for the case of new unsaved files, you use untitled: as the URI scheme. As part of implementing LSP server support in euporie, I added a new UntitledPath class to use with untitled files to facilitate this.

It doesn't really matter, since as you say above it's very easy to work around. I just wasn't sure if this was a deliberate change or not.

Feel free to close this

@ap--
Copy link
Collaborator

ap-- commented Feb 19, 2024

Interesting! Thanks for sharing the use case.

cross-referencing: neovim/neovim#21276 because there are quite a few useful links to more context.

So if the "untitled" protocol is supposed to be a placeholder for in-memory files not written to disk, wouldn't the "memory" filesystem be an ideal candidate for it?

from fsspec.registry import register_implementation as fsspec_register_implementation
from fsspec.implementations.memory import MemoryFileSystem


class UntitledFileSystem(MemoryFileSystem):
    store = {}
    pseudo_dirs = [""]
    protocol = "untitled"
    root_marker = "/"

    @classmethod
    def _strip_protocol(cls, path):    
        if path.startswith("untitled://"):
            path = path[len("untitled://") :]
        return super()._strip_protocol(path)

fsspec_register_implementation("untitled", UntitledFileSystem)


from upath import UPath
from upath.registry import register_implementation as upath_register_implementation
from upath.implementations.memory import MemoryPath


class UntitledPath(MemoryPath):
    pass


upath_register_implementation("untitled", UntitledPath)


u = UPath("untitled:///abc/file.txt")
print(repr(u))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants