s3fs kwarg unexpected keyword argument in AioSession #209

Open
bjhardcastle opened this issue Mar 19, 2024 · 2 comments
Labels
question Further information is requested

Comments

@bjhardcastle

bjhardcastle commented Mar 19, 2024

Might be related to #204

I'm trying to use the cache_type kwarg for s3 [source], but this causes issues down the line when the file is accessed:

>>> import upath
>>> url =  "s3://codeocean-s3datasetsbucket-1u41qdg42ur9/39490bff-87c9-4ef2-b408-36334e748ac6/nwb/ecephys_620264_2022-08-02_15-39-59_experiment1_recording1.nwb"

>>> path = upath.UPath(url, cache_type="first")
>>> path
S3Path('s3://codeocean-s3datasetsbucket-1u41qdg42ur9/39490bff-87c9-4ef2-b408-36334e748ac6/nwb/ecephys_620264_2022-08-02_15-39-59_experiment1_recording1.nwb')

>>> path.exists()
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\upath\core.py", line 711, in exists
    return self.fs.exists(self.path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\fsspec\asyn.py", line 118, in wrapper
    return sync(self.loop, func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\fsspec\asyn.py", line 103, in sync
    raise return_result
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\fsspec\asyn.py", line 56, in _runner
    result[0] = await coro
                ^^^^^^^^^^
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\s3fs\core.py", line 1035, in _exists
    await self._info(path, bucket, key, version_id=version_id)
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\s3fs\core.py", line 1302, in _info
    out = await self._call_s3(
          ^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\s3fs\core.py", line 341, in _call_s3
    await self.set_session()
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\s3fs\core.py", line 502, in set_session
    self.session = aiobotocore.session.AioSession(**self.kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: AioSession.__init__() got an unexpected keyword argument 'cache_type'

upath: 0.2.2
python: 3.11.5

ap-- added the question label on Mar 20, 2024
@ap--
Collaborator

ap-- commented Mar 20, 2024

Hi @bjhardcastle

Please note the difference between storage options for AbstractFileSystems and options for their open() methods:

S3FileSystem class
https://github.com/fsspec/s3fs/blob/efbe1e4c23a06e65b3df6a82f28fc49bab0dbd78/s3fs/core.py#L273-L297

The UPath constructor gathers all keyword arguments under **storage_options and uses those to instantiate the specific filesystem class.
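
To illustrate why the error only appears at path.exists() and not when the path is constructed, here is a minimal sketch in plain Python. It is not upath or s3fs code: ToySession, ToyFileSystem and make_fs are hypothetical stand-ins, and the "readahead" default is an arbitrary choice for this toy. The sketch only assumes what the traceback above shows, namely that leftover keyword arguments are stored on the filesystem and forwarded to a session class later.

```python
class ToySession:
    """Hypothetical stand-in for a session class with a strict signature."""

    def __init__(self, profile=None):
        self.profile = profile


class ToyFileSystem:
    """Hypothetical stand-in for a filesystem that forwards extra kwargs."""

    def __init__(self, default_cache_type="readahead", **kwargs):
        self.default_cache_type = default_cache_type
        self.kwargs = kwargs  # stored now, used only when a session is needed
        self.session = None

    def set_session(self):
        # Unknown keys surface here, not in __init__.
        self.session = ToySession(**self.kwargs)

    def exists(self, path):
        if self.session is None:
            self.set_session()
        return True


def make_fs(**storage_options):
    """Mimic a path constructor that passes every kwarg to the filesystem."""
    return ToyFileSystem(**storage_options)


fs = make_fs(cache_type="first")  # construction succeeds
try:
    fs.exists("bucket/key")
    error_message = None
except TypeError as exc:
    error_message = str(exc)  # raised on first access

ok_fs = make_fs(default_cache_type="first")  # a name the constructor knows
ok_result = ok_fs.exists("bucket/key")
```

In this toy, cache_type is not a parameter of the filesystem constructor, so it lands in the catch-all kwargs and only fails once a session is created. default_cache_type is a named parameter, so it never reaches the session.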

S3FileSystem._open() method
https://github.com/fsspec/s3fs/blob/efbe1e4c23a06e65b3df6a82f28fc49bab0dbd78/s3fs/core.py#L611-L625

If you want to pass specific options down to the filesystem-specific AbstractBufferedFile implementation, you would use the following in your case:

import upath
upath.UPath("s3://mybucket/myfile.txt").open(cache_type="first")

If you want to set this at the filesystem level for s3fs, you can do:

import upath
p = upath.UPath("s3://mybucket/myfile.txt", default_cache_type="first")
...
p.open()  # will use the default_cache_type 
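
The relationship between the per-call option and the filesystem-level default can be sketched in plain Python as follows. This is a toy model, not s3fs code: ToyFile and ToyFileSystem are hypothetical names, the "readahead" default is arbitrary, and the fallback order (per-call value first, then the filesystem default) is an assumption about how such a default typically behaves.

```python
class ToyFile:
    """Hypothetical file object that just records its cache type."""

    def __init__(self, path, cache_type):
        self.path = path
        self.cache_type = cache_type


class ToyFileSystem:
    """Hypothetical filesystem with a configurable default cache type."""

    def __init__(self, default_cache_type="readahead"):
        self.default_cache_type = default_cache_type

    def open(self, path, cache_type=None):
        # A per-call option wins; otherwise use the filesystem-level default.
        if cache_type is None:
            cache_type = self.default_cache_type
        return ToyFile(path, cache_type)


# Option passed to open(): applies to this one file only.
per_call = ToyFileSystem().open("mybucket/myfile.txt", cache_type="first")

# Option passed to the constructor: applies to every later open().
fs_level = ToyFileSystem(default_cache_type="first").open("mybucket/myfile.txt")

# Neither given: the toy's built-in default is used.
untouched = ToyFileSystem().open("mybucket/myfile.txt")
```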

Let me know if that helps! It would be wonderful if you could tell me how I could improve the text in the README to make this more intuitive. PRs are super welcome too!

Cheers,
Andreas 😃

@bjhardcastle
Author

Hi Andreas,

Thank you very much for explaining in detail. That of course fixed it!

I don't think the problem was the README in this case, but rather the wording of the documentation for the open() method (which I assumed came from pathlib):
[image: screenshot of the open() method's documentation]
Because it says "as the built-in does", I never would have thought to pass it config for the fsspec-related operations.

One of the reasons I use upath is so I don't need to set up anything manually: it just handles whatever I throw at it! Now that I'm trying different configurations, I'll refer to the documentation more and let you know if any parts aren't clear.

Cheers,
ben
