Support of "HTTP/1.1 byte range request" in file retrieval #1599
Comments
I'll second this. It would be very useful, e.g. for genomics datasets to be accessed directly with tabix. It seems to require a config change in the Zenodo web server: setting 'max_ranges' to a positive number. Is there some technical reason not to do that?
Our file storage backend at the moment is not optimized to serve HTTP range requests (meaning that enabling this feature would potentially lead to significant slowdowns for the file upload/download API). Of course, there are people working on making it possible, though we can't give an accurate ETA on it...
I just wanted to add my 👍 to state that enabling range requests would be very useful for geospatial data formats. Cloud Optimized GeoTIFF in particular would benefit a lot from this. Allowing range requests could really reduce the bandwidth needed from Zenodo.
Many people cannot download large genetic files (several GB). Some have to retry many times, and that is actually wasting your bandwidth...
For our project it is also important that we can use Cloud-Optimized GeoTIFFs (see e.g. https://zenodo.org/record/4483227) directly from Zenodo. Figshare apparently works with COGs, but Zenodo does not? We wrote a tutorial for users on how to get small chunks of data using COG files.
Could you please support this? We need it to serve large image files (in Zarr format) in chunks, which allows us to visualize the files in the browser instantly. It won't be possible for the browser to download an entire file of, e.g., 10 GB and display it.
Just noting the value for the Zarr use case. Thanks all for your work on Zenodo!
For Zarr, we could hypothetically get zenodo working today, without any changes. Zenodo does not support directories, but if we could map a regular zarr directory store to some sort of flat hierarchy, via a special character, we could make it work. For example, if the special character is
etc.
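The flat-hierarchy idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the separator is a parameter, the `__` used in the usage comment is a purely hypothetical placeholder (not the character proposed above), and `flatten_key` / `unflatten_key` are made-up helper names, not part of Zarr or Zenodo.

```python
def flatten_key(key: str, separator: str) -> str:
    """Map a Zarr store key such as 'group/array/0.0' to a flat file name."""
    if not separator:
        raise ValueError("separator must be a non-empty string")
    if separator in key:
        # The mapping would not be reversible if the key already used the separator.
        raise ValueError(f"key {key!r} already contains {separator!r}")
    return key.replace("/", separator)


def unflatten_key(flat_name: str, separator: str) -> str:
    """Invert flatten_key, recovering the original hierarchical key."""
    if not separator:
        raise ValueError("separator must be a non-empty string")
    return flat_name.replace(separator, "/")


# Usage with a hypothetical separator:
#   flatten_key("temperature/0.0", "__")    gives "temperature__0.0"
#   unflatten_key("temperature__0.0", "__") gives "temperature/0.0"
```

A store wrapper would apply `flatten_key` on every read, so each chunk becomes one flat file in the record.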
Could you please raise an issue here ( https://github.com/zarr-developers/zarr-specs/issues )?
@rabernat I'm afraid that won't scale, because Zenodo only allows 100 files at maximum.
HTTP Range support is now available for file downloads!
# Fetches only the last two lines of the CSV file
curl -r -182 https://zenodo.org/record/5702574/files/articles_by_influence.csv
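In the curl command, `-r -182` asks for the last 182 bytes of the file, which corresponds to the request header `Range: bytes=-182`. A rough Python equivalent using only the standard library might look like the sketch below; `suffix_range_header` and `fetch_tail` are illustrative helper names, not part of any Zenodo client.

```python
import urllib.request


def suffix_range_header(last_n_bytes: int) -> dict:
    """Build a Range header asking for the final `last_n_bytes` bytes."""
    if last_n_bytes <= 0:
        raise ValueError("last_n_bytes must be positive")
    return {"Range": f"bytes=-{last_n_bytes}"}


def fetch_tail(url: str, last_n_bytes: int) -> bytes:
    """Download only the tail of a remote file (performs a network request)."""
    request = urllib.request.Request(url, headers=suffix_range_header(last_n_bytes))
    with urllib.request.urlopen(request) as response:
        # 206 means the range was honored; 200 means the whole file was sent.
        if response.status != 206:
            raise RuntimeError(f"range not honored, got HTTP {response.status}")
        return response.read()


# Usage (needs network access):
#   url = "https://zenodo.org/record/5702574/files/articles_by_influence.csv"
#   print(fetch_tail(url, 182).decode("utf-8", errors="replace"))
```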
Wow this is huge!
This is awesome! However, with the recent update, CORS is disabled ;( Submitted an issue here: #2246
Just found this, and it is quite awesome that Zenodo supports this. What is the recommended way to upload a Zarr store to Zenodo, assuming the Zarr store can have a lot of subfiles?
It's probably best to use a zipped store, which is discoverable and accessible via range requests.
Thank you Ryan! I wonder if/how I could then open the zipped store using xarray. It looks like the zarr ZipStore only accepts files from the local filesystem https://zarr.readthedocs.io/en/stable/api/storage.html#zarr.storage.ZipStore
Yeah I think you have to go through fsspec. Here's an example: pangeo-forge/staged-recipes#90 (comment)
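For readers who cannot follow the link, here is a minimal sketch of the fsspec route. It is not the linked example: it assumes `fsspec` (with `aiohttp` for HTTP), `xarray`, and zarr-python 2.x are installed, and the URL in the usage comment is a placeholder. fsspec's URL chaining (`zip::https://...`) lets the zip archive be read through HTTP range requests instead of being downloaded in full.

```python
def chained_zip_url(zip_url: str) -> str:
    """Prefix an HTTP(S) URL with fsspec's `zip::` protocol-chaining syntax."""
    return f"zip::{zip_url}"


def open_remote_zipped_zarr(zip_url: str):
    """Open a zipped Zarr store over HTTP (performs network requests)."""
    # Imported lazily so chained_zip_url works without these packages installed.
    import fsspec
    import xarray as xr

    # The mapper exposes the zip members as a key/value store that Zarr can read.
    mapper = fsspec.get_mapper(chained_zip_url(zip_url))
    return xr.open_zarr(mapper)


# Usage (placeholder URL, needs network access):
#   ds = open_remote_zipped_zarr("https://example.org/path/to/store.zarr.zip")
```

Whether this works with newer Zarr releases depends on how they accept fsspec mappers, so treat it as a starting point.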
I have one feature request for Zenodo: can the Zenodo server support HTTP/1.1 byte range requests (https://tools.ietf.org/html/rfc7233)?
The Zenodo platform is already incredible, and support for byte range requests would further increase the value of deposited data, since some applications rely on byte range requests, in particular when dealing with large files.
I'd like to add an example of how the byte range request works, to make my point clear. For example, GitHub (raw.githubusercontent.com) supports the byte range request as below:
However, the byte range request is ignored by zenodo.org
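One way to tell the two behaviours apart is the response itself: under RFC 7233, a server that honors a range request answers `206 Partial Content` with a `Content-Range` header, while a server that ignores it answers `200 OK` with the full body. A minimal Python sketch of that check follows; the helper names are illustrative, and the status code and headers are assumed to come from whatever HTTP client is in use.

```python
import re


def parse_content_range(value: str):
    """Parse 'bytes 0-99/1234' into (first, last, total); total is None for '*'."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range value: {value!r}")
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)


def range_was_honored(status: int, headers: dict) -> bool:
    """Return True if the response indicates a partial (ranged) body."""
    # Header names are case-insensitive, so compare them in lower case.
    names = {name.lower() for name in headers}
    return status == 206 and "content-range" in names


# Usage: pass the status code and headers of a response to a request
# that carried e.g. the header "Range: bytes=0-99".
```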