Detect and handle HTTP range requests to enable clients to retrieve a portion of a file without the need to download the entire content. This feature would allow MetacatUI and other clients to preview data files before downloading them. It would also allow clients to resume downloads in the event of a network interruption.
@mbjones mentioned the possibility of implementing this without making changes to the DataONE API: A range request could be made via HTTP headers, leaving the request body unchanged and having Metacat only handle the range request headers.
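As a rough illustration of what "only handle the range request headers" could involve on the server side, here is a minimal Python sketch of parsing a single-range header against a known object size. This is not Metacat code: the function name `parse_single_range` is hypothetical, only one range per header is handled, and malformed and unsatisfiable headers are both collapsed into `None`. A real handler would likely ignore a malformed header and return the full object, answer an unsatisfiable range with 416, and answer a valid one with 206 plus a `Content-Range` header.

```python
import re
from typing import Optional, Tuple

# Matches "bytes=first-last", "bytes=first-" and "bytes=-suffix" (single range only).
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_single_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) byte offsets, or None if the header
    is malformed or cannot be satisfied for an object of `size` bytes."""
    m = _RANGE_RE.match(header.strip())
    if m is None or size <= 0:
        return None
    first, last = m.group(1), m.group(2)
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form: the final N bytes of the object.
        n = int(last)
        if n == 0:
            return None
        return (max(size - n, 0), size - 1)
    start = int(first)
    if start >= size:
        return None
    # An open-ended or oversized end is clamped to the last byte.
    end = size - 1 if last == "" else min(int(last), size - 1)
    if end < start:
        return None
    return (start, end)
```

Note that both offsets are inclusive, so a request for `bytes=0-1000` covers 1001 bytes.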
It would be the client's responsibility to generate the range request, for example by sending the header "Range: bytes=0-1000" with an otherwise ordinary GET request for the object.
The feature could be a non-mandatory enhancement, such that the existing behavior remains consistent for repositories not making use of range requests.
Apache Tomcat and the Servlet API might provide built-in support for HTTP range requests.
A discussion is needed on how this feature interacts with event metrics:
Is a range request categorized as a download or a view/read?
Does this require a new event type, e.g. "partial read", "preview"?
Note: https://dev.nceas.ucsb.edu/knb/d1/mn/v2/object/urn%3Auuid%3A24b85258-3e86-40cb-accc-28153513dea8 gives a 100,000-line CSV file that could be useful for testing.
@taojing2002 good questions. Range requests are byte-based: the client specifies the byte range it wants. They are application-agnostic and assume that the client knows what to do with the bytes. Tools like curl use range requests to resume downloads after a network connection is interrupted. Data systems use range requests to retrieve chunks of data from inside a data file, but that is of course only useful if the files are organized so that contiguous byte ranges produce meaningful chunks. So, for text files, getting the first few KB is a good way to get a preview, but the client needs to be aware that the end of the byte range is unlikely to coincide with the end-of-line delimiter used in that format. In contrast, netCDF, HDF5, and Zarr are binary formats in which byte range requests can retrieve specific segments of data that correspond to scientifically meaningful chunks (e.g., a single image out of a time series, or a specific spatial window out of a larger extent). Hope that's all helpful.
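To make the end-of-line point concrete, here is a small Python sketch of how a client might tidy a text preview fetched with a byte range. The helper name `trim_to_complete_lines` is hypothetical, and it assumes newline-terminated rows ("\n" or "\r\n") with no line breaks embedded inside quoted CSV fields.

```python
def trim_to_complete_lines(chunk: bytes, is_whole_file: bool = False) -> bytes:
    """Drop the trailing partial line from a byte-range preview.

    If the range happened to cover the whole file, nothing is trimmed.
    If the chunk contains no newline at all, no complete line is
    available and an empty result is returned.
    """
    if is_whole_file:
        return chunk
    cut = chunk.rfind(b"\n")
    if cut == -1:
        return b""
    # Keep everything up to and including the last newline.
    return chunk[: cut + 1]


# A preview cut off in the middle of the third row.
preview = trim_to_complete_lines(b"id,value\n1,3.2\n2,4.")
print(preview.decode("utf-8"))
```

For UTF-8 text, cutting at a newline also avoids ending the preview in the middle of a multi-byte character, since the newline byte never occurs inside one.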
Example range request with curl:
curl -H "Range: bytes=0-1000" https://dev.nceas.ucsb.edu/knb/d1/mn/v2/object/urn%3Auuid%3A24b85258-3e86-40cb-accc-28153513dea8