Use a neutral format to have lossless interface with JSON, scipp, Astropy, pandas #8927
Comments
Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
It's not clear to me what changes you're asking for in xarray. If you want to create a new on-disk storage format you can, and you can teach xarray to read it using the backend entrypoint system. Are you asking for something that falls outside of that framework?
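As an illustration of the backend entrypoint system mentioned above, here is a minimal sketch of a reader built on `xarray.backends.BackendEntrypoint`. The JSON layout (`attrs`, `coords`, `data_vars`, each variable holding `dims`, `dtype` and `data`), the class name `JsonSketchBackend` and the `.sketch.json` suffix are all assumptions made up for this example; they are not the format proposed in this issue.

```python
import json
import os
import tempfile

import numpy as np
import xarray as xr
from xarray.backends import BackendEntrypoint


class JsonSketchBackend(BackendEntrypoint):
    """Hypothetical backend for a made-up JSON layout (illustration only)."""

    description = "Illustrative reader for a hypothetical JSON layout"

    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        with open(filename_or_obj) as f:
            payload = json.load(f)

        # drop_variables may be a single name or an iterable of names.
        if isinstance(drop_variables, str):
            drop_variables = [drop_variables]
        dropped = set(drop_variables or ())

        def build(section):
            # Rebuild each variable as (dims, array), restoring the stored dtype.
            return {
                name: (var["dims"], np.asarray(var["data"], dtype=var["dtype"]))
                for name, var in payload.get(section, {}).items()
                if name not in dropped
            }

        return xr.Dataset(
            build("data_vars"),
            coords=build("coords"),
            attrs=payload.get("attrs", {}),
        )

    def guess_can_open(self, filename_or_obj):
        # Assumed file suffix for this sketch.
        return str(filename_or_obj).endswith(".sketch.json")


# Write a small example file in the assumed layout and read it back.
payload = {
    "attrs": {"title": "demo"},
    "coords": {"x": {"dims": ["x"], "dtype": "int64", "data": [10, 20]}},
    "data_vars": {
        "temperature": {"dims": ["x"], "dtype": "float32", "data": [280.5, 281.0]}
    },
}
path = os.path.join(tempfile.mkdtemp(), "demo.sketch.json")
with open(path, "w") as f:
    json.dump(payload, f)

ds = JsonSketchBackend().open_dataset(path)
```

In recent xarray versions the class can also be passed directly, as in `xr.open_dataset(path, engine=JsonSketchBackend)`, or registered under the `xarray.backends` entry-point group of a package so that it is discovered automatically.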
Thank you @TomNicholas for your quick response. Currently, the interfaces between Xarray and other multidimensional tools such as scipp or NDData handle only part of the data, because each tool has a different internal structure. Reversible, lossless round-trip interfaces require a common data structure and a mapping between that structure and the structure of each tool (here Xarray). This is what the proposed format defines and what the indicated package implements. It shows, for example, that an Xarray Dataset can be converted reversibly into a scipp Dataset and back, or into JSON data in an equally reversible manner. To be clearer, my requests for Xarray are as follows:
Thanks for the clarification, @loco-philippe.
That's cool to know! I'll attempt to answer these questions, but others feel free to correct me.
I don't really think we need to be active participants until you ask for a specific change in xarray. Our data model is well-defined, and would need a very good reason to change.
Note that xarray maps well to the zarr format, which already stores all metadata in JSON files. If the numerical data arrays themselves can also be serialized to JSON (e.g. through numpy/numpy#12481), then you have a JSON representation of an entire
Xarray deliberately tries to make it easy for third parties to write code to serialize/deserialize to any data format they like. Again see our backend entrypoint system. I don't see a need to add any
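
To make the serialization discussion concrete, the sketch below round-trips a small Dataset through JSON text using only xarray's existing `Dataset.to_dict` and `Dataset.from_dict`. It is not the format proposed in this issue; the variable names and values are invented for the example. It also shows one place where a plain JSON round trip is not lossless: JSON numbers carry no dtype, so `float32` data comes back as `float64`.

```python
import json

import numpy as np
import xarray as xr

# Small example dataset (names and values are made up for illustration).
ds = xr.Dataset(
    {"temperature": (("x", "y"), np.arange(6.0).reshape(2, 3), {"units": "K"})},
    coords={"x": [10, 20], "y": [1, 2, 3]},
    attrs={"title": "demo"},
)


def roundtrip(dataset):
    """Serialize to JSON text and rebuild a Dataset from it."""
    text = json.dumps(dataset.to_dict())
    return xr.Dataset.from_dict(json.loads(text))


# Values, dims, coords and attrs survive for default dtypes.
restored = roundtrip(ds)
same = restored.identical(ds)

# The dtype does not: float32 values are rebuilt from plain Python floats.
restored32 = roundtrip(ds.astype("float32"))
dtype_after = restored32["temperature"].dtype
```

A third-party format that needs to preserve dtypes would have to store them explicitly alongside the values, which is the kind of mapping such a package can implement outside xarray.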
Is your feature request related to a problem?
Each tool has a specific structure for processing multidimensional data with the following consequences:
Describe the solution you'd like
The proposed format (see Jupyter notebook, GitHub repository, PyPI package) is based on the following principles:
Describe alternatives you've considered
No response
Additional context
numpy/numpy#12481 (comment)
astropy/astropy#16286
scipp/scipp#3422