Support for groups #84

Open
TomNicholas opened this issue Apr 16, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@TomNicholas
Owner

We should support generating references from files containing multiple groups in the same way that xr.open_dataset and datatree.open_datatree work.

So we should add a new open_virtual_datatree function, and a new (optional) group kwarg to open_virtual_dataset.

This can be done right now using the datatree package (as an optional dependency imported inside open_virtual_datatree) but once that gets merged into xarray main (which is happening right now) we can get rid of that dependency.

See #78 (comment) and #11.

cc @sharkinsspatial

@TomNicholas TomNicholas added the enhancement New feature or request label Apr 16, 2024
@TomNicholas
Owner Author

TomNicholas commented Apr 18, 2024

One thing I realized about this is that concatenating multiple DataTree objects is currently a little awkward. I don't know if this is actually a common pattern, but imagine you had two netCDF files each with groups, and you wanted to concatenate group1 in file1 with group1 in file2 etc.

Adding open_virtual_datatree would allow you to open the files like this:

vdt1 = open_virtual_datatree('file1.nc')
vdt2 = open_virtual_datatree('file2.nc')

but currently you can't do

combined_vdt = xr.concat([vdt1, vdt2], dim='time')

because xr.concat doesn't understand DataTree objects. To get around this you should be able to do

from datatree import map_over_subtree

concat_datatrees = map_over_subtree(xr.concat)

combined_vdt = concat_datatrees([vdt1, vdt2], dim='time')

but it raises the question of whether the xarray DataTree upstream integration should include generalizing xr.concat.
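To make the "group1 in file1 with group1 in file2" idea concrete without depending on DataTree at all, here is a minimal sketch that treats each tree as a flat mapping from group path to node and combines matching groups one by one. The function name `combine_by_group` and the flat-dict representation are assumptions made for illustration only; plain lists stand in for datasets so the example runs on its own.

```python
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def combine_by_group(
    trees: List[Dict[str, T]],
    combine: Callable[[List[T]], T],
) -> Dict[str, T]:
    """Combine same-named groups across several {group_path: node} mappings.

    Hypothetical helper: every tree must contain exactly the same group
    paths, otherwise there is no obvious way to pair the groups up.
    """
    if not trees:
        raise ValueError("need at least one tree")
    expected = set(trees[0])
    for tree in trees[1:]:
        if set(tree) != expected:
            raise ValueError("all trees must contain the same groups")
    # Keep the group order of the first tree.
    return {path: combine([tree[path] for tree in trees]) for path in trees[0]}


# Stand-in data: lists play the role of per-group datasets.
file1 = {"/group1": [1, 2], "/group2": [10]}
file2 = {"/group1": [3], "/group2": [20, 30]}

combined = combine_by_group(
    [file1, file2],
    lambda nodes: [item for node in nodes for item in node],
)
```

With real per-group xarray datasets in the mappings, the `combine` argument could be something like `lambda dss: xr.concat(dss, dim="time")`. A generalized `xr.concat` would presumably do the same pairing internally, plus a check that the trees have matching structure.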
