Inconsistent target_chunks api behavior between zarr group and xarray dataset #76

Open
rabernat opened this issue Dec 31, 2020 · 0 comments

rabernat commented Dec 31, 2020

The docs say the following about the target_chunks argument when rechunking a group:

For a group of arrays, a dict is required. The keys correspond to array names. The values are target_chunks arguments for the array. For example, {'foo': (20, 10), 'bar': {'x': 3, 'y': 5}, 'baz': None}. All arrays you want to rechunk must be explicitly named. Arrays that are not present in the target_chunks dict will be ignored.
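As a rough illustration of the documented group rule, here is a small stand-alone sketch. The helper name arrays_to_rechunk and the extra array "qux" are hypothetical and not part of rechunker; the code only models the statement that arrays missing from target_chunks are ignored.

```python
def arrays_to_rechunk(group_array_names, target_chunks):
    """Return the arrays that the documented group rule would process.

    Hypothetical helper, not part of rechunker: it only models the rule
    that arrays missing from ``target_chunks`` are ignored.
    """
    return [name for name in group_array_names if name in target_chunks]


# The example dict from the docs, plus a group that holds one extra array.
target_chunks = {"foo": (20, 10), "bar": {"x": 3, "y": 5}, "baz": None}
group_array_names = ["foo", "bar", "baz", "qux"]  # "qux" is not named

selected = arrays_to_rechunk(group_array_names, target_chunks)
print(selected)  # "qux" is absent because it has no entry in target_chunks
```

Under this rule, anything the caller forgets to name silently drops out of the result.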

Xarray datasets are very similar to Zarr groups, but rechunking treats them a bit differently. This difference is captured in the tests but not in the docs. Here is the target_chunks parametrization for test_rechunk_dataset:

"target_chunks",
[{"a": (20, 10), "b": (20,)}, {"a": {"x": 20, "y": 10}, "b": {"x": 20}}],

Note that the variable c is not present in target_chunks. However, it is present in the output dataset:
assert dst.a.data.chunksize == target_chunks_expected
assert dst.b.data.chunksize == target_chunks_expected[:1]
assert dst.c.data.chunksize == source_chunks[1:]

The original chunks of c have been preserved, which is a reasonable default.

We should strive to reconcile, or at least document, this difference. My personal preference would be to change the API so that a flat Zarr group behaves the same as an Xarray dataset: variables that are not mentioned in target_chunks simply get passed through with identical chunks.
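The pass-through behavior could look something like the sketch below. The helper fill_missing_chunks is hypothetical and not part of rechunker, and the source chunk shapes are made-up illustrative values rather than the ones used in the test.

```python
def fill_missing_chunks(source_chunks, target_chunks):
    """Return target chunks for every variable, defaulting to source chunks.

    Hypothetical helper, not part of rechunker. ``source_chunks`` maps each
    variable name to its current chunk shape; a variable that is absent
    from ``target_chunks`` keeps that shape.
    """
    return {
        name: target_chunks.get(name, chunks)
        for name, chunks in source_chunks.items()
    }


# Illustrative values only: "c" is deliberately left out of target_chunks.
source_chunks = {"a": (10, 50), "b": (10,), "c": (50,)}
target_chunks = {"a": (20, 10), "b": (20,)}

resolved = fill_missing_chunks(source_chunks, target_chunks)
print(resolved)  # "c" keeps its source chunks
```

With a default like this, a group and a dataset would both carry unnamed variables through unchanged instead of one of them dropping the variable.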

cc @eric-czech, who wrote test_rechunk_dataset and so probably understands this part of the code best.
