
sharding as a codec rather than array storage transformer #220

Open
jbms opened this issue Mar 9, 2023 · 4 comments

Comments

@jbms
Contributor

jbms commented Mar 9, 2023

In the ZEP meeting on 2023-03-02, @jstriebel, @normanrz and I discussed representing sharding (ZEP 2) in the metadata as a codec rather than a storage transformer:

{
  "chunk_grid": {
    "name": "regular",
    "configuration": {"chunk_shape": [1000, 1000, 1000]}
  },
  "codecs": [{
    "name": "sharding_indexed",
    "configuration": {
      "chunk_grid": {"name": "regular", "configuration": {"chunk_shape": [100, 100, 100]}},
      "codecs": [{"name": "blosc"}]
    }
  }],
  "shape": [50000, 50000, 50000],
  "data_type": "uint16",
  "zarr_format": 3,
  "node_type": "array",
  "fill_value": 0
}

In general, the idea here is to specify the partitioning of the array top-down: the chunk_grid specifies the top-level partitioning, and the codec is responsible for any further subdivision. With the previous proposal, which represented sharding as a storage transformer, the partitioning is specified bottom-up: the chunk_grid specifies the bottom-level partitioning, and the storage transformer builds up larger shards from the individual chunks.
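
To make the top-down direction concrete, here is a purely illustrative sketch (not part of either proposal, and not an API of any implementation) of how a global voxel coordinate maps first to a shard and then to a sub-chunk, using the grid sizes from the example metadata above:

```python
# Illustrative only: the shapes come from the example metadata above;
# the function name `locate` is hypothetical.
shard_shape = (1000, 1000, 1000)   # outer chunk_grid: one "chunk" = one shard
sub_chunk_shape = (100, 100, 100)  # inner grid inside the sharding codec

def locate(voxel):
    """Map a global voxel coordinate to (shard index, sub-chunk index within the shard)."""
    shard = tuple(v // s for v, s in zip(voxel, shard_shape))
    within = tuple(v % s for v, s in zip(voxel, shard_shape))
    sub_chunk = tuple(w // c for w, c in zip(within, sub_chunk_shape))
    return shard, sub_chunk

print(locate((1234, 0, 56789)))  # ((1, 0, 56), (2, 0, 7))
```

The point is that only the outer division (into shards) is visible to the store; the inner division is entirely the codec's business.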

While the sharding format as proposed in ZEP2 works fine as a storage transformer, in terms of the spec there are some advantages to making it a codec:

  • Previously, sharding had to "hijack" chunk_key_encoding, which is meant to specify the keys for chunks, to instead specify the keys for shards. At the same time, chunk_key_encoding was still used to create fake chunk keys for the individual chunks, which served only as internal identifiers within the implementation, since the sharding format does not store keys. With sharding as a codec, each shard is in fact a chunk, and there is no need for string keys for the sub-chunks.
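
A hedged sketch of why sub-chunks need no string keys: in a sharding_indexed shard, each sub-chunk is located by a numeric grid index into a binary index of (offset, nbytes) pairs, not by a chunk key. The layout details below (little-endian uint64 pairs, index appended at the end) are illustrative assumptions for this toy, not a normative description of the ZEP 2 format:

```python
import struct

def build_shard(encoded_chunks):
    """Concatenate encoded sub-chunks and append an (offset, nbytes) index."""
    body, index = b"", b""
    for chunk in encoded_chunks:
        index += struct.pack("<QQ", len(body), len(chunk))
        body += chunk
    return body + index

def read_sub_chunk(shard, n_chunks, i):
    """Fetch sub-chunk i using only its numeric index and a byte range."""
    index = shard[len(shard) - 16 * n_chunks:]
    offset, nbytes = struct.unpack_from("<QQ", index, 16 * i)
    return shard[offset:offset + nbytes]

shard = build_shard([b"aaa", b"bbbb", b"cc"])
print(read_sub_chunk(shard, 3, 1))  # b'bbbb'
```

In a real store, the two reads in `read_sub_chunk` would be byte-range requests against the shard's single key; no per-sub-chunk keys exist anywhere.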

The greater benefit to representing sharding as a codec is in enabling some interesting future extensions beyond the sharding format of ZEP2:

  • The NVIDIA nvcomp project is interested in a sharding format where the compression is specified within the shard format, and may differ across shards. With sharding as a codec, this is straightforward and very natural. With sharding as a storage transformer, this can be done but basically requires that we leave the codecs empty in the array metadata.

  • It is easy to imagine a sharding format where the sub-chunking of each shard is specified as part of a header/footer in the shard itself, and may differ across shards. That is impossible to support with a storage transformer, because the bottom-level chunk grid is fixed in the array metadata and cannot vary across shards.

There are a few things that are possible with storage transformers but are not possible with a codec:

  • Shards corresponding to non-rectangular regions of the array.
  • Shards encoded as more than one key on the underlying store.

However, it is not clear that there is a use case for those things, and it would still be possible to use a storage transformer as well.

@normanrz has updated zarrita to support zarr v3 with sharding as a codec: https://github.com/scalableminds/zarrita/blob/v3/zarrita/sharding.py

@joshmoore
Member

Primary outcome from the ZEP meeting discussion: for those interested in this issue, we should really work through the API (changes) for each of the main abstractions in Zarr V3.

@jstriebel jstriebel added the core-protocol-v3.0 Issue relates to the core protocol version 3.0 spec label Mar 13, 2023
@jstriebel
Member

@jbms @normanrz I just had a chat with @joshmoore, and we had one more argument for keeping sharding as a storage transformer: when sharding is a codec, the implementation must use partial reads to be able to read single chunks. This basically boils down to: are implementations assumed to use/support a byte-range API, or is that an optional optimization? To keep zarr as simple as possible I tend towards the latter, and would therefore argue for keeping sharding as a storage transformer.

In any case, I'd vote to keep storage transformers in ZEP 1, since we do have a working example in zarr-python, even if it turns out that sharding should be a codec in the end. If not, this discussion would certainly block v3 and ZEP 1, which otherwise is almost ready for voting.

@normanrz
Contributor

> When using sharding as a codec, the implementation must use partial reads to be able to read single chunks.

I wouldn't see it that way. Partial reads are implementation details that implementations can choose to implement. If they don't implement it, reading works the same as accessing a slice of a chunk in Zarr right now (i.e. read the full thing and make a virtual cut-out). For writing, it is like that, anyways. Shards essentially become chunks (with sub chunks). Of course, partial reads make sense and are recommended.
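
A minimal sketch of the fallback described here; the function names and the toy decoder are hypothetical. A store without byte-range support simply fetches the whole shard and slices it in memory, exactly as it would fetch a whole chunk today; partial reads are an optimization layered on top:

```python
def read_sub_chunk_simple(store, key, decode_shard, sub_chunk_index):
    shard_bytes = store[key]                  # full-object read: works on any store
    sub_chunks = decode_shard(shard_bytes)    # decode all sub-chunks in memory
    return sub_chunks[sub_chunk_index]        # virtual cut-out of the one we need

# Toy stand-ins: a dict "store" and a decoder that splits on b'|'.
store = {"c/0/0/0": b"one|two|three"}
decode = lambda b: b.split(b"|")
print(read_sub_chunk_simple(store, "c/0/0/0", decode, 2))  # b'three'
```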

> In any case, I'd vote to keep storage transformers in ZEP 1, since we do have a working example in zarr-python, even if it turns out that sharding should be a codec in the end. If not, this discussion would certainly block v3 and ZEP 1, which otherwise is almost ready for voting.

No objections. But I still think codec is the better abstraction for sharding.

@jbms
Contributor Author

jbms commented Mar 16, 2023

> When using sharding as a codec, the implementation must use partial reads to be able to read single chunks.

> I wouldn't see it that way. Partial reads are implementation details that implementations can choose to implement. If they don't implement it, reading works the same as accessing a slice of a chunk in Zarr right now (i.e. read the full thing and make a virtual cut-out). For writing, it is like that, anyways. Shards essentially become chunks (with sub chunks). Of course, partial reads make sense and are recommended.

Agreed. I'm also not sure I understand the concern being raised here. If sharding is through a storage transformer, then the underlying store still needs to support byte range reads in order to read just a single chunk within the shard; that requirement is the same regardless of whether sharding is through a storage transformer or a codec. The only difference is that if sharding is through a codec, then the codec interface must also support partial reads. But that difference is essentially just a matter of the internal organization of the code, rather than a difference in capabilities that must be implemented for each store.

> In any case, I'd vote to keep storage transformers in ZEP 1, since we do have a working example in zarr-python, even if it turns out that sharding should be a codec in the end. If not, this discussion would certainly block v3 and ZEP 1, which otherwise is almost ready for voting.

> No objections. But I still think codec is the better abstraction for sharding.

Also no objections from me.
