support zstd compressed CZI #600

iewchen · 2024-05-15T14:20:33Z

CZI has two zstd compression modes: zstd0 and zstd1.

In zstd0 mode, pixel data is compressed with zstd and stored in the
subblock as is.

zstd1 mode differs in that it prefixes the zstd-compressed data with a
header. This header is either 1 byte or 3 bytes long, and its first byte
gives its length. CZI may also use a trick called high/low byte packing:
before the data is compressed by zstd, the less significant bytes of the
16-bit pixels are packed into the first half of the image array and the
more significant bytes into the second half. This trick is used if:

- the header length is 3, and
- the second byte in the header is 1, and
- the lowest bit of the third byte is 1
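The header rules above can be sketched in C as follows. This is only an illustration, not the code in this PR: the function name `zstd1_parse_header` and its signature are hypothetical, and rejecting header lengths other than 1 or 3 is an assumption based on the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the OpenSlide implementation.
 * Parses the zstd1 header at the start of buf (len bytes available).
 * On success, returns true, stores the header length in *hdr_len and
 * whether high/low byte packing is in use in *hilo.  Returns false if
 * the buffer is too short or the header length is neither 1 nor 3
 * (treating other lengths as an error is an assumption). */
bool zstd1_parse_header(const uint8_t *buf, size_t len,
                        size_t *hdr_len, bool *hilo) {
  assert(buf != NULL && hdr_len != NULL && hilo != NULL);
  if (len < 1) {
    return false;
  }
  size_t n = buf[0];  /* first byte is the header length */
  if (n != 1 && n != 3) {
    return false;
  }
  if (len < n) {
    return false;     /* truncated header */
  }
  *hdr_len = n;
  /* packing is used only if all three conditions hold */
  *hilo = (n == 3 && buf[1] == 1 && (buf[2] & 1) != 0);
  return true;
}
```

The zstd-compressed payload would then start at `buf + *hdr_len`.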

High/low byte packing applies only to 16-bit grayscale images and 48-bit
color images, since those are the formats with 16-bit samples.
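Undoing the packing after zstd decompression amounts to interleaving the two halves of the buffer. A minimal sketch, assuming little-endian 16-bit samples in the output; the name `hilo_unpack` and its signature are hypothetical and not taken from this PR:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only; name and signature are assumptions.
 * src holds len bytes: the low bytes of len/2 16-bit samples, followed
 * by the high bytes of the same samples.  dst receives len bytes of
 * interleaved samples, low byte first (little-endian is assumed).
 * src and dst must not overlap.  Returns 0 on success, -1 if len is
 * odd. */
int hilo_unpack(uint8_t *dst, const uint8_t *src, size_t len) {
  assert(dst != NULL && src != NULL);
  if (len % 2 != 0) {
    return -1;
  }
  size_t half = len / 2;
  for (size_t i = 0; i < half; i++) {
    dst[2 * i] = src[i];             /* low byte from the first half */
    dst[2 * i + 1] = src[half + i];  /* high byte from the second half */
  }
  return 0;
}
```

For example, the packed bytes `34 78 12 56` unpack to `34 12 78 56`, i.e. the samples 0x1234 and 0x5678.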

openslide-bot · 2024-05-15T14:21:13Z

DCO signed off ✔️

All commits have been signed off. You have certified to the terms of the Developer Certificate of Origin, version 1.1. In particular, you certify that this contribution has not been developed using information obtained under a non-disclosure agreement or other license terms that forbid you from contributing it under the GNU Lesser General Public License, version 2.1.

While almost all CZI slide files contain SizeS in the XML metadata and an 'S' dimension in the dimension entry, the embedded SlidePreview CZI file is an exception: it is missing the Scene dimension in both the XML metadata and the dimension entry. The embedded SlidePreview is a valid CZI file, as Zeiss ZEN software can view it when it is extracted and saved as an individual file. The SlidePreview CZI files are the only 48-bit color images available, which makes them good candidates for testing zstd1 mode high/low byte packing. Signed-off-by: Wei Chen <chenw1@uthscsa.edu>

Rename the function for consistency with _openslide_inflate_buffer(). Don't bother checking the size of the first compressed frame, since there might be more than one, and libzstd should fail if there isn't enough output space. Do check that the decompressed data matches the expected length. Use int64_t arguments rather than ones with arch-dependent widths. Use g_try_malloc(). Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

bgilbert · 2024-05-21T06:18:20Z

I've done some refactoring and added tests and some error checking. While I think the refactored version is a net improvement, czi_read_raw() is somewhat unwieldy, and may need to be revisited when we get to JXR support.

This series seems clean enough to merge without squashing, so I've rebased to pick up some CI changes in main.

Please take a look! I think this is ready to land.

src/openslide-vendor-zeiss.c

iewchen · 2024-05-21T15:01:18Z

Thank you for the review!

I tested the latest commit. It works.

bgilbert · 2024-05-22T02:40:42Z

Great, thank you for the PR!

The next two PRs will probably take longer to land. I assume the JXR one will be pretty straightforward to review, but it may not be able to land right away because of the libjxr situation. The SIMD one will likely require substantial effort to review and test. If it makes sense to submit both in parallel, feel free, and otherwise I'd suggest submitting the JXR one next. I may not be able to review the SIMD one for a couple months or more.

iewchen force-pushed the zeiss-czi-zstd branch from eeaa47a to ad4d42d Compare May 16, 2024 20:56

iewchen and others added 12 commits May 21, 2024 01:13

add zstd to CI

71cf44d

Signed-off-by: Wei Chen <chenw1@uthscsa.edu>

meson: move zstd out of conditional dependency section

b9c6070

Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

README: add zstd to dependency list

05bf232

Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

synthetic: add zstd decoder

50f12eb

Decode raw big-endian ARGB pixels. Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

zeiss: rename zstd struct to match naming convention

b24b371

and add a redundant packed attribute. Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

zeiss: refactor zstd decompression

c7a7042

Use a single function to process both uncompressed and zstd images, rather than duplicating code. Clean up zstd1 header parsing and add some additional error checks. Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

zeiss: rearrange nscene parsing a bit

52ff67c

Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

misc/imhex/zeiss-czi: decode zstd1 header struct

dbb790f

Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

tests: add Zeiss zstd primary tests

edb6a0a

Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

bgilbert force-pushed the zeiss-czi-zstd branch from 22a9fca to edb6a0a Compare May 21, 2024 06:14

bgilbert merged commit 637b213 into openslide:main May 22, 2024
17 checks passed
