support zstd compressed CZI #600
DCO signed off ✔️ All commits have been signed off. You have certified to the terms of the Developer Certificate of Origin, version 1.1. In particular, you certify that this contribution has not been developed using information obtained under a non-disclosure agreement or other license terms that forbid you from contributing it under the GNU Lesser General Public License, version 2.1.
CZI has two zstd compression modes: zstd0 and zstd1. In zstd0 mode, pixel data is compressed with zstd and stored in the subblock as is. zstd1 mode differs in that it prefixes the zstd-compressed data with a header. This header is either 1 byte or 3 bytes long, and its first byte is its length.

CZI may use a trick called high-low byte packing, which stores the less significant byte of each 16-bit sample in the first half of the image array and the more significant byte in the second half, before the array is compressed with zstd. This trick is used if:
- the header length is 3, and
- the second byte in the header is 1, and
- the lowest bit of the third byte is 1

This trick only applies to 16-bit grayscale images and 48-bit color images.
Signed-off-by: Wei Chen <chenw1@uthscsa.edu>
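For reviewers, the header rules in the commit message above can be sketched in C roughly as follows. This is only an illustration, not the code in this PR: the function names `zstd1_parse_header` and `hilo_unpack` are hypothetical, the sketch assumes the unpacked samples should come out as little-endian 16-bit values (low byte first), and it runs on data that zstd has already decompressed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: parse the zstd1 header at the start of a subblock.
 * Returns the header length (1 or 3) so the caller can skip it, or 0 if
 * the header is malformed or truncated.  *hilo is set to true only when
 * the header length is 3, the second byte is 1, and the lowest bit of the
 * third byte is 1. */
static size_t zstd1_parse_header(const uint8_t *buf, size_t len, bool *hilo) {
  *hilo = false;
  if (len < 1) {
    return 0;
  }
  size_t hdr_len = buf[0];  /* the first byte is the header length */
  if ((hdr_len != 1 && hdr_len != 3) || hdr_len > len) {
    return 0;
  }
  if (hdr_len == 3 && buf[1] == 1) {
    *hilo = (buf[2] & 1) != 0;
  }
  return hdr_len;
}

/* Hypothetical helper: undo high-low byte packing on decompressed data.
 * src holds all low bytes in its first half and all high bytes in its
 * second half; dst receives interleaved samples, low byte first (an
 * assumption of this sketch).  Returns false if len is odd. */
static bool hilo_unpack(const uint8_t *src, uint8_t *dst, size_t len) {
  assert(src != dst);  /* in-place unpacking is not supported here */
  if (len % 2 != 0) {
    return false;
  }
  size_t half = len / 2;
  for (size_t i = 0; i < half; i++) {
    dst[2 * i] = src[i];             /* less significant byte */
    dst[2 * i + 1] = src[half + i];  /* more significant byte */
  }
  return true;
}
```

A caller would skip the returned header length, decompress the rest with zstd, and call `hilo_unpack` only when `hilo` is set. Because the unpack works on bytes across the whole buffer, the same loop covers 16-bit grayscale and 48-bit color data.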
While almost all CZI slide files contain SizeS in the XML metadata and an 'S' dimension in the dimension entry, an exception is the embedded SlidePreview CZI file, which is missing the Scene dimension in both the XML metadata and the dimension entry. The embedded SlidePreview is a valid CZI file, as Zeiss ZEN software can view it when it is extracted and saved as an individual file. The SlidePreview CZI files are the only 48-bit color images available, which makes them good candidates for testing zstd1-mode high-low byte packing. Signed-off-by: Wei Chen <chenw1@uthscsa.edu>
Signed-off-by: Wei Chen <chenw1@uthscsa.edu>
Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
Decode raw big-endian ARGB pixels. Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
Rename the function for consistency with _openslide_inflate_buffer(). Don't bother checking the size of the first compressed frame, since there might be more than one, and libzstd should fail if there isn't enough output space. Do check that the decompressed data matches the expected length. Use int64_t arguments rather than ones with arch-dependent widths. Use g_try_malloc(). Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
and add a redundant packed attribute. Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
Use a single function to process both uncompressed and zstd images, rather than duplicating code. Clean up zstd1 header parsing and add some additional error checks. Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
I've done some refactoring and added tests and some error checking. While I think the refactored version is a net improvement, This series seems clean enough to merge without squashing, so I've rebased to pick up some CI changes in Please take a look! I think this is ready to land.
Thank you for the review! I tested the latest commit. It works.
Great, thank you for the PR! The next two PRs will probably take longer to land. I assume the JXR one will be pretty straightforward to review, but it may not be able to land right away because of the libjxr situation. The SIMD one will likely require substantial effort to review and test. If it makes sense to submit both in parallel, feel free, and otherwise I'd suggest submitting the JXR one next. I may not be able to review the SIMD one for a couple months or more.