Large NDPI file fix #602

ntrahearn · 2024-05-15T15:26:18Z

Addresses issue #174 (and possibly others), regarding the reading of large NDPI files. The issue stems from the fact that classic TIFF (and by extension, NDPI) only supports values of up to 32 bits for tagged metadata. However, whole slide images are large, and it isn't uncommon for an NDPI file to exceed the 4 GiB that a 32-bit offset can address. This means that key metadata, such as the byte positions of image layers or their sizes in bytes, may be too large to store in a traditional TIFF IFD entry.

Currently OpenSlide relies on heuristics to determine the high bits of 64-bit addresses, and these fail in some cases. The heuristics are unnecessary, because NDPI actually stores the high bits of each tag's offset/value in 4-byte blocks immediately after the end of the IFD.

This fix modifies openslide-decode-tifflike.c to append these extra 4 bytes to each IFD entry's value/offset and, if necessary, modifies its type to LONG8.
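As a rough illustration of the widening step described above, here is a minimal Python sketch (not the C code in this PR). It assumes little-endian data, that the 12-byte entries and the block of 4-byte high words have already been read into separate buffers, and that an entry is promoted to LONG8 only when it is a single LONG whose high word is non-zero. The function name `widen_entries` and the promotion rule are hypothetical; the actual conditions used in openslide-decode-tifflike.c may differ.

```python
import struct

TIFF_LONG = 4    # 32-bit unsigned type code (classic TIFF)
TIFF_LONG8 = 16  # 64-bit unsigned type code (BigTIFF)


def widen_entries(entry_bytes, high_bytes):
    """Combine 12-byte IFD entries with one 4-byte high word per entry.

    Hypothetical helper: returns (tag, type, count, value_or_offset)
    tuples in which value_or_offset is a 64-bit quantity.
    """
    count = len(entry_bytes) // 12
    if len(entry_bytes) != count * 12 or len(high_bytes) != count * 4:
        raise ValueError("entry block and high-word block do not match")

    highs = struct.unpack("<%dI" % count, high_bytes)
    widened = []
    for i, high in enumerate(highs):
        # Classic IFD entry: tag (2), type (2), count (4), value/offset (4)
        tag, typ, n, low = struct.unpack_from("<HHII", entry_bytes, i * 12)
        value = (high << 32) | low
        # Assumption: only a single inline LONG needs its type changed,
        # because its value no longer fits in 32 bits.
        if typ == TIFF_LONG and n == 1 and high != 0:
            typ = TIFF_LONG8
        widened.append((tag, typ, n, value))
    return widened
```

Keeping the two buffers separate means the sketch makes no claim about where exactly the high words sit in the file; the caller is responsible for locating them.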

This fix also modifies openslide-vendor-hamamatsu.c to construct correct restart marker addresses. Currently only the values in TIFF tag 65426 are used, and these hold only the lower 32 bits of each address. The high bits are stored in TIFF tag 65432, so these are now combined with the low bits before mcu_starts are calculated.
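The combination of the two tags can be sketched in Python as follows. This is an illustrative sketch only, assuming both tags have already been decoded into equal-length lists of unsigned 32-bit integers; `combine_restart_offsets` is a hypothetical name, and the later step that turns these values into mcu_starts is not shown.

```python
def combine_restart_offsets(low_words, high_words):
    """Build 64-bit restart marker offsets from two 32-bit arrays.

    low_words:  values from TIFF tag 65426 (lower 32 bits)
    high_words: values from TIFF tag 65432 (upper 32 bits)
    """
    if len(low_words) != len(high_words):
        raise ValueError("low and high arrays must have the same length")
    # Shift each high word into bits 32..63 and OR in the low word.
    return [(high << 32) | low for low, high in zip(low_words, high_words)]
```

For example, a low word of `0xFFFFFFF0` with a high word of `1` yields `0x1FFFFFFF0`, an address just under 8 GiB that the low word alone cannot represent.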

openslide-bot · 2024-05-15T15:26:53Z

DCO signed off ✔️

All commits have been signed off. You have certified to the terms of the Developer Certificate of Origin, version 1.1. In particular, you certify that this contribution has not been developed using information obtained under a non-disclosure agreement or other license terms that forbid you from contributing it under the GNU Lesser General Public License, version 2.1.

Signed-off-by: Nick Trahearn <n.a.trahearn@gmail.com>
Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>

bgilbert · 2024-05-29T00:32:37Z

Looks good, thank you! I've rebased, updated to the current internal API, and fixed up some formatting. I have a few more cleanup commits coming, and will fix the tests.

ntrahearn mentioned this pull request May 15, 2024

Add support for NDPI images >4GB #276

Closed

bgilbert force-pushed the modern-os-fix branch from dc0d0c9 to e6c4a42 Compare May 28, 2024 22:48

bgilbert force-pushed the modern-os-fix branch from e6c4a42 to d8cbd06 Compare May 28, 2024 23:04

bgilbert linked an issue May 29, 2024 that may be closed by this pull request

Certain (very large) Hamamatsu NDPI files cannot be opened with OpenSlide #174

Open
