Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reworking LMSMetadata package and adding a new lif reader #4000

Open
wants to merge 91 commits into
base: develop
Choose a base branch
from

Conversation

XLEFReaderForBioformats
Copy link
Contributor

Main changes:

  • reworking metadata translation (and class architecture) in the LMSMetadata package, improving metadata translation especially for SP8 and STELLARIS images, also reworking metadata translation for other images (SP5, widefield)
  • Adding a new LMSLIFReader for reading .lif files that also uses the LMSMetadata package

In general, metadata translation will be an ongoing topic here, and there might also be images from systems that are not yet fully regarded and where metadata translation needs to be improved. I'll try to fix all major bugs that might be noticed in the context of this PR, but concerning minor bugs and "missing metadata" , I guess I have to discuss with my colleagues how much of that I will be able to fix within this PR.

@dgault as we have discussed before, since you suggested that we could use a DelegateReader here to provide both the existing LIFReader and the new LMSLIFReader to users, I think that's a reasonable approach, so this PR at least won't create a deterioration of existing .lif files reading.
I removed my temporal changes to readers.txt and LIFReader.isThisType(...) for testing so now the LIFReader is the standard reader for .lif files again. Could you let me know how using a delegate reader would look like so both lif readers can be used?

Please let me know if you need any additional information or explanations from my side.

…t allowed in prolog.

adapted error messaging
…h uses LMS metadata classes (pending discussion).

temporary changes to LIFReader.isThisType() and readers.txt to use LIFReader2 instead of LIFReader for testing
…sequential list to channel detector settings
…stinguishes between SPX and STELLARIS, fixing channelPrios for LIF/LOF
…instruments for images without hardware settings, empty timestamps, correct interleaving depending on image format
@XLEFReaderForBioformats
Copy link
Contributor Author

XLEFReaderForBioformats commented Mar 11, 2024

Hi @dgault ,

I am now ready to implement your suggested DelegateReader solution. You linked ND2Reader as an example, is this info still up to date? I found DelegateReader examples in JPEGReader and TiffDelegateReader which look comprehensible to me at first sight, in ND2Reader I couldn't find where the logic for using different readers would be. A clarification would be welcome. :)

edit: nvm, I oriented towards the TiffDelegateReader and it's working ^^ (not sure though if there are any methods missing in LIFDelegateReader that I should add)

@dgault
Copy link
Member

dgault commented Mar 12, 2024

The ND2 Reader example is out of date now as we removed the legacy ND2 reader in Bio-Formats 7.0.0. The older version can be seen here, but the other Delegate readers are going to be very similar. Looking quickly at the last commit and it looks to be correct.

I am going to mark this for inclusion with our nightly repository tests to give us an initial idea of the impact on existing filesets. I will update the PR once the tests have run to see the results.

@dgault dgault added the include label Mar 12, 2024
@XLEFReaderForBioformats
Copy link
Contributor Author

okay great, thanks for letting me know! 👍 excited for the results 😄

@dgault dgault removed the include label Mar 13, 2024
@dgault
Copy link
Member

dgault commented Mar 13, 2024

This PR was included in the CI last night, unfortunately a number of files threw exceptions at the setID stage so the full test suite didn't actually get to run. I have only had a quick look at the failures but a lot of them look to be NullPointerExceptions. Unfortunately the files that are failing are not ones that are public.

The full console output from the test run is at:
https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-test-folder/lastFailedBuild/console

@XLEFReaderForBioformats
Copy link
Contributor Author

hi @dgault, okay thanks for sharing, I'll try to fix all the issues where exceptions are thrown.

@XLEFReaderForBioformats
Copy link
Contributor Author

hi @dgault , I tried to address all the exceptions that happened in the tests, trying to increase the robustness against unexpected input.
However, I do not yet understand the root causes of these errors, which are images with data that the reader does not expect.
I understand that you cannot share these image data publicly, however is it maybe possible to make this image data available to me directly?

@NicoKiaru
Copy link
Contributor

NicoKiaru commented Mar 20, 2024

@XLEFReaderForBioformats I'm curious to know whether your request will be approved because I've had exactly the same issue in here:
#4092

This point has also been raised and answered in the image.sc forum:
https://forum.image.sc/t/pb-when-i-open-czi-stitching-image/87585/24

@XLEFReaderForBioformats
Copy link
Contributor Author

@NicoKiaru thanks for sharing, good to know...

@dgault I have a suggestion... I assume that the problems of the new reader with your test data arise from unexpected XML layout and not so much from the image binary data itself. Would it be an option for you to share only the image XML with me privately?

If you have only LIF files available, you could copy the XML using a hex editor, or alternatively you could e.g. use a LASX Offline to open LIF files and save them as e.g. XLEF/TIF files, and then it would suffice for me if you sent me the XLIF files, which can be found in metadata folders.
You can also obfuscate any image names, file paths and user-comments in the XML.

Having the image XML, I can create some "fake" image datasets with it for local testing.

Would this be an option?

…properties"), wrong image offsets are used

mapping of memory blocks to images found in XML via memory block ids, removing offsets of "invalid" memory blocks.
frame properties and other children of image nodes are not considered valid images
@XLEFReaderForBioformats
Copy link
Contributor Author

hi @dgault @melissalinkert,
could you possibly support me with the latest build errors? my changes built locally, and for some reason I cannot see the build logs of the failed builds (it only displays "this job failed")...

@dgault
Copy link
Member

dgault commented Apr 8, 2024

@XLEFReaderForBioformats, Im not sure why those jobs failed in that way but I re-triggered them again and they should now all be green

@XLEFReaderForBioformats
Copy link
Contributor Author

@dgault I'd like to ask if there are any updates regarding the test data? It would be helpful for me to know at least if you are planning to share any data with me or not, right now it is a bit unclear to me how to proceed with this PR.

@dgault
Copy link
Member

dgault commented May 1, 2024

Hi @XLEFReaderForBioformats, the next step will be to reinclude the PR in the nightly CI, testing the new reader rather than the legacy reader. If we can get a full run of the test suite then we will at least be able to identify the number of files which are causing failures and judge the scale of the impact. From there we can decide how best to proceed.

I will re-include the PR now so we can have the new reader in the nightly CI for the purposes of the testing.

@dgault dgault added the include label May 1, 2024
@dgault
Copy link
Member

dgault commented May 7, 2024

Hi @XLEFReaderForBioformats, as a follow up, the PR has been included in our test suite for a number of days and we are seeing failures for a number of files, though most are for the same exception. You can see the LIF failures here:
https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-test-folder/7512/consoleFull

   [testng] java.lang.IndexOutOfBoundsException: Index 751 out of bounds for length 13
   [testng] 	at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
   [testng] 	at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
   [testng] 	at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
   [testng] 	at java.base/java.util.Objects.checkIndex(Objects.java:385)
   [testng] 	at java.base/java.util.ArrayList.get(ArrayList.java:427)
   [testng] 	at loci.formats.in.LMSLIFReader.initFile(LMSLIFReader.java:461)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1480)
   [testng] 	at loci.formats.in.LIFDelegateReader.setId(LIFDelegateReader.java:37)
   [testng] 	at loci.formats.ImageReader.setId(ImageReader.java:864)

The failures seem to be caused by the following line when retrieving the offsets:
https://github.com/ome/bioformats/pull/4000/files#diff-79e103bcccaa0d842ec61c1bf11e438d3ace6607e364135a7e08b3947f2dec7cR461

The following sample files which failed are available for testing from here. These should be good samples to test for the IndexOutOfBoundsException

  • michael/PR2729_frameOrderCombinedScanTypes.lif
  • imagesc-30856/20191025 Test FRET 585. 423, 426.lif
  • seanwarren/150519_FRAP_test_ROIs_chromagreen/150519_FRAP_test_ROIs_chromagreen.lif

@dgault dgault removed the include label May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants