New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add option to ProjectImageEntry.readImageData() to avoid accessing the file image #1488
base: main
Are you sure you want to change the base?
Conversation
Thanks, this is an interesting idea and certainly seems to help in the scenario you describe. I'm apprehensive about merging quickly because it would be quite a significant change, and add complexity when we're trying to reduce it. So it will take some thought and I'd like to understand the problem better.
Can you explain why it takes so long? Huge numbers of images? Slow file format, or is it where the images are stored?
The This is because, when I rewrote object classifiers some years ago, I was thinking of future classifiers that will use deep learning models to classify based upon image patches - and not only measurements. That's why there is also a general This all basically works, we just haven't yet had time to wrap it up for wider use.
While not identical, the current You can then create a new So an alternative approach might be to try to script creating a classifier without needing to go through the UI, in a way that doesn't involve any big API changes - and which can be used when you can know in advance that the image doesn't need to be accessed. |
Thank you for taking your time into understanding our problem and coming up with possible alternative solutions! I absolutely get your notes on the classifier. I assumed that the ObjectClassifier would never need to read pixels because it currently only uses the detections' measurements. Surely the code must be changed in preparation of the advent of the
I am not sure about this, though. You often want to leverage the live-update feature when creating a classifier. That is one of the most handy feature when tweaking a classifier. If that option was removed, it would be unfortunate. However i think you missed a point. The major issue this PR wants to address is the ability to a script in batch as fast as possible (and when it is possible)
Just to be clear, this PR's only API change is adding in Now, I agree this interface may not be the best one as it could easily just be a checkbox option in the ScriptEditor. For now, though, i think it is enough to enjoy the benefits that this PR brings.
Basically All you the above. It accesses ~50 in this scenario, the fact that QuPath offer an highly-scriptable interface is many laboratories' luck. It means that a pipeline of scripts can be applied one after the other, resulting in a full analysis of the whole project. Sometimes, you may even want to tweak some parameters, and for this you have to re-run the whole pipeline in batch again. In the end, I really get you are wary of quickly merging this. I'm sure reducing code is a top priority in order to offer the best bug-free experience to QuPath's wide user-base. A base experience that can be extended by plug-ins and scripts outside of QuPath's code/responsibility. However, I feel like the ability to run scripts in QuPath is at the core of its streghts. It allows to offer an extensible basis on which downstream developers can then build upon. Thank you! |
f6182e1
to
837a0f2
Compare
With the latest commit I also added an option in the CLI interface to run the script for the whole project without accessing the image files. ./gradlew run --args="script -p '/home/castoldi/426FC/project.qpproj' -n -c 'import qupath.imagej.tools.IJTools; println IJTools.convertToImagePlus(getCurrentServer(), RegionRequest.createInstance(getCurrentServer(), 16))'" output:
|
Rough draft at an alternative approach to address the issue raised in qupath#1488
I think I do get the point, but want to ensure it's clear what exactly should be solved here, as I suspect there are alternative approached to consider. For example, I quickly drafted a rough alternative at #1489 This simply delays loading images until the It probably has disadvantages too, as calling code needs to be more careful not to request the server at all (even for metadata), to avoid triggering the image to be loaded. I'm not sure which is best, but we should go with the most maintainable solution that solves the main problem. |
I really like your draft. It seems the most beautiful, design-wise. The I made a small comment on your draft PR about the metadata, as i feel that is an important info to access offline. But in the end, if that was sorted out, that solution would be a drop-in replacement to mine. As you said, at last it will come down to which one is the most maintainable. I see pros and cons in both: Ultimately, I think your solution is better maintainable-wise, granted that a few things are managed:
|
…e file image This improves performance when running a script on a project-wide as well as when creating an object classifier
…ccessing the image files
837a0f2
to
525112d
Compare
WHAT
This PR improves performance when running a script that does not need to access the image files on multiple images.
Additionally, allowed to modify
ObjectClassifierCommand
too so that it can read all detections' measurements in the training set without uselessly reading the image files.This last change alone allowed, on my projects, to improve the time when creating an object classifier from ~10/15minutes to ~5seconds.
I feel like this change is useful when a laboratory works with large projects with image locations being on a remote server. This is also possible thanks to QuPath's amazing design choice to never directly modify images.
If the scripts being run wants to access the images' pixels, it gracefully halts the execution of the all the following project entries too.
example:
"Run for project (without saving and opening)":
HOW
Essentially this works by creating a
ImageServerStub
that extendsAbstractImageServer
. It retrieves metadata from the ProjectImageEntry itself (which in turn, i think, it gets them from the.qpproj
file) and fails whenreadRegion()
is being called. Additionally, it does not provide a server builder. This way, if the resulting image data are to be saved, the original ImageServer won't be overwritten/lost.You can now pass a
openImage
boolean toProjectImageEntry.readImageData()
that, when false, just avoids getting the default image server, but just uses an instance ofImageServerStub
.When running a
ProjectTask
, it will catch whether the script tried to access the image file. If it did, it stops the execution for the current image and all the following in the queue.Minor proposal
Finally, i'd like to discuss whether we could initially run all scripts with
ImageServerStub
by default and, only if they fail because they need to read the image files, run them with the correct ImageServer.This point, however, is not really important and can be addressed in a future PR as well.