imageset: should do nothing but get the raw data #1023
Replies: 3 comments
-
As I see it, the intention and the reality have probably become confused as the models evolved over time. My interpretation is that [...]

What actually happens is that, on ExperimentList load, it sets the models on the imageset from each experiment as a way to avoid loading them from file. Then, when accessing imageset models, it uses those models instead of re-reading the file metadata to try to recreate the model information. The fact that it tries to re-read the file for empty models is a bug, and it is the root cause of a lot of the stills performance issues, which was hacked around with the [...] Clearly the [...]

As for the right design, I think you are right: the ImageSet should care only about processing pixels (masks, gains and images). We do, however, need an interface to the file, one that maps to multiple [...]

There's no reason, even with weird XFEL data, to read the metadata from a file after import, right? I think [...]
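To make the failure mode above concrete, here is a minimal toy sketch. It is not dxtbx code: `ToyFile`, `ToyImageSet` and `count_reads` are hypothetical names invented for illustration, and the only assumption is the behaviour described in this comment, namely that a cached model which is empty (`None`) is treated as "not loaded yet" and causes the file metadata to be read again.

```python
from typing import Dict, Optional


class ToyFile:
    """Hypothetical stand-in for an image file; counts metadata reads."""

    def __init__(self) -> None:
        self.metadata_reads = 0

    def read_detector(self) -> Optional[str]:
        # Every call represents one (expensive) hit on the file metadata.
        self.metadata_reads += 1
        return None  # assumption: this file genuinely has no detector model


class ToyImageSet:
    """Hypothetical imageset that caches one model per image index."""

    def __init__(self, source: ToyFile) -> None:
        self._source = source
        self._detectors: Dict[int, Optional[str]] = {}

    def set_detector(self, model: Optional[str], index: int = 0) -> None:
        # Mirrors "set the models on the imageset on ExperimentList load".
        self._detectors[index] = model

    def get_detector(self, index: int = 0) -> Optional[str]:
        model = self._detectors.get(index)
        if model is None:
            # The problematic path: an empty model cannot be told apart
            # from "never loaded", so the file is read again on each access.
            model = self._source.read_detector()
        return model


def count_reads(cached_model: Optional[str], accesses: int = 3) -> int:
    """Return how many metadata reads `accesses` model lookups trigger."""
    source = ToyFile()
    imageset = ToyImageSet(source)
    imageset.set_detector(cached_model)
    for _ in range(accesses):
        imageset.get_detector()
    return source.metadata_reads
```

With a real cached model, `count_reads("detector")` stays at zero reads; with an empty one, `count_reads(None)` hits the file once per access, which is the pattern that hurts when there are many stills.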
-
FWIW, the intent of stills process is to only read the metadata once. It takes advantage of LazyImageSet to avoid reading the models during import; it only reads them once experiments have been distributed to worker nodes. This part of it is highly performant. Of course, dials.import with stills can't use this feature, which slows it down a lot since it can only run as a single process. I know there is work on this in other pull requests; I haven't gotten to it yet, but I'm very excited about it :)

I generally agree though. I wanted to add an 'initialized' flag of some sort to Experiment to prevent multiple hits on the raw data, but I think it didn't make sense at the time. With that though, the cached models on ImageSet could be dropped.
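A rough sketch of the 'initialized' flag idea, under stated assumptions: `ToyExperiment` and `make_counting_loader` are hypothetical names, the loader callable stands in for whatever reads the models from the raw data, and nothing here reflects how Experiment or LazyImageSet are actually implemented. The point is only that an explicit flag, rather than "is the model empty?", decides whether the file gets read.

```python
from typing import Callable, Dict, Optional, Tuple

Models = Dict[str, Optional[str]]


class ToyExperiment:
    """Hypothetical experiment that loads its models lazily, exactly once."""

    def __init__(self, load_models: Callable[[], Models]) -> None:
        self._load_models = load_models
        self._models: Models = {}
        self.initialized = False  # the flag proposed above

    def models(self) -> Models:
        if not self.initialized:
            # First access (e.g. on a worker node): read the file once.
            self._models = self._load_models()
            self.initialized = True
        # Later accesses never touch the file, even if a model is None.
        return self._models


def make_counting_loader() -> Tuple[Callable[[], Models], Dict[str, int]]:
    """Build a fake loader plus a counter recording how often it ran."""
    calls = {"n": 0}

    def load() -> Models:
        calls["n"] += 1
        # Assumption for illustration: the file has no detector model.
        return {"beam": "beam-model", "detector": None}

    return load, calls
```

Usage: after `load, calls = make_counting_loader()` and `exp = ToyExperiment(load)`, nothing is read at construction time, and calling `exp.models()` repeatedly leaves `calls["n"]` at 1 even though the detector entry is empty.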
-
This issue has been automatically marked as stale because it has not had recent activity. The label will be removed automatically if any activity occurs. Thank you for your contributions.
-
At the moment the image set and experiments both keep track of the same models, or multiple copies of the same model, which is at best a source of complete confusion.
I propose we refactor things such that the imageset does exactly one thing: give pixel data. The experiment models handle everything else.
I welcome other viewpoints.
This is currently holding back development.
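For discussion, the proposed split might look roughly like the sketch below. All names (`PixelSource`, `InMemoryPixels`, `ToyExperiment`) are hypothetical and this is not the existing dxtbx interface; it only illustrates an imageset-like object whose sole job is to hand back pixel data, with the models living on the experiment.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol

Pixels = List[List[int]]  # toy 2D image; real data would be array-like


class PixelSource(Protocol):
    """Hypothetical minimal imageset interface: pixel data and nothing else."""

    def __len__(self) -> int: ...

    def get_raw_data(self, index: int) -> Pixels: ...


class InMemoryPixels:
    """Toy pixel source backed by a list of frames instead of a file."""

    def __init__(self, frames: List[Pixels]) -> None:
        self._frames = frames

    def __len__(self) -> int:
        return len(self._frames)

    def get_raw_data(self, index: int) -> Pixels:
        return self._frames[index]


@dataclass
class ToyExperiment:
    """Hypothetical experiment: the single owner of every model."""

    imageset: PixelSource
    beam: Optional[str] = None
    detector: Optional[str] = None
    goniometer: Optional[str] = None  # legitimately None for stills
    scan: Optional[str] = None
```

In this arrangement there is only one copy of each model, so a missing goniometer on a still is just `None` on the experiment and cannot trigger a second look at the file.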