
Population module memory-related crash #1650

Open · 10 tasks
MDijsselhof opened this issue Mar 15, 2024 · 1 comment

Labels: bug (Something isn't working)

MDijsselhof (Contributor) commented Mar 15, 2024

Description

The population module loads images for all subjects but crashes if there are too many subjects and/or not enough memory available.

It crashes using 2 CPUs with 2 GB each (I deliberately allocated very little memory to trigger the crash quickly). The crash occurs at 51%, in xASL_wrp_CreatePopulationTemplates, starting from line 418.

Points by Henk

  • MATLAB is notoriously bad at memory management, so we have to explicitly clear variables, even if we implement the changes below (afaik).
  • I agree with your points below. I would indeed revamp xASL_wrp_CreatePopulationTemplates, but keep a copy of the old version in our active ExploreASL, commented out. This has the benefit that it will be updated along with our general updates (nomenclature/variables/BIDS, etc.).
  • This old version has several more options than only choosing the calculation type, e.g., smoothing and left-right mirroring; I am not sure it is easy to keep all these options, but I would like to.
  • Other calculations I use this function for are the voxel-wise minimum intensity projection (MinIP) and maximum intensity projection (MIP) across the population. These can be much more useful than the mean, or even the SD, for seeing things that everyone has but at different locations, e.g., vascular artifacts.
  • The previous point is an example of a use case you probably hadn't thought of, and an indication that Function2Use will remain a useful feature (I am sure I will want it to test other ideas, and I consider this group-wise processing a huge plus of our pipeline, so for me it is a good reason to keep a copy of the old function). Or could we keep Function2Use as a different sub-function inside this function? That way we would have two sub-functions for loading data (the default one sums; the other loads all volumes into memory) and similarly two for the calculation. (Or perhaps not even sub-functions, but code sections, using a boolean to switch between Function2Use and the hardcoded mean+SD+MIP+MinIP.)
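The voxel-wise MinIP/MIP mentioned above fits the same streaming pattern as the proposed mean/SD revamp: keep only the two template volumes in memory and update them one subject at a time. A minimal sketch in Python (ExploreASL itself is MATLAB; `load_volume`, `population_mip`, and the flat-list image representation are invented for illustration):

```python
def population_mip(load_volume, subject_list):
    """Voxel-wise max (MIP) and min (MinIP) across a population, one volume
    at a time. load_volume is assumed to return one subject's image as a
    flat list of voxel intensities; only the two templates stay in memory."""
    mip = None    # running voxel-wise maximum
    minip = None  # running voxel-wise minimum
    for subject in subject_list:
        vol = load_volume(subject)
        if mip is None:
            mip, minip = list(vol), list(vol)  # first subject initializes both
        else:
            for i, v in enumerate(vol):
                if v > mip[i]:
                    mip[i] = v
                if v < minip[i]:
                    minip[i] = v
    return mip, minip
```

Because max and min are associative, the result is identical to loading all subjects at once, so this calculation loses nothing in a single-pass design.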

Explanation for smoothing: you have lesion maps per subject, where lesions are often small spots (e.g., microbleeds or microinfarcts).

  • The smoothing can be useful for creating lesion heat maps, and the mirroring for pathology that always occurs unilaterally (e.g., I used this for creating patient-group average vascular territories, depending on the amount of stenosis, which is always unilateral but sometimes left and sometimes right).
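The mirroring idea can be sketched as averaging each volume with its left-right flipped copy, so unilateral findings accumulate on one side of a group template regardless of which hemisphere they occurred in. A hypothetical Python illustration (not ExploreASL code; the assumption that the first array axis is the left-right axis, and the name `mirror_average`, are mine):

```python
def mirror_average(vol):
    """Symmetrize a volume by averaging it with its left-right mirror.
    vol is a nested list; axis 0 is assumed to be the left-right axis."""
    mirrored = vol[::-1]  # flip along the left-right axis
    return [[(a + b) / 2 for a, b in zip(row, mrow)]
            for row, mrow in zip(vol, mirrored)]
```

This is a per-subject operation, so it composes naturally with one-by-one loading: mirror each volume right after loading it, then add it to the running template.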

Discussion Henk+Jan:

We will revamp this to avoid loading all NIfTIs into RAM:

  1. We will load the NIfTIs one by one and add them to the corresponding templates (SNR, SD, mean). Only the templates will stay in memory.
  2. Memory mapping and IM2Column will not be used, as they only accelerate things when data is used multiple times, which in most cases we do not do.
  3. We will add a second-pass option (OFF by default) that removes outliers.
  4. We will only do SD, mean, etc., but not median, as that is not possible with a single pass (HENK: we only used the median for outlier removal, so this won't change much).
  5. We will keep all the computation tricks as they are; only the order of reading and the memory management will change.
  6. We will also create per-session templates and per-group templates.
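The single-pass mean+SD scheme described above (and the reason the median is excluded) can be sketched as follows. This is a Python illustration with invented names, not ExploreASL's MATLAB code: it accumulates a voxel-wise sum and sum of squares, so only the accumulator templates stay in memory, whereas a median would require all volumes at once.

```python
import math

def streaming_mean_sd(load_volume, subject_list):
    """Single-pass voxel-wise mean and population SD. load_volume is assumed
    to return one subject's image as a flat list of voxel intensities."""
    n = 0
    total = None     # voxel-wise running sum
    total_sq = None  # voxel-wise running sum of squares
    for subject in subject_list:
        vol = load_volume(subject)
        if total is None:
            total = [0.0] * len(vol)
            total_sq = [0.0] * len(vol)
        for i, v in enumerate(vol):
            total[i] += v
            total_sq[i] += v * v
        n += 1
    mean = [t / n for t in total]
    # max(..., 0.0) guards against tiny negative values from rounding
    sd = [math.sqrt(max(tq / n - m * m, 0.0))
          for tq, m in zip(total_sq, mean)]
    return mean, sd
```

One caveat worth noting: the sum-of-squares formula can lose precision when the SD is small relative to the mean; Welford's online algorithm is the numerically stabler single-pass alternative if that becomes an issue.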

Tasks proposed by Jan

Changes to be made only inside xASL_wrp_CreatePopulationTemplates

  • When going through subjects one by one, we will only be able to calculate mean+SD, not median. And since we never calculate anything else, I propose to keep only those two defaults and remove the "FunctionsAre" parameter, as it makes things much more complicated. So remove all this and hard-code mean+SD.
    HENK: That is not really nice, as it removes the possibility to do voxel-wise stats other than mean+SD. Can we keep it, but default to hard-coded mean+SD?
    Note that this wrapper also has more possibilities, like smoothing and mirroring. Perhaps it is easier to revamp just a subfunction of this function, per your suggestion: keep the old subfunction as an option (and thus keep smoothing, mirroring, median, etc.), but default to the new subfunction.
  • Line 236: remove the calculation of 'mrc*'.
    HENK: I would comment it out, not remove it. It could still be useful for some users as QC, to see what the modulation does. But I agree that we don't really use it, so we can comment it out.
  • Remove xASL_im_Column2IM from everywhere (in xASL_wrp_CreatePopulationTemplates).
    HENK: As above, if we can keep the old method in an old subfunction but have the new hardcoded subfunction used by default, that would have my preference.
  • Lines 414-440: only load an image; skip masking (it will be done before averaging) and skip im2column.
    HENK: But wouldn't this be tricky if you want to smooth per volume?
  • Line 452: How are outliers calculated - per subject or across the population? It seems they are calculated per subject and added per subject to a population mask, so we might be able to calculate this in the first pass.
    HENK: Population-wise, not per subject. This is about voxel-wise outliers across the population, not spatial outliers (those are handled in the ASL module).
  • Function xASL_wrp_CreatePopulationTemplates_Computation: create a copy split into a saving part and a calculation part, and replace it in the main function body.
  • In the main function body, make sure that mean+SD are calculated per subject and saved to a template. That way we will use minimal memory, even for a large population. This still doesn't fix the issue with xASL_wrp_CreatePopulationTemplates4Sets.
    HENK: By "are calculated per subject" you mean "summed subject-wise", right?
  • xASL_wrp_CreatePopulationTemplates4Sets: split into an init function that is run only once per ScanType and creates all types of templates.
  • xASL_wrp_CreatePopulationTemplates4Sets: split into a process function that adds one subject to each of these templates.
  • xASL_wrp_CreatePopulationTemplates4Sets: split into a saving function that saves all these templates.
    HENK: For xASL_wrp_CreatePopulationTemplates4Sets, can we not simplify by:
  1. having a (sub)function that defines which subject-sessions to calculate mean+SD for (defaulting to all), and
  2. loading & summing?
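The init/process/save split proposed above for xASL_wrp_CreatePopulationTemplates4Sets could look roughly like the following. This is a hypothetical Python sketch of the structure only (ExploreASL is MATLAB; all names are invented, and the "finalize" step here is just the mean): each stage touches at most one subject, so memory stays bounded by the number of templates rather than the number of subjects.

```python
def templates_init(n_voxels, template_names):
    """Run once per ScanType: create empty accumulators for every template."""
    return {name: {"sum": [0.0] * n_voxels, "n": 0} for name in template_names}

def templates_add_subject(templates, vol):
    """Run once per subject: add this subject's volume to every template."""
    for acc in templates.values():
        for i, v in enumerate(vol):
            acc["sum"][i] += v
        acc["n"] += 1

def templates_save(templates, save_fn):
    """Run once at the end: finalize (here, the voxel-wise mean) and save."""
    for name, acc in templates.items():
        save_fn(name, [s / acc["n"] for s in acc["sum"]])
```

This also answers Henk's simplification: step 1 above corresponds to choosing `template_names` and the subject-session list passed to the process stage, and step 2 is the loading & summing in `templates_add_subject`.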

How to test

Optional: insert description about how to test the code changes here

Release notes

Required: summarize the changes for the release notes here

MDijsselhof added the bug (Something isn't working) label on Mar 15, 2024
MDijsselhof added this to the Release 1.12.0 milestone on Mar 15, 2024
jan-petr (Contributor) commented Mar 24, 2024

Discussion Henk+Jan:

We need to revamp this to avoid loading everything into memory.

  1. We will load the NIfTIs one by one and add them to the corresponding templates (SNR, SD, mean). Only the templates will stay in memory.
  2. Memory mapping and IM2Column will not be used, as they only accelerate things when data is used multiple times, which in most cases we do not do.
  3. We will add a second-pass option (OFF by default) that removes outliers.
  4. We will only do SD, mean, etc., but not median, as that is not possible with a single pass (HENK: we only used the median for outlier removal, so this won't change much).
  5. We will keep all the computation tricks as they are; only the order of reading and the memory management will change.
  6. We will also create per-session templates and per-group templates.

Labels: bug (Something isn't working)
No branches or pull requests · 3 participants