High memory consumption upon instantiation of DataLoader with GridSampler #892
-
Hi, I am running into an issue with my current data loader. I am implementing a patch-based data pipeline for training on medical images, and I want to make it future-proof for the subsequent datasets I will obtain throughout my project. Therefore, I use the following in both training and validation:

```python
for subject_ in subjects_dataset:
    dataset = GridSampler(
        subject=subject_,
        patch_size=self.args.patch_size,
        patch_overlap=self.args.patch_overlap,
    )
    # one sampler, two aggregators and one loader are created and stored per subject
    getattr(self, f"subjects_{str(stage)}").aggregator['pred'].append(GridAggregator(dataset))
    getattr(self, f"subjects_{str(stage)}").aggregator['target'].append(GridAggregator(dataset))
    dataloader.append(DataLoader(
        dataset=dataset,
        batch_size=self.args.batch_size,
        num_workers=self.args.num_workers,
        pin_memory=self.args.pin_memory,
    ))
```

Running this code, data loading takes around 8 GB of RAM for 20 subjects, which will exceed the memory available on our machine once I scale up to the 100 samples I intend to use for testing. I tried modifying several of the loader parameters, but none of these alterations reduced the memory consumption significantly. Subsequently, I tried to implement a SequentialLoader class that builds the sampler and loader lazily, one subject at a time:
```python
class SequentialLoader:
    def __init__(self, subjects_dataset, patch_size, patch_overlap,
                 batch_size, num_workers, pin_memory):
        self.subjects_dataset = subjects_dataset
        self.patch_size = patch_size
        self.patch_overlap = patch_overlap
        self.batch_size = batch_size
        self.pin_memory = pin_memory
        self.num_workers = num_workers
        self.num_patches = None

    def __len__(self):
        return self.num_patches

    def __iter__(self):
        for subj in self.subjects_dataset:
            dataset = GridSampler(
                subject=subj,
                patch_size=self.patch_size,
                patch_overlap=self.patch_overlap,
            )
            dataloader = DataLoader(
                dataset=dataset,
                batch_size=self.batch_size,
                num_workers=self.num_workers,
                pin_memory=self.pin_memory,
            )
            # only known once the first sampler has been created
            self.num_patches = len(dataset) * len(self.subjects_dataset)
            yield from dataloader
```

but the initial estimation of the number of patches fails: `__len__` returns `num_patches`, which is only set once iteration has started.

My final question is: is there a better way to load the samples into RAM only when I need them (lazy loading), to reduce the memory footprint, i.e. when accessing

```python
inputs = batch['image'][DATA]
targets = batch['label'][DATA]
```

as described in the TorchIO documentation?
Or am I doing something severely wrong? Thank you in advance for your help; hopefully I have included enough information about my problem. Cheers,
-
Hi, @nicoloesch. This comes terribly late, apologies. I'm trying to go through all unanswered questions.
I don't understand why you'd want a data loader per subject. I think this might be what's causing the issue.
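For patch-based inference, the pattern in the TorchIO documentation creates the GridSampler, DataLoader, and GridAggregator for a single subject at a time, inside the loop, so only one subject's data is ever resident in memory. A minimal sketch, where `model`, `patch_size`, `patch_overlap`, and `batch_size` are placeholders:

```python
import torch
import torchio as tio

for subject in subjects_dataset:
    # sampler, loader and aggregator live only for the current subject
    sampler = tio.GridSampler(subject, patch_size, patch_overlap)
    loader = torch.utils.data.DataLoader(sampler, batch_size=batch_size)
    aggregator = tio.inference.GridAggregator(sampler)
    with torch.no_grad():
        for batch in loader:
            inputs = batch['image'][tio.DATA]
            outputs = model(inputs)
            aggregator.add_batch(outputs, batch[tio.LOCATION])
    prediction = aggregator.get_output_tensor()
    # use `prediction` here; the sampler, loader and aggregator then go
    # out of scope and their memory can be reclaimed
```

For training, `tio.Queue` is the intended tool: it extracts patches from a bounded number of subjects at a time instead of keeping all of them in RAM. Another sketch, with `max_length` and `samples_per_volume` chosen arbitrarily:

```python
queue = tio.Queue(
    subjects_dataset,
    max_length=100,         # upper bound on patches held in RAM
    samples_per_volume=10,  # patches extracted from each loaded subject
    sampler=tio.UniformSampler(patch_size),
    num_workers=num_workers,
)
# the Queue handles multiprocessing itself, so the loader keeps the
# default num_workers=0
patches_loader = torch.utils.data.DataLoader(queue, batch_size=batch_size)
```

With either setup, per-subject loaders and aggregators never accumulate in lists, which should keep the memory footprint roughly constant in the number of subjects.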