AttributeError: Can't pickle local object 'SliceDataset.__init__.<locals>.<lambda>' #335

Open
King-pand opened this issue Mar 24, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@King-pand

When I set the number of GPUs to more than 1, the following error occurs. How can I fix this bug?
Traceback (most recent call last):
File "/home/23-panjiawei/MRI_code/unet_knee_sc_leaderboard.py", line 195, in <module>
run_cli()
File "/home/23-panjiawei/MRI_code/unet_knee_sc_leaderboard.py", line 190, in run_cli
cli_main(args)
File "/home/23-panjiawei/MRI_code/unet_knee_sc_leaderboard.py", line 67, in cli_main
trainer.fit(model, datamodule=data_module)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit
call._call_and_handle_interrupt(
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/multiprocessing.py", line 113, in launch
mp.start_processes(
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
while not context.join():
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 2 terminated with the following error:
Traceback (most recent call last):
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/multiprocessing.py", line 139, in _wrapping_function
results = function(*args, **kwargs)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1112, in _run
results = self._run_stage()
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1191, in _run_stage
self._run_train()
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1204, in _run_train
self._run_sanity_check()
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1276, in _run_sanity_check
val_loop.run()
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 152, in advance
dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 194, in run
self.on_run_start(*args, **kwargs)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 84, in on_run_start
self._data_fetcher = iter(data_fetcher)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 178, in __iter__
self.dataloader_iter = iter(self.dataloader)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 444, in __iter__
return self._get_iterator()
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 390, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1077, in __init__
w.start()
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/home/23-panjiawei/anaconda3/envs/pytorch/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'SliceDataset.__init__.<locals>.<lambda>'

  • [1] Check for duplicate issues.
  • [2] Provide a simple example for how to reproduce the bug.
  • [3] If applicable, include full error messages/tracebacks.
@King-pand King-pand added the bug Something isn't working label Mar 24, 2024
@mmuckley
Contributor

mmuckley commented Apr 9, 2024

Hello @King-pand, I haven't been able to reproduce this. It looks like there's an issue with your code in terms of how you set up DistributedDataParallel, perhaps due to a cluster misconfiguration.

@JianliWei1995

JianliWei1995 commented Apr 10, 2024

Hi @King-pand, I got the same error. It happens because pickle cannot serialize the lambda function defined in the __init__ of SliceDataset in mri_data.py. I fixed it by replacing self.raw_sample_filter = lambda raw_sample: True with an instance of a callable class, as follows:

class filter_raw_sample():
    # Picklable stand-in for the lambda: accepts every raw sample.
    def __call__(self, raw_sample):
        return True

This class should be defined at module level, ahead of SliceDataset, so that its instances can be pickled.
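
The fix above can be checked without PyTorch. The following is a minimal sketch using only the standard library; Holder and FilterRawSample are hypothetical stand-ins and are not the actual fastMRI classes. It shows that an object holding a lambda created inside __init__ cannot be pickled, while the same object holding an instance of a module-level callable class can.

```python
import pickle


class FilterRawSample:
    """Module-level callable; pickle stores a reference to the class by name."""

    def __call__(self, raw_sample):
        return True


class Holder:
    """Hypothetical stand-in for SliceDataset (not the fastMRI class)."""

    def __init__(self, raw_sample_filter=None):
        if raw_sample_filter is None:
            # A lambda created here is a local object, so pickle cannot
            # look it up by its qualified name when serializing.
            raw_sample_filter = lambda raw_sample: True
        self.raw_sample_filter = raw_sample_filter


def can_pickle(obj):
    """Return True if pickle.dumps succeeds on obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exception type raised for local objects differs between Python versions.
        return False


lambda_ok = can_pickle(Holder())
callable_ok = can_pickle(Holder(FilterRawSample()))
print("default lambda picklable:", lambda_ok)
print("callable class picklable:", callable_ok)
```

A plain module-level function (def accept_all(raw_sample): return True) would work equally well, since pickle also serializes functions by reference to their module and name.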

@Tracy-git

Maybe you can set num_workers=0, i.e. change dataloader = torch.utils.data.DataLoader(dataset, num_workers=4) to dataloader = torch.utils.data.DataLoader(dataset, num_workers=0).
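
With num_workers=0 the data is loaded in the main process, so no worker process is started and the dataset is never pickled; the trade-off is slower data loading. As a rough sketch of how one might choose between the two automatically, the hypothetical helper below (pick_num_workers is not part of PyTorch or fastMRI) falls back to 0 workers when the dataset cannot be pickled. It assumes the dataset is cheap to serialize; for a large in-memory dataset the trial pickle.dumps call could itself be expensive.

```python
import pickle


def pick_num_workers(dataset, requested):
    """Hypothetical helper: return `requested` if `dataset` pickles, else 0."""
    try:
        pickle.dumps(dataset)
    except Exception:
        # Any serialization failure means spawned workers could not receive
        # the dataset, so fall back to loading in the main process.
        return 0
    return requested


class LambdaDataset:
    """Stand-in dataset that stores a lambda, like the failing case."""

    def __init__(self):
        self.raw_sample_filter = lambda raw_sample: True


class PlainDataset:
    """Stand-in dataset with only picklable attributes."""

    def __init__(self):
        self.items = [1, 2, 3]


workers_for_lambda = pick_num_workers(LambdaDataset(), 4)
workers_for_plain = pick_num_workers(PlainDataset(), 4)
print(workers_for_lambda, workers_for_plain)
```

The returned value would then be passed as the num_workers argument of torch.utils.data.DataLoader.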
