I'm trying to test the 256 ImageNet model on the deblurring task using the OOD data you provide in your adjacent repository. I'm getting this error:
ERROR - main.py - 2022-07-25 10:25:13,026 - Traceback (most recent call last):
File "/Users/mbejan/Documents/diffusion/ddrm/main.py", line 164, in main
runner.sample()
File "/Users/mbejan/Documents/diffusion/ddrm/runners/diffusion.py", line 161, in sample
self.sample_sequence(model, cls_fn)
File "/Users/mbejan/Documents/diffusion/ddrm/runners/diffusion.py", line 249, in sample_sequence
for x_orig, classes in pbar:
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/site-packages/tqdm/std.py", line 1195, in __iter__
for obj in iterable:
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 438, in __iter__
return self._get_iterator()
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 384, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1048, in __init__
w.start()
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/Users/mbejan/opt/anaconda3/envs/ddrm/lib/python3.10/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'Diffusion.sample_sequence.<locals>.seed_worker'
This is the script that creates the behaviour from above:
#18 is related. I also had the same error. Adding global seed_worker to Diffusion.sample_sequence in diffusion.py fails to resolve the issue and produces a different error instead:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\shaw\Anaconda3\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\shaw\Anaconda3\lib\multiprocessing\spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'seed_worker' on <module 'runners.diffusion' from 'C:\\Users\\shaw\\Documents\\Year 2\\Diffusion Models\\ddrm\\runners\\diffusion.py'>
The reason (in my case) is that when running on Windows the multiprocessing module uses spawn and so one must (according to docs):
Wrap most of your main script’s code within an if __name__ == '__main__': block, to make sure it doesn’t run again (most likely generating an error) when each worker process is launched. You can place your dataset and DataLoader instance creation logic here, as it doesn’t need to be re-executed in workers.
Make sure that any custom collate_fn, worker_init_fn or dataset code is declared as top level definitions, outside of the main check. This ensures that they are available in worker processes. (this is needed since functions are pickled as references only, not bytecode.)
It is difficult to implement this advice since the seed_worker function needs access to the input args coming from the config file.
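One possible way around this, offered only as a sketch and not as the repository's actual code: keep seed_worker at module top level and bind the configured seed with functools.partial. The names base_seed, make_local_seed_worker and is_picklable are hypothetical and exist only for illustration. The example uses only the standard library to show why a locally defined function cannot be pickled while a partial of a top-level function carries its arguments along.

```python
import functools
import pickle
import random


def seed_worker(worker_id, base_seed):
    """Top-level function: pickled by reference, so spawn-based workers can import it."""
    # Hypothetical seeding scheme; the real seed_worker may seed other libraries too.
    random.seed(base_seed + worker_id)


def make_local_seed_worker(base_seed):
    """Mirrors the failing pattern: the worker function is defined inside another function."""
    def local_seed_worker(worker_id):
        random.seed(base_seed + worker_id)
    return local_seed_worker


def is_picklable(obj):
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


# Bind the value from the config (assumed here to be 1234) without creating a closure.
worker_init_fn = functools.partial(seed_worker, base_seed=1234)

# Calling it the way a data-loading worker would, with only the worker id.
worker_init_fn(0)
```

Under these assumptions, worker_init_fn could be passed as the worker_init_fn argument of the DataLoader, which calls it with the worker id as its single argument. When run as a script, is_picklable(worker_init_fn) can be compared against is_picklable(make_local_seed_worker(1234)) to see the difference.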
Simplest "solution" was to just set the worker_init_fn argument to None as below (within Diffusion.sample_sequence):
My imagenet_256_cc.yml is the same as the one you provide apart from the out_of_distribution argument, which is set to true.