RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message #2166

Open
li-dailin opened this issue May 7, 2024 · 1 comment
@li-dailin

Hi, I'm having some trouble running nnUNet on the Synapse BCV dataset. I run it from an Anaconda PowerShell prompt, and the same error occurs for both 2D and 3D training. I am currently using version 2.2.1; I also tried the latest version, and it reported the very same error. I noticed that several worker processes fail with an AttributeError, which differs a bit from what I saw in previous issues, so I opened a separate one. The output is as follows:

2024-05-08 06:03:36.244731: unpacking dataset...
2024-05-08 06:04:31.883632: unpacking done...
2024-05-08 06:04:31.889076: do_dummy_2d_data_aug: False
2024-05-08 06:04:31.893605: Creating new 5-fold cross-validation split...
2024-05-08 06:04:31.899590: Desired fold for training: 0
2024-05-08 06:04:31.905573: This split has 24 training and 6 validation cases.
D:\conda\lib\site-packages\torch\onnx\symbolic_helper.py:1466: UserWarning: ONNX export mode is set to TrainingMode.EVAL, but operator 'instance_norm' is set to train=True. Exporting with train=True.
  warnings.warn(
2024-05-08 06:04:38.226775: Unable to plot network architecture:
2024-05-08 06:04:38.230765: failed to execute WindowsPath('dot'), make sure the Graphviz executables are on your systems' PATH
2024-05-08 06:04:38.259687:
2024-05-08 06:04:38.264674: Epoch 0
2024-05-08 06:04:38.269661: Current learning rate: 0.01
using pin_memory on device 0
Process Process-5:
Traceback (most recent call last):
  File "D:\conda\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "D:\conda\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\batchgenerators\dataloading\nondet_multi_threaded_augmenter.py", line 41, in producer
    with threadpool_limits(1, None):
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 171, in __init__
    self._original_info = self._set_threadpool_limits()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 268, in _set_threadpool_limits
    modules = _ThreadpoolInfo(prefixes=self._prefixes,
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 340, in __init__
    self._load_modules()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 373, in _load_modules
    self._find_modules_with_enum_process_module_ex()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 485, in _find_modules_with_enum_process_module_ex
    self._make_module_from_path(filepath)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 515, in _make_module_from_path
    module = module_class(filepath, prefix, user_api, internal_api)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 606, in __init__
    self.version = self.get_version()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 646, in get_version
    config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
Process Process-6:
Traceback (most recent call last):
  File "D:\conda\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "D:\conda\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\batchgenerators\dataloading\nondet_multi_threaded_augmenter.py", line 41, in producer
    with threadpool_limits(1, None):
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 171, in __init__
    self._original_info = self._set_threadpool_limits()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 268, in _set_threadpool_limits
    modules = _ThreadpoolInfo(prefixes=self._prefixes,
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 340, in __init__
    self._load_modules()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 373, in _load_modules
    self._find_modules_with_enum_process_module_ex()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 485, in _find_modules_with_enum_process_module_ex
    self._make_module_from_path(filepath)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 515, in _make_module_from_path
    module = module_class(filepath, prefix, user_api, internal_api)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 606, in __init__
    self.version = self.get_version()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 646, in get_version
    config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
Process Process-7:
Traceback (most recent call last):
  File "D:\conda\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "D:\conda\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\batchgenerators\dataloading\nondet_multi_threaded_augmenter.py", line 41, in producer
    with threadpool_limits(1, None):
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 171, in __init__
    self._original_info = self._set_threadpool_limits()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 268, in _set_threadpool_limits
    modules = _ThreadpoolInfo(prefixes=self._prefixes,
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 340, in __init__
    self._load_modules()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 373, in _load_modules
    self._find_modules_with_enum_process_module_ex()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 485, in _find_modules_with_enum_process_module_ex
    self._make_module_from_path(filepath)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 515, in _make_module_from_path
    module = module_class(filepath, prefix, user_api, internal_api)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 606, in __init__
    self.version = self.get_version()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 646, in get_version
    config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
Process Process-8:
Traceback (most recent call last):
  File "D:\conda\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "D:\conda\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\batchgenerators\dataloading\nondet_multi_threaded_augmenter.py", line 41, in producer
    with threadpool_limits(1, None):
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 171, in __init__
    self._original_info = self._set_threadpool_limits()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 268, in _set_threadpool_limits
    modules = _ThreadpoolInfo(prefixes=self._prefixes,
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 340, in __init__
    self._load_modules()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 373, in _load_modules
    self._find_modules_with_enum_process_module_ex()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 485, in _find_modules_with_enum_process_module_ex
    self._make_module_from_path(filepath)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 515, in _make_module_from_path
    module = module_class(filepath, prefix, user_api, internal_api)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 606, in __init__
    self.version = self.get_version()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 646, in get_version
    config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
Process Process-9:
Traceback (most recent call last):
  File "D:\conda\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "D:\conda\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\batchgenerators\dataloading\nondet_multi_threaded_augmenter.py", line 41, in producer
    with threadpool_limits(1, None):
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 171, in __init__
    self._original_info = self._set_threadpool_limits()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 268, in _set_threadpool_limits
    modules = _ThreadpoolInfo(prefixes=self._prefixes,
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 340, in __init__
    self._load_modules()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 373, in _load_modules
    self._find_modules_with_enum_process_module_ex()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 485, in _find_modules_with_enum_process_module_ex
    self._make_module_from_path(filepath)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 515, in _make_module_from_path
    module = module_class(filepath, prefix, user_api, internal_api)
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 606, in __init__
    self.version = self.get_version()
  File "D:\conda\lib\site-packages\threadpoolctl.py", line 646, in get_version
    config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
Exception in thread Thread-4:
Traceback (most recent call last):
  File "D:\conda\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "D:\conda\lib\threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\batchgenerators\dataloading\nondet_multi_threaded_augmenter.py", line 125, in results_loop
    raise e
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\batchgenerators\dataloading\nondet_multi_threaded_augmenter.py", line 103, in results_loop
    raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
Traceback (most recent call last):
  File "D:\conda\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "D:\conda\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\user\AppData\Roaming\Python\Python39\Scripts\nnUNetv2_train.exe\__main__.py", line 7, in <module>
  File "f:\ldl\nnunet\nnunetv2\run\run_training.py", line 268, in run_training_entry
    run_training(args.dataset_name_or_id, args.configuration, args.fold, args.tr, args.p, args.pretrained_weights,
  File "f:\ldl\nnunet\nnunetv2\run\run_training.py", line 204, in run_training
    nnunet_trainer.run_training()
  File "f:\ldl\nnunet\nnunetv2\training\nnUNetTrainer\nnUNetTrainer.py", line 1279, in run_training
    train_outputs.append(self.train_step(next(self.dataloader_train)))
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\batchgenerators\dataloading\nondet_multi_threaded_augmenter.py", line 196, in __next__
    item = self.__get_next_item()
  File "C:\Users\user\AppData\Roaming\Python\Python39\site-packages\batchgenerators\dataloading\nondet_multi_threaded_augmenter.py", line 181, in __get_next_item
    raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message

I would appreciate any help :-)

@Karol-G
Contributor

Karol-G commented May 13, 2024

Hey,

This seems to be the relevant error message:
AttributeError: 'NoneType' object has no attribute 'split'

Your split file seems to be corrupted. Try deleting it and then running the training again. It will generate a new one automatically.

Best regards,
Karol
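
One more thing that may be worth checking: every worker traceback in the log ends inside `threadpoolctl.py`, at `config = get_config().split()`, so the `split` in the AttributeError looks like a string method call on a `None` value rather than a reference to the dataset split file. A minimal diagnostic sketch follows, assuming the installed `threadpoolctl` release is the culprit. The helper names and the `ASSUMED_MIN_VERSION` threshold are hypothetical and not taken from this thread, so please verify the threshold against the threadpoolctl changelog before relying on it.

```python
import re
from importlib import metadata

# Assumption for illustration only: treat releases older than this as suspect.
ASSUMED_MIN_VERSION = (3, 1, 0)


def version_tuple(version):
    """Convert a version string such as '3.1.0' into a 3-tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '3.1' to (3, 1, 0)
        parts.append(0)
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_upgrade(version, minimum=ASSUMED_MIN_VERSION):
    """True when a version is installed and older than the assumed minimum."""
    return version is not None and version_tuple(version) < minimum


if __name__ == "__main__":
    found = installed_version("threadpoolctl")
    print("threadpoolctl version:", found)
    if needs_upgrade(found):
        print("Consider running: pip install --upgrade threadpoolctl")
    elif found is not None:
        # Runs the same library-detection step that the workers trigger.
        from threadpoolctl import threadpool_info
        print(threadpool_info())
```

If the script reports an old release, upgrading `threadpoolctl` in the same environment that runs `nnUNetv2_train` and then restarting training would be a reasonable next step. If `threadpool_info()` itself raises the same AttributeError, that would confirm the failure is independent of nnUNet's data splits.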
