Exception raised while executing Node run_afq #700

Open
Magic-Ludo opened this issue Feb 22, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@Magic-Ludo

Summary

I ran a reconstruction using the mrtrix_multishell_msmt_pyafq_tractometry pipeline. It completed for most subjects, but for a few it fails at the very end of the reconstruction, in the run_afq node, and I don't know why. The log ends like this:

-- NO ERROR BEFORE --
INFO:AFQ:Generating colorful lines from tractography...
INFO:AFQ:Preparing ROI...
INFO:AFQ:Preparing ROI...
INFO:AFQ:Preparing ROI...
INFO:AFQ:Preparing ROI...
INFO:nipype.workflow:[Node] Finished "run_afq", elapsed time 52692.52664s.
WARNING:nipype.workflow:Storing result file without outputs
WARNING:nipype.workflow:[Node] Error on "qsirecon_wf.sub-CTR16_mrtrix_multishell_msmt_pyafq_tractometry.sub_CTR16_ses_01_space_T1w_desc_preproc_recon_wf.pyafq_tractometry.run_afq" (/scratch/lcorcos/Temp_QSIPREP_TRACTO_V4/qsirecon_wf/sub-CTR16_mrtrix_multishell_msmt_pyafq_tractometry/sub_CTR16_ses_01_space_T1w_desc_preproc_recon_wf/pyafq_tractometry/run_afq)
ERROR:nipype.workflow:Node run_afq failed to run on host gpu011.cluster.
ERROR:nipype.workflow:Saving crash info to /scratch/lcorcos/EcriPark_QS_Tracto/qsirecon/sub-CTR16/log/20240221-165211_c6845cae-0072-45c0-a431-ae53d3260c4f/crash-20240222-103208-lcorcos-run_afq-6374e835-3483-4bd8-a408-f78cdd1b2ee7.txt
Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
    result["result"] = node.run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run
    result = self._run_interface(execute=True)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface
    return self._run_command(execute)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command
    raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node run_afq.

Traceback:
	Traceback (most recent call last):
	  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 397, in run
	    runtime = self._run_interface(runtime)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/qsiprep/interfaces/pyafq.py", line 106, in _run_interface
	    myafq.export_all()
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/api/participant.py", line 182, in export_all
	    export_all_helper(self, seg_algo, xforms, indiv, viz)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/api/utils.py", line 142, in export_all_helper
	    api_afq_object.export("indiv_bundles_figures")
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/api/participant.py", line 153, in export
	    return self.wf_dict[attr_name]
	  File "/usr/local/miniconda/lib/python3.8/site-packages/pimms/calculation.py", line 470, in __getitem__
	    self._run_node(self.plan.efferents[k])
	  File "/usr/local/miniconda/lib/python3.8/site-packages/pimms/calculation.py", line 534, in _run_node
	    if not found: res = node(self)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/pimms/calculation.py", line 91, in __call__
	    result = self.function(*args)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/tasks/viz.py", line 276, in viz_indivBundle
	    viz_backend.create_gif(figure, fname)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/viz/plotly_backend.py", line 410, in create_gif
	    figure.write_image(tdir + f"/tgif{i}.png")
	  File "/usr/local/miniconda/lib/python3.8/site-packages/plotly/basedatatypes.py", line 3821, in write_image
	    return pio.write_image(self, *args, **kwargs)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/plotly/io/_kaleido.py", line 268, in write_image
	    img_data = to_image(
	  File "/usr/local/miniconda/lib/python3.8/site-packages/plotly/io/_kaleido.py", line 145, in to_image
	    img_bytes = scope.transform(
	  File "/usr/local/miniconda/lib/python3.8/site-packages/kaleido/scopes/plotly.py", line 161, in transform
	    raise ValueError(
	ValueError: Transform failed with error code 525: Array buffer allocation failed


INFO:nipype.workflow:[MultiProc] Running 0 tasks, and 0 jobs ready. Free memory (GB): 283.34/283.34, Free processors: 28/28.
INFO:nipype.workflow:***********************************
ERROR:nipype.workflow:could not run node: qsirecon_wf.sub-CTR16_mrtrix_multishell_msmt_pyafq_tractometry.sub_CTR16_ses_01_space_T1w_desc_preproc_recon_wf.pyafq_tractometry.run_afq
INFO:nipype.workflow:crashfile: /scratch/lcorcos/EcriPark_QS_Tracto/qsirecon/sub-CTR16/log/20240221-165211_c6845cae-0072-45c0-a431-ae53d3260c4f/crash-20240222-103208-lcorcos-run_afq-6374e835-3483-4bd8-a408-f78cdd1b2ee7.txt
INFO:nipype.workflow:***********************************
/usr/local/miniconda/lib/python3.8/site-packages/joblib/externals/loky/backend/resource_tracker.py:310: UserWarning: resource_tracker: There appear to be 22 leaked folder objects to clean up at shutdown
  warnings.warn(
CRITICAL:cli:QSIPrep failed: Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
    result["result"] = node.run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run
    result = self._run_interface(execute=True)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface
    return self._run_command(execute)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command
    raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node run_afq.

Traceback:
	Traceback (most recent call last):
	  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 397, in run
	    runtime = self._run_interface(runtime)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/qsiprep/interfaces/pyafq.py", line 106, in _run_interface
	    myafq.export_all()
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/api/participant.py", line 182, in export_all
	    export_all_helper(self, seg_algo, xforms, indiv, viz)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/api/utils.py", line 142, in export_all_helper
	    api_afq_object.export("indiv_bundles_figures")
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/api/participant.py", line 153, in export
	    return self.wf_dict[attr_name]
	  File "/usr/local/miniconda/lib/python3.8/site-packages/pimms/calculation.py", line 470, in __getitem__
	    self._run_node(self.plan.efferents[k])
	  File "/usr/local/miniconda/lib/python3.8/site-packages/pimms/calculation.py", line 534, in _run_node
	    if not found: res = node(self)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/pimms/calculation.py", line 91, in __call__
	    result = self.function(*args)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/tasks/viz.py", line 276, in viz_indivBundle
	    viz_backend.create_gif(figure, fname)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/viz/plotly_backend.py", line 410, in create_gif
	    figure.write_image(tdir + f"/tgif{i}.png")
	  File "/usr/local/miniconda/lib/python3.8/site-packages/plotly/basedatatypes.py", line 3821, in write_image
	    return pio.write_image(self, *args, **kwargs)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/plotly/io/_kaleido.py", line 268, in write_image
	    img_data = to_image(
	  File "/usr/local/miniconda/lib/python3.8/site-packages/plotly/io/_kaleido.py", line 145, in to_image
	    img_bytes = scope.transform(
	  File "/usr/local/miniconda/lib/python3.8/site-packages/kaleido/scopes/plotly.py", line 161, in transform
	    raise ValueError(
	ValueError: Transform failed with error code 525: Array buffer allocation failed


Traceback (most recent call last):
  File "/usr/local/miniconda/bin/qsiprep", line 8, in <module>
    sys.exit(main())
  File "/usr/local/miniconda/lib/python3.8/site-packages/qsiprep/cli/run.py", line 677, in main
    qsiprep_wf.run(**plugin_settings)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/workflows.py", line 638, in run
    runner.run(execgraph, updatehash=updatehash, config=self.config)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/base.py", line 224, in run
    raise error from cause
RuntimeError: Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
    result["result"] = node.run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run
    result = self._run_interface(execute=True)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface
    return self._run_command(execute)
  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command
    raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node run_afq.

Traceback:
	Traceback (most recent call last):
	  File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 397, in run
	    runtime = self._run_interface(runtime)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/qsiprep/interfaces/pyafq.py", line 106, in _run_interface
	    myafq.export_all()
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/api/participant.py", line 182, in export_all
	    export_all_helper(self, seg_algo, xforms, indiv, viz)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/api/utils.py", line 142, in export_all_helper
	    api_afq_object.export("indiv_bundles_figures")
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/api/participant.py", line 153, in export
	    return self.wf_dict[attr_name]
	  File "/usr/local/miniconda/lib/python3.8/site-packages/pimms/calculation.py", line 470, in __getitem__
	    self._run_node(self.plan.efferents[k])
	  File "/usr/local/miniconda/lib/python3.8/site-packages/pimms/calculation.py", line 534, in _run_node
	    if not found: res = node(self)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/pimms/calculation.py", line 91, in __call__
	    result = self.function(*args)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/tasks/viz.py", line 276, in viz_indivBundle
	    viz_backend.create_gif(figure, fname)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/AFQ/viz/plotly_backend.py", line 410, in create_gif
	    figure.write_image(tdir + f"/tgif{i}.png")
	  File "/usr/local/miniconda/lib/python3.8/site-packages/plotly/basedatatypes.py", line 3821, in write_image
	    return pio.write_image(self, *args, **kwargs)
	  File "/usr/local/miniconda/lib/python3.8/site-packages/plotly/io/_kaleido.py", line 268, in write_image
	    img_data = to_image(
	  File "/usr/local/miniconda/lib/python3.8/site-packages/plotly/io/_kaleido.py", line 145, in to_image
	    img_bytes = scope.transform(
	  File "/usr/local/miniconda/lib/python3.8/site-packages/kaleido/scopes/plotly.py", line 161, in transform
	    raise ValueError(
	ValueError: Transform failed with error code 525: Array buffer allocation failed
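
The log above prints the same nested traceback three times, so the actual failure is easy to lose. As a minimal sketch (the function name `root_causes` and the regular expression are illustrative, not part of QSIPrep or nipype), the distinct exception lines can be pulled out of a saved copy of the log like this:

```python
import re

# Matches lines such as "ValueError: Transform failed ..." or
# "nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised ...".
# Logger prefixes such as "ERROR:nipype.workflow:" do not match.
_EXC_LINE = re.compile(r"^\s*([A-Za-z_][\w.]*(?:Error|Exception)): (.+)$")


def root_causes(log_text):
    """Return distinct (exception type, message) pairs in order of first appearance."""
    seen = []
    for line in log_text.splitlines():
        match = _EXC_LINE.match(line)
        if not match:
            continue
        pair = (match.group(1), match.group(2))
        # Skip wrappers whose "message" is just another embedded traceback.
        if pair[1].startswith("Traceback"):
            continue
        if pair not in seen:
            seen.append(pair)
    return seen
```

Applied to the output above, this should leave two entries: the nipype NodeExecutionError wrapper and the ValueError raised by kaleido, which is the one that matters here.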

Additional details

  • QSIPrep version: 0.18.1
  • Singularity version: apptainer version 1.2.2-1.el7

Using a compute node with this configuration:
Dell PowerEdge C4130 (28 cores) Intel Xeon CPU E5-2680 v4, 320 GB RAM

Reproducing the bug

I tried to run the following script:

#!/bin/bash

#SBATCH -J QS_Tracto_Mis
#SBATCH -p pascal
#SBATCH -A b356
#SBATCH -N 1
#SBATCH -t 70:00:00
#SBATCH --cpus-per-task=28
#SBATCH --mem=300G
#SBATCH --array=1-5%2
#SBATCH --output=/home/lcorcos/logs/QSIPREP_tracto/%j-stdout.txt
#SBATCH --error=/home/lcorcos/logs/QSIPREP_tracto/%j-stderr.txt
#SBATCH --mail-type=BEGIN,END,FAIL,TIME_LIMIT
#SBATCH --mail-user=ludovic.corcos@gmail.com

set -e

date

EcriPark="/scratch/lcorcos/EcriPark_QSIPREP/qsiprep/"
cd /home/lcorcos
source .bashrc

SUB=$(sed -n "${SLURM_ARRAY_TASK_ID}p" /home/lcorcos/EcriPark_Code/sub_missing.txt)

singularity run --cleanenv \
    -B ${HOME}/EcriPark_Code:/code,/scratch/lcorcos/EcriPark/,/scratch/lcorcos/EcriPark_FreeSurfer/,/scratch/lcorcos/EcriPark_QSIPREP/,/home/lcorcos/freesurfer/license.txt,/scratch/lcorcos/EcriPark_QS_Tracto/,/scratch/lcorcos/Temp_QSIPREP_TRACTO_V4/ \
    ${HOME}/qsiprep-0.18.1.sif \
    /scratch/lcorcos/EcriPark/ \
    /scratch/lcorcos/EcriPark_QS_Tracto/ participant \
    --participant_label ${SUB} \
    --recon_input /scratch/lcorcos/EcriPark_QSIPREP/qsiprep/ \
    --skip_bids_validation \
    --nthreads 28 \
    --omp-nthreads 28 \
    --work_dir /scratch/lcorcos/Temp_QSIPREP_TRACTO_V4/ \
    --recon_spec /code/mrtrix_multishell_msmt_pyafq_tractometryV2.json \
    --freesurfer-input /scratch/lcorcos/EcriPark_FreeSurfer/ \
    --recon-only \
    --skip_odf_reports \
    --fs_license_file /home/lcorcos/freesurfer/license.txt \
    --verbose

date

The "Array buffer allocation failed" message looks like a memory shortage to me. I first ran with 128 GB of RAM and am now at 300 GB, which is the most I can request, and the error is still there. The other subjects ran fine with 128 GB.
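
One way to test the memory hypothesis is to measure how much memory the job really uses. The traceback shows the error coming out of kaleido's `scope.transform`, so it may reflect a limit inside the image renderer rather than the RAM available on the node; that is an assumption, not something confirmed in this thread. The sketch below (the helper name `peak_child_rss_gb` is hypothetical) runs a command and reports the peak resident memory of the largest child process, using the standard `resource` module, which is only available on Unix:

```python
import resource
import subprocess
import sys


def peak_child_rss_gb(cmd):
    """Run cmd to completion and return (exit code, peak child RSS in GB).

    RUSAGE_CHILDREN covers children that have terminated and been waited for;
    ru_maxrss is the resident set size of the largest such child, not the sum
    over the whole process tree. It is cumulative for the calling process, so
    call this from a fresh interpreter for each command being measured.
    """
    completed = subprocess.run(cmd, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 ** 3 if sys.platform == "darwin" else 1024 ** 2
    return completed.returncode, usage.ru_maxrss / divisor
```

Calling `peak_child_rss_gb([...])` with the `singularity run` command from the script above, passed as a list of arguments, would show whether any single process gets anywhere near the 300 GB requested. If the peak stays far below that, adding RAM is unlikely to help.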

Magic-Ludo added the bug label Feb 22, 2024
@arokem
Contributor

arokem commented Feb 24, 2024

Might be related to XVFB configuration, see this comment: plotly/orca#223 (comment)
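
The content of the linked comment is not reproduced here. Assuming it concerns the virtual display (Xvfb) used for headless rendering, a first step could be to check which display settings the Python process inside the container actually sees. Note that the `singularity run --cleanenv` call in the script above removes the host environment, so a `DISPLAY` set on the host would not reach the container. A minimal sketch (the helper name `display_report` is hypothetical):

```python
import os
import shutil


def display_report(env=None):
    """Summarise display-related settings visible to this process."""
    env = os.environ if env is None else env
    return {
        "DISPLAY": env.get("DISPLAY"),         # None: no X display configured
        "xvfb-run": shutil.which("xvfb-run"),  # None: wrapper not on PATH
        "Xvfb": shutil.which("Xvfb"),          # None: server binary not on PATH
    }


if __name__ == "__main__":
    for key, value in display_report().items():
        print(f"{key}: {value}")
```

Running this inside the container (for example through `singularity exec` with the same image) and comparing a failing subject with a working one would show whether the display setup differs between runs.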
