Crash with "femur --groom_images" on MacOS #1179

Closed
akenmorris opened this issue Mar 25, 2021 · 23 comments · Fixed by #1180

Comments

@akenmorris
Contributor

$ python RunUseCase.py --use_case femur --groom_images

...
########### Centering ###############
Input filename: Output/femur/groomed/com_aligned/images/m03_L_1x_hip.isores.pad.com.nrrd
Output filename: Output/femur/groomed/centered/images/m03_L_1x_hip.isores.pad.com.center.nrrd
Input filename: Output/femur/groomed/com_aligned/images/m04_L_1x_hip.isores.pad.com.nrrd
.....
Input filename: Output/femur/groomed/com_aligned/images/n19_L_1x_hip.isores.pad.com.nrrd
Output filename: Output/femur/groomed/centered/images/n19_L_1x_hip.isores.pad.com.center.nrrd
Input filename: Output/femur/groomed/com_aligned/images/n19_R_1x_hip.reflect.isores.pad.com.nrrd
Output filename: Output/femur/groomed/centered/images/n19_R_1x_hip.reflect.isores.pad.com.center.nrrd
zsh: segmentation fault python RunUseCase.py --use_case femur --groom_images

This was on MacOS with RC10.

@archanasri
Contributor

Does it crash with tiny_test?

@akenmorris
Contributor Author

The tiny_test does not crash.

@jadie1
Contributor

jadie1 commented Mar 25, 2021

Might have something to do with the reflected femurs then because the tiny test only has left femurs.

@akenmorris
Contributor Author

Update: I double-checked and it seems to work on Linux.

@archanasri
Contributor

I tried running tiny_test with 2 left femurs and one right femur. No issues there.

@akenmorris
Contributor Author

akenmorris commented Mar 25, 2021

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x000000016b8612ad libvnl_algo.dylib`vnl_qr<double>::vnl_qr(this=0x00007ffeefbfd568, M=0x00007ffeefbfd6f0) at vnl_qr.hxx:51:24 [opt]
    frame #1: 0x000000016b85cf1d libvnl_algo.dylib`double vnl_determinant<double>(M=<unavailable>, balance=<unavailable>) at vnl_determinant.hxx:107:14 [opt]
    frame #2: 0x00000003dec807ac _ITKIOImageBasePython.so`___lldb_unnamed_symbol9961$$_ITKIOImageBasePython.so + 252
    frame #3: 0x00000003deba7ff4 _ITKIOImageBasePython.so`___lldb_unnamed_symbol7056$$_ITKIOImageBasePython.so + 132
    frame #4: 0x00000003dec80586 _ITKIOImageBasePython.so`___lldb_unnamed_symbol9956$$_ITKIOImageBasePython.so + 38
    frame #5: 0x00000003dea1d188 _ITKIOImageBasePython.so`___lldb_unnamed_symbol1480$$_ITKIOImageBasePython.so + 1560
    frame #6: 0x00000003ded50753 _ITKIOImageBasePython.so`itk::ProcessObject::UpdateOutputInformation() + 351
    frame #7: 0x00000003dec7fec2 _ITKIOImageBasePython.so`___lldb_unnamed_symbol9945$$_ITKIOImageBasePython.so + 70
    frame #8: 0x00000003ded5b6f4 _ITKIOImageBasePython.so`itk::DataObject::Update() + 18
    frame #9: 0x000000041a65c177 _ITKCommonPython.so`___lldb_unnamed_symbol8759$$_ITKCommonPython.so + 58
    frame #10: 0x000000010002c843 python`_PyMethodDef_RawFastCallKeywords + 131
    frame #11: 0x000000010002c1d6 python`_PyObject_FastCallKeywords + 598
    frame #12: 0x0000000100164bb7 python`call_function + 455
    frame #13: 0x000000010015c604 python`_PyEval_EvalFrameDefault + 20180
    frame #14: 0x0000000100155f04 python`_PyEval_EvalCodeWithName + 532
    frame #15: 0x000000010002c5e3 python`_PyFunction_FastCallKeywords + 403
    frame #16: 0x0000000100164aa7 python`call_function + 183
    frame #17: 0x000000010015c604 python`_PyEval_EvalFrameDefault + 20180
    frame #18: 0x000000010002c535 python`_PyFunction_FastCallKeywords + 229
    frame #19: 0x0000000100164aa7 python`call_function + 183
    frame #20: 0x000000010015cdb0 python`_PyEval_EvalFrameDefault + 22144
    frame #21: 0x0000000100155f04 python`_PyEval_EvalCodeWithName + 532
    frame #22: 0x000000010002c5e3 python`_PyFunction_FastCallKeywords + 403
    frame #23: 0x0000000100164aa7 python`call_function + 183
    frame #24: 0x000000010015c604 python`_PyEval_EvalFrameDefault + 20180
    frame #25: 0x0000000100155f04 python`_PyEval_EvalCodeWithName + 532
    frame #26: 0x00000001001c1afb python`PyRun_FileExFlags + 235
    frame #27: 0x00000001001c14c6 python`PyRun_SimpleFileExFlags + 502
    frame #28: 0x00000001001ede30 python`pymain_run_file + 160
    frame #29: 0x00000001001ed72b python`pymain_run_filename + 123
    frame #30: 0x00000001001ecf11 python`pymain_run_python + 145
    frame #31: 0x00000001001ecb8b python`pymain_main + 27
    frame #32: 0x00000001000018c9 python`main + 89
    frame #33: 0x00007fff6aedfcc9 libdyld.dylib`start + 1
    frame #34: 0x00007fff6aedfcc9 libdyld.dylib`start + 1

Could this be related to #1168 ?

@akenmorris
Contributor Author

Why is ITKIOImageBasePython.so showing up in this stack trace? I thought we were using the shapeworks python bindings (e.g. the Image class)?

@akenmorris
Contributor Author

Also, some warnings lldb spit out just before it crashed:

2021-03-25 11:57:01.360622-0600 python[22532:5953417] dynamic_cast error 1: Both of the following type_info's should have public visibility. At least one of them is hidden. N3itk10DataObjectE, N3itk5ImageIfLj3EEE.
2021-03-25 11:57:01.360667-0600 python[22532:5953417] dynamic_cast error 2: One or more of the following type_info's has hidden visibility or is defined in more than one translation unit. They should all have public visibility. N3itk10DataObjectE, N3itk5ImageIfLj3EEE, N3itk9ImageBaseILj3EEE.
2021-03-25 11:57:01.905525-0600 python[22532:5953417] dynamic_cast error 1: Both of the following type_info's should have public visibility. At least one of them is hidden. N3itk10DataObjectE, N3itk5ImageIfLj3EEE.
2021-03-25 11:57:01.905556-0600 python[22532:5953417] dynamic_cast error 2: One or more of the following type_info's has hidden visibility or is defined in more than one translation unit. They should all have public visibility. N3itk10DataObjectE, N3itk5ImageIfLj3EEE, N3itk9ImageBaseILj3EEE.
2021-03-25 11:57:02.113545-0600 python[22532:5953417] dynamic_cast error 1: Both of the following type_info's should have public visibility. At least one of them is hidden. N3itk10DataObjectE, N3itk5ImageIfLj3EEE.
2021-03-25 11:57:02.113577-0600 python[22532:5953417] dynamic_cast error 2: One or more of the following type_info's has hidden visibility or is defined in more than one translation unit. They should all have public visibility. N3itk10DataObjectE, N3itk5ImageIfLj3EEE, N3itk9ImageBaseILj3EEE.
2021-03-25 11:57:02.646500-0600 python[22532:5953417] dynamic_cast error 1: Both of the following type_info's should have public visibility. At least one of them is hidden. N3itk10DataObjectE, N3itk5ImageIfLj3EEE.
2021-03-25 11:57:02.646531-0600 python[22532:5953417] dynamic_cast error 2: One or more of the following type_info's has hidden visibility or is defined in more than one translation unit. They should all have public visibility. N3itk10DataObjectE, N3itk5ImageIfLj3EEE, N3itk9ImageBaseILj3EEE.
2021-03-25 11:57:03.535838-0600 python[22532:5953417] dynamic_cast error 1: Both of the following type_info's should have public visibility. At least one of them is hidden. N3itk10DataObjectE, N3itk5ImageIfLj3EEE.
2021-03-25 11:57:03.535870-0600 python[22532:5953417] dynamic_cast error 2: One or more of the following type_info's has hidden visibility or is defined in more than one translation unit. They should all have public visibility. N3itk10DataObjectE, N3itk5ImageIfLj3EEE, N3itk9ImageBaseILj3EEE.
2021-03-25 11:57:05.224328-0600 python[22532:5953417] dynamic_cast error 1: Both of the following type_info's should have public visibility. At least one of them is hidden. N3itk10DataObjectE, N3itk5ImageIfLj3EEE.
2021-03-25 11:57:05.224362-0600 python[22532:5953417] dynamic_cast error 2: One or more of the following type_info's has hidden visibility or is defined in more than one translation unit. They should all have public visibility. N3itk10DataObjectE, N3itk5ImageIfLj3EEE, N3itk9ImageBaseILj3EEE.

@jadie1
Contributor

jadie1 commented Mar 25, 2021

Is this happening in the centering step in the left atrium use case as well?

@cchriste
Contributor

cchriste commented Mar 25, 2021 via email

@akenmorris
Contributor Author

Update. Ok, I was confused by the Python output, I assumed it was still on the center step since there was no further output. It's crashing in FindReferenceImage:

    dim = itk.GetArrayFromImage(itk.imread(inDataList[i])).shape

My guess is that the problem is that we have two different ITK libraries loaded in memory. 👎
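One way to test this guess is to list the shared libraries mapped into the Python process and look for the same file name loaded from two locations. The following is a minimal sketch, not part of ShapeWorks: the function names are hypothetical, and the loader shown only works on Linux (it reads /proc/self/maps). On MacOS the same list can be obtained with `image list` inside lldb and passed to `duplicated_libraries` as plain paths.

```python
import os
import sys
from collections import defaultdict


def loaded_library_paths():
    """Return the paths of shared objects mapped into this process.

    Linux-only sketch: parses /proc/self/maps. On other platforms it
    returns an empty set.
    """
    paths = set()
    if not sys.platform.startswith("linux"):
        return paths
    with open("/proc/self/maps") as maps:
        for line in maps:
            # Columns: address, perms, offset, dev, inode, pathname
            fields = line.split(None, 5)
            if len(fields) == 6:
                path = fields[5].strip()
                if ".so" in os.path.basename(path):
                    paths.add(path)
    return paths


def duplicated_libraries(paths):
    """Group paths by file name; keep names found in more than one location."""
    by_name = defaultdict(set)
    for path in paths:
        by_name[os.path.basename(path)].add(path)
    return {name: sorted(found) for name, found in by_name.items() if len(found) > 1}
```

Matching on file name alone is a weak test: two ITK builds with differently named libraries would not be flagged, so an empty result does not rule the problem out.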

@akenmorris changed the title from "Crash on centering (femur --groom_images)" to "Crash with "femur --groom_images" on MacOS" on Mar 25, 2021
@akenmorris
Contributor Author

Long ago and far away (the branch might no longer exist) I tried to build itk’s Python bindings along with itk in build_dependencies, but I was not successful. I suspect it’s worth another go, especially since conda/pip is notably installing different versions of some, but not all, itk python components (5.0 and 5.1), a blatant recipe for trouble.

I agree that might fix this issue, but what about #1168. Do we then need to build itkwidgets from scratch and provide that as well so that it's built with our ITK?

@cchriste
Contributor

cchriste commented Mar 25, 2021

I agree that might fix this issue, but what about #1168. Do we then need to build itkwidgets from scratch and provide that as well so that it's built with our ITK?

If this is a shared library issue, maybe we can set LD_LIBRARY_PATH before running the notebook.
update: I tried this and the result was the same
update2: I only tried the #1168 use case, not this one.

@cchriste
Contributor

cchriste commented Mar 26, 2021

It's crashing in FindReferenceImage:

    dim = itk.GetArrayFromImage(itk.imread(inDataList[i])).shape

ShapeWorks is not involved in this call. I wonder if more information could be gleaned by running this through pdb:
python -m pdb RunUseCase.py --use_case femur --groom_images (then press 'r' to run)

(I'm running this now. How long does it take to crash? Has anyone been able to reduce this?)
update: it takes a long time (~an hour), and, less helpfully, it only reports a seg fault and doesn't even leave you in the debugger. I don't see a call to an external script, so I'm confused by that.
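pdb works at the Python level, so a segmentation fault in native code ends the process before the debugger can regain control. The standard library's faulthandler module can at least print the Python traceback at the moment of the crash. A minimal sketch follows; the helper name `enable_crash_traceback` is hypothetical, and only `faulthandler.enable` is the real API.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Dump the Python traceback of all threads if the process receives
    a fatal signal such as SIGSEGV.

    `stream` must be a file with a real file descriptor and must stay
    open for as long as the handler is enabled; it defaults to stderr.
    """
    target = stream if stream is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
```

Calling `enable_crash_traceback()` at the top of the script, or starting the run with `python -X faulthandler RunUseCase.py --use_case femur --groom_images`, should show which Python line was executing when the seg fault happened, without stepping through an hour-long run in a debugger.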

@akenmorris
Contributor Author

The problem seems to be that the python ITK library is resolving and calling our VXL library rather than the one it was built with:

  * frame #0: 0x000000016b8612ad libvnl_algo.dylib`vnl_qr<double>::vnl_qr(this=0x00007ffeefbfd568, M=0x00007ffeefbfd6f0) at vnl_qr.hxx:51:24 [opt]
    frame #1: 0x000000016b85cf1d libvnl_algo.dylib`double vnl_determinant<double>(M=<unavailable>, balance=<unavailable>) at vnl_determinant.hxx:107:14 [opt]
    frame #2: 0x00000003dec807ac _ITKIOImageBasePython.so`___lldb_unnamed_symbol9961$$_ITKIOImageBasePython.so + 252
    frame #3: 0x00000003deba7ff4 _ITKIOImageBasePython.so`___lldb_unnamed_symbol7056$$_ITKIOImageBasePython.so + 132

Recall that we build our ITK with an alternate VXL/VNL, not the ITK one. libvnl_algo.dylib is ours, _ITKIOImageBasePython.so is from conda.

We're mixing different versions, so it's not surprising that it's crashing.
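The cross-library call is easier to spot when the backtrace is reduced to the module each frame belongs to. Here is a small sketch that does this for lldb backtrace text. The regular expression assumes the frame layout shown in this thread (`frame #N: 0xADDRESS module` followed by a backtick and the symbol), and the function names are hypothetical.

```python
import re

# Matches e.g. "frame #2: 0x00000003dec807ac _ITKIOImageBasePython.so`..."
FRAME_RE = re.compile(r"frame #(\d+): 0x[0-9a-fA-F]+ ([^`\s]+)`")


def modules_by_frame(backtrace):
    """Return (frame_number, module_name) pairs from lldb backtrace text."""
    return [(int(number), module) for number, module in FRAME_RE.findall(backtrace)]


def module_transitions(backtrace):
    """Return module names from the innermost frame outwards,
    collapsing consecutive frames that belong to the same module."""
    modules = []
    for _, module in modules_by_frame(backtrace):
        if not modules or modules[-1] != module:
            modules.append(module)
    return modules
```

Applied to the four frames quoted above, `module_transitions` returns libvnl_algo.dylib followed by _ITKIOImageBasePython.so, which is the boundary where the conda-installed ITK wrapper calls into the ShapeWorks-built VNL.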

@cchriste
Contributor

cchriste commented Mar 26, 2021

@archanasri and I just tried reversing the order of includes (again, testing #1168), and it fails.
I think the solution to the #1168 issue is for us to simply build itkwidgets (we're trying that now).
For this issue, can we point itk to our vnl like we point it to our eigen?

@akenmorris
Contributor Author

We already have ITK using our VXL.

-DITK_USE_SYSTEM_VXL=on -DVXL_DIR=${INSTALL_DIR}

@archanasri
Contributor

Update. Ok, I was confused by the Python output, I assumed it was still on the center step since there was no further output. It's crashing in FindReferenceImage:

    dim = itk.GetArrayFromImage(itk.imread(inDataList[i])).shape

My guess is that the problem is that we have two different ITK libraries loaded in memory. 👎

Instead of using python itk, we can use shapeworks.Image.toArray(). So I replaced

dim = itk.GetArrayFromImage(itk.imread(inDataList[i])).shape

with

img = Image(inDataList[i])
tmp = img.toArray()
dim = tmp.shape

It does not crash now.

@akenmorris
Contributor Author

Nice @archanasri. Yeah, I figured we could get around it that way. We basically need to make sure we don't use python itk + our itk at the same time. I think the itkwidgets problem is the bigger problem though since I assume it uses the ITK python interface internally.
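The rule "don't use python itk + our itk at the same time" could be checked at runtime by looking at which modules have been imported. This is only a sketch under two assumptions: that importing `shapeworks` is what loads the ShapeWorks-built ITK/VXL, and that the pip/conda bindings are always imported under the module name `itk`. The helper name is hypothetical.

```python
import sys
import warnings


def itk_conflict(loaded=None):
    """Return True if both `itk` and `shapeworks` have been imported.

    `loaded` is any container of module names; it defaults to sys.modules.
    """
    names = sys.modules if loaded is None else loaded
    return "itk" in names and "shapeworks" in names


def warn_on_itk_conflict(loaded=None):
    """Emit a warning when both ITK builds may be present in the process."""
    conflict = itk_conflict(loaded)
    if conflict:
        warnings.warn(
            "Both 'itk' and 'shapeworks' are imported; their native "
            "libraries may conflict.",
            RuntimeWarning,
        )
    return conflict
```

Because the check looks at `sys.modules`, it would also fire when a third-party package such as itkwidgets imports `itk` internally.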

@archanasri
Contributor

Yeah and we are not able to build itkwidgets.
We need to find a way for itkwidgets to use our itk.

@cchriste
Contributor

cchriste commented Mar 26, 2021

Yeah and we are not able to build itkwidgets.
We need to find a way for itkwidgets to use our itk.

In the itkwidgets 0.32.0 (latest tagged version) setup.py it specifies these requirements:

    'install_requires': [
        'colorcet>=2.0.0',
        'itk-core>=5.1.0.post2',
        'itk-filtering>=5.1.0.post2',
        'itk-meshtopolydata>=0.6.2',
        'ipydatawidgets>=4.0.1',
        'ipywidgets>=7.5.1',
        'ipympl>=0.4.1',
        'matplotlib',
        'numpy',
        'six',
        'zstandard',
    ],

For itk, I think this puts us back at building the Python bindings ourselves.
Another way to approach this would be to ensure these are the versions installed by pip (mostly) and conda.
The versions of all of these on my system are all newer.

Here is what I have:

(shapeworks) cam@ananda:~/code/ShapeWorks/ShapeWorks/Examples/Python$ conda list | grep itk
itk                       5.0.1                    pypi_0    pypi
itk-core                  5.1.2                    pypi_0    pypi
itk-filtering             5.1.2                    pypi_0    pypi
itk-io                    5.0.1                    pypi_0    pypi
itk-meshtopolydata        0.6.3                    pypi_0    pypi
itk-numerics              5.1.2                    pypi_0    pypi
itk-registration          5.0.1                    pypi_0    pypi
itk-segmentation          5.0.1                    pypi_0    pypi
itkwidgets                0.32.0                   pypi_0    pypi
(shapeworks) cam@ananda:~/code/ShapeWorks/ShapeWorks/Examples/Python$ conda list | grep ipy
brotlipy                  0.7.0           py37hf967b71_1001    conda-forge
ipycanvas                 0.8.2                    pypi_0    pypi
ipydatawidgets            4.2.0                    pypi_0    pypi
ipyevents                 0.8.2                    pypi_0    pypi
ipykernel                 5.5.0            py37he01cfaa_1    conda-forge
ipympl                    0.7.0                    pypi_0    pypi
ipython                   7.21.0           py37he01cfaa_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipyvtk-simple             0.1.4                    pypi_0    pypi
ipywidgets                7.6.3                    pypi_0    pypi

One red flag is the 5.0.1 versions of itk-io, -registration, -segmentation, and most of all, itk itself.
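A version mismatch like this can be detected automatically by grouping the installed ITK components by release series. The sketch below assumes the component list taken from the `conda list` output above, and the function names are hypothetical. `importlib.metadata` requires Python 3.8 or later; the environment shown here is Python 3.7, where the `importlib_metadata` backport offers the same interface.

```python
from collections import defaultdict

# Assumption: the ITK distributions that must share one release series,
# taken from the `conda list | grep itk` output in this thread.
ITK_COMPONENTS = (
    "itk", "itk-core", "itk-filtering", "itk-io",
    "itk-numerics", "itk-registration", "itk-segmentation",
)


def release_series(version):
    """Reduce a version string such as '5.1.2' to its series, '5.1'."""
    return ".".join(version.split(".")[:2])


def group_by_series(versions, components=ITK_COMPONENTS):
    """Map each release series to the sorted components installed at it."""
    groups = defaultdict(list)
    for name in components:
        if name in versions:
            groups[release_series(versions[name])].append(name)
    return {series: sorted(names) for series, names in groups.items()}


def installed_versions(components=ITK_COMPONENTS):
    """Look up installed versions; components that are missing are skipped."""
    from importlib import metadata  # Python 3.8+
    found = {}
    for name in components:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found
```

`group_by_series(installed_versions())` returning more than one key means the environment mixes release series. For the versions listed above it yields two groups, 5.0 (itk, itk-io, itk-registration, itk-segmentation) and 5.1 (itk-core, itk-filtering, itk-numerics).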

@cchriste
Contributor

I updated the pip installs to latest (there was some reason we couldn't do this, but putting that aside for now) and rebuilt all dependencies. #1168 still crashes in all the same ways described therein.

@akenmorris
Contributor Author

Fixed.
