SegFault when using jnius with numpy #490

Open
AbdealiLoKo opened this issue Mar 3, 2020 · 6 comments

Comments

@AbdealiLoKo
Contributor

AbdealiLoKo commented Mar 3, 2020

This ticket is related to numpy/numpy#15691. I am not sure whether the problem lies in jnius or in numpy.
Using jnius together with numpy causes a segmentation fault on CentOS 7 when Python is installed with Anaconda.

Reproducing code example:

I can reproduce this exactly with:

$ docker run --rm -it centos:7 /bin/bash
# yum install -y wget bzip2 which java-1.8.0-openjdk-devel
# wget https://repo.anaconda.com/miniconda/Miniconda3-4.7.12-Linux-x86_64.sh
# bash ./Miniconda3-4.7.12-Linux-x86_64.sh -b
# /root/miniconda3/bin/pip install pyjnius==1.2.1 numpy==1.17.4
# /root/miniconda3/bin/python
>>> import jnius
>>> import numpy as np
>>> tmp = np.linalg.inv(np.random.rand(24, 24))
Segmentation fault
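
To compare configurations (wheel vs. source build, with and without thread limits), it can help to run the reproducer in a child process so the crash does not take down the calling session. The following is a minimal sketch and not part of the original report: the helper name `crashes_with_segfault` and the `REPRO` string are hypothetical, and it assumes a POSIX system, where a child killed by a signal reports a negative return code.

```python
import os
import signal
import subprocess
import sys

# The reproducer from this report, as source for a child interpreter.
REPRO = (
    "import jnius\n"
    "import numpy as np\n"
    "np.linalg.inv(np.random.rand(24, 24))\n"
)


def crashes_with_segfault(code, extra_env=None):
    """Run `code` in a fresh interpreter; return True if it died from SIGSEGV."""
    env = dict(os.environ)
    env.update(extra_env or {})  # optional per-run environment overrides
    proc = subprocess.run([sys.executable, "-c", code], env=env)
    # On POSIX, a return code of -N means the child was killed by signal N.
    return proc.returncode == -signal.SIGSEGV
```

Calling `crashes_with_segfault(REPRO)` in an environment with pyjnius and numpy installed would then report whether that particular combination crashes; passing a dictionary as `extra_env` allows the same check under different environment settings.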

Error message:

GDB Traceback for the SegFault:

(gdb) r
Starting program: /root/miniconda3/bin/python run.py
warning: Error disabling address space randomization: Operation not permitted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Detaching after fork from child process 533.
Detaching after fork from child process 535.
Detaching after fork from child process 536.

Program received signal SIGSEGV, Segmentation fault.
0x00007f3026ee82b4 in ?? ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.3.x86_64 java-1.8.0-openjdk-headless-1.8.0.242.b08-0.el7_7.x86_64
(gdb) bt
#0  0x00007f3026ee82b4 in ?? ()
#1  0x0000000000000246 in ?? ()
#2  0x00007f3026ee8160 in ?? ()
#3  0x00007f303829a6f8 in Abstract_VM_Version::_reserve_for_allocation_prefetch ()
   from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.242.b08-0.el7_7.x86_64/jre/lib/amd64/server/libjvm.so
#4  0x00007fff1187df20 in ?? ()
#5  0x00007f3037d804dd in VM_Version::get_processor_features() ()
   from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.242.b08-0.el7_7.x86_64/jre/lib/amd64/server/libjvm.so
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)

Numpy/Python version information:

  • Numpy - 1.17.4
  • Jnius - 1.2.1
  • Python - 3.7.4 / Anaconda 4.7.12
  • GCC - 7.3.0
  • Linux x86-64 - CentOS 7

Note: It is reproducible on

  • Anaconda 4.5.4 / Python 3.6.5 / GCC 7.2.0
  • CentOS7's yum python3 package (3.6.8) / GCC 4.8.5

It is not reproducible on macOS.

@seberg

seberg commented Mar 3, 2020

Just to note that in numpy/numpy#15691 we tracked this down to being probably related to threading: setting OMP_NUM_THREADS=1, which disables threading in OpenBLAS (or MKL, if that is used here), makes things pass. It would be nice to know whether this is more likely an issue with jnius, an issue with OpenBLAS, or a general clash without an easy solution.
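
As a rough illustration of that workaround, the thread limit can also be set from Python, provided it happens before numpy is first imported. This is a sketch under assumptions: only `OMP_NUM_THREADS` comes from this thread; `OPENBLAS_NUM_THREADS` and `MKL_NUM_THREADS` are added here as the library-specific equivalents, and the helper name `limit_native_threads` is hypothetical. It sidesteps the crash rather than explaining it.

```python
import os


def limit_native_threads(n=1):
    """Cap thread pools of common BLAS/OpenMP runtimes via environment variables.

    Call this before numpy is first imported: these variables are generally
    read when the native library is loaded, not afterwards.
    """
    for name in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[name] = str(n)


limit_native_threads(1)

# Only now import the libraries involved in the crash (left commented out so
# the sketch runs without them installed):
# import jnius
# import numpy as np
# tmp = np.linalg.inv(np.random.rand(24, 24))
```

Setting the variable in the shell instead (`OMP_NUM_THREADS=1 python run.py`) has the same effect and avoids any dependence on import order.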

@tshirtman
Member

tshirtman commented Apr 25, 2020

Sorry for the lack of an answer; it's a bit obscure indeed, and no idea really came to mind.

  • I can't reproduce on Ubuntu 18.04 with the latest numpy and the latest pyjnius both built from source, but I can reproduce with numpy installed from the wheel (and jnius built from source). I can't reproduce with numpy built from source and the jnius release either.
  • As CentOS 7 is pretty old (but yes, it is used to build manylinux packages, it's a pita), it might be something about the old library versions in it. I tried with the latest pyjnius built on CentOS, but that didn't fix it. Building numpy there is less straightforward, but I have no particular reason to think that a recent version of it would fix it.

The fact that building numpy from source is what fixes it is a slight (and wishful) hint to me that the problem lies in numpy's build rather than in jnius, but I can't really be sure.

@seberg

seberg commented Apr 25, 2020

@tshirtman note that if you run on Ubuntu, I am not sure you get OpenBLAS by default right now; it might be ATLAS, or even no BLAS at all.
So you might not be using a (BLAS) threading layer (i.e. similar to OMP_NUM_THREADS=1). I expect that this whole thing has to do with multiple threading layers: the interaction between Java(?) threads and the threads created by Python/BLAS.

@tshirtman
Member

Hm, but if I install from the wheel on Ubuntu, I do get the segfault, just not when I build numpy from source. It's possible that it doesn't build with BLAS if I don't have it, though I didn't get any warning about it. I'll check.

@seberg

seberg commented Apr 25, 2020

You would not get a warning. Check np.__config__.show(), which shows what is picked up at compilation, so probably what a non-wheel build would use. The wheel ships with OpenBLAS.

EDIT: You may e.g. have a BLAS installed, but that may just be a link to ATLAS.
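
Because the build configuration only names the library that was linked, and a generic `libblas` may point at another implementation, it can be useful to look at which shared objects the running process has actually mapped. The following is a sketch for Linux only, based on reading `/proc/self/maps`; the function names and the keyword list are hypothetical choices made for illustration, and the listed path is typically the file a symlink resolves to.

```python
import re

# Substrings that suggest a BLAS/LAPACK implementation (an illustrative list).
_BLAS_HINT = re.compile(r"(openblas|mkl|atlas|blis|blas|lapack)", re.IGNORECASE)


def blas_libraries_in_maps(maps_text):
    """Return sorted unique paths of BLAS-like shared objects in maps-format text."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, and (for files) a path.
        parts = line.split(None, 5)
        if len(parts) != 6:
            continue  # anonymous mapping, no backing file
        path = parts[5]
        basename = path.rsplit("/", 1)[-1]
        if ".so" in basename and _BLAS_HINT.search(basename):
            found.add(path)
    return sorted(found)


def loaded_blas_libraries():
    """Inspect the current process; import numpy before calling this."""
    with open("/proc/self/maps") as fh:
        return blas_libraries_in_maps(fh.read())
```

Running `import numpy` followed by `loaded_blas_libraries()` in both the wheel and the source-built environment would show whether the two really load different BLAS implementations.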

@tshirtman
Member

tshirtman commented Apr 25, 2020

Built from source:

>>> numpy.show_config()
blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
atlas_3_10_blas_threads_info:
  NOT AVAILABLE
atlas_3_10_blas_info:
  NOT AVAILABLE
atlas_blas_threads_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
accelerate_info:
  NOT AVAILABLE
blas_info:
    libraries = ['blas', 'blas']
    library_dirs = ['/usr/lib/x86_64-linux-gnu']
    include_dirs = ['/usr/local/include', '/usr/include', '/home/gabriel/.virtualenvs/jnius37/include']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
    libraries = ['blas', 'blas']
    library_dirs = ['/usr/lib/x86_64-linux-gnu']
    include_dirs = ['/usr/local/include', '/usr/include', '/home/gabriel/.virtualenvs/jnius37/include']
    language = c
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
  NOT AVAILABLE
openblas_clapack_info:
  NOT AVAILABLE
flame_info:
  NOT AVAILABLE
atlas_3_10_threads_info:
  NOT AVAILABLE
atlas_3_10_info:
  NOT AVAILABLE
atlas_threads_info:
  NOT AVAILABLE
atlas_info:
  NOT AVAILABLE
lapack_info:
    libraries = ['lapack', 'lapack']
    library_dirs = ['/usr/lib/x86_64-linux-gnu']
    language = f77
lapack_opt_info:
    libraries = ['lapack', 'lapack', 'blas', 'blas']
    library_dirs = ['/usr/lib/x86_64-linux-gnu']
    language = c
    define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
    include_dirs = ['/usr/local/include', '/usr/include', '/home/gabriel/.virtualenvs/jnius37/include']

From wheel:

>>> numpy.show_config()
blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
