segmentation fault when trying to fetch dataset #17

Open
mika-data opened this issue Feb 4, 2023 · 15 comments

Comments

@mika-data

mika-data commented Feb 4, 2023

root@server:~/Downloads# python3 -c "import sling; sling.which()"
SLING API version 3.0.0 (Sat Feb  4 11:59:11 2023) in /usr/local/lib/python3.9/dist-packages/sling


root@server:~/Downloads# sling fetch --dataset caspar
[2023-02-04 11:59:34.807193: I sling/pyapi/pytask.cc:525] Start HTTP server on port 6767
[2023-02-04 11:59:34.813720: I run.py:341] Execute command fetch
*** Signal 11 (Segmentation fault) at 0x0000020b26f0 for 0x0000020b26f0
  @ 0x0000020b26f0 (unknown)

**Segmentation fault**

root@server:~/Downloads# cat /etc/*release
PRETTY_NAME="**Debian GNU/Linux 11** (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
@mika-data
Author

My fault: so far I have only installed the Python API via pip.

The command line interpreter will probably work only for a small subset of commands.

@ringgaard
Owner

@mika-data: Did you try to build the Python API yourself on your Debian machine? I normally build on Ubuntu, but I would think the differences are minor.

@mika-data
Author

No, I downloaded the Python API as a wheel (.whl), as recommended in your installation documentation.

Previously I only had Python 3.9 on my machine. After I built Python 3.6 and then built SLING from source, everything seems to work for me.

root@server:/usr# python3 -c "import sling; sling.which()"
SLING API version 3.0.0 (Sat Feb  4 13:50:30 2023) in /usr/local/python-3.6.15/lib/python3.6/site-packages/sling
root@cgnvision:/usr# sling fetch --dataset caspar
[2023-02-04 13:51:48.997469: I sling/pyapi/pytask.cc:525] Start HTTP server on port 6767
[2023-02-04 13:51:49.007196: I run.py:341] Execute command fetch
[2023-02-04 13:51:49.008710: I sling/task/job.cc:349] All systems GO
[2023-02-04 13:51:49.008867: I sling/task/job.cc:62] Starting stage #0
[2023-02-04 13:51:49.008945: I sling/task/job.cc:66] Start url-download
[2023-02-04 13:51:49.009773: I download.py:51] Download caspar from https://ringgaard.com/data/caspar/caspar.flow
[2023-02-04 13:51:49.009979: I download.py:78] Start download of ./data/e/caspar/caspar.flow
[2023-02-04 13:51:49.741937: I download.py:94] caspar downloaded
[2023-02-04 13:51:49.742027: I sling/task/job.cc:402] Task url-download completed
[2023-02-04 13:51:49.742188: I sling/task/job.cc:407] Task url-download done
[2023-02-04 13:51:49.742255: I sling/task/job.cc:419] Stage #0 done
[2023-02-04 13:51:49.743633: I workflow.py:821] sending final status to monitor
[2023-02-04 13:51:49.743902: I run.py:351] Done

@ringgaard
Owner

Hmm, maybe I should test this on Python 3.9. My Ubuntu only has 3.8.

@mika-data
Author

I have tested it on another Debian machine. There it worked fine:

(wikidata) mika@server:~/Programming/wikidata$ pip3 install https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl
Collecting sling==3.0.0
  Downloading https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl (7.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.4/7.4 MB 3.9 MB/s eta 0:00:00
Installing collected packages: sling
Successfully installed sling-3.0.0
(wikidata) mika@server:~/Programming/wikidata$ python3 -c "import sling; sling.which()"
SLING API version 3.0.0 (Sat Feb  4 20:58:48 2023) in /home/mika/anaconda3/envs/wikidata/lib/python3.8/site-packages/sling
(wikidata) mika@server:~/Programming/wikidata$ sling fetch --dataset caspar
[2023-02-04 20:59:03.350186: I sling/pyapi/pytask.cc:525] Start HTTP server on port 6767
[2023-02-04 20:59:03.354802: I run.py:341] Execute command fetch
[2023-02-04 20:59:03.355687: I sling/task/job.cc:349] All systems GO
[2023-02-04 20:59:03.355815: I sling/task/job.cc:62] Starting stage #0
[2023-02-04 20:59:03.355821: I sling/task/job.cc:66] Start url-download
[2023-02-04 20:59:03.356144: I download.py:51] Download caspar from https://ringgaard.com/data/caspar/caspar.flow
[2023-02-04 20:59:03.356218: I download.py:78] Start download of ./data/e/caspar/caspar.flow
[2023-02-04 20:59:05.813746: I download.py:94] caspar downloaded
[2023-02-04 20:59:05.813851: I sling/task/job.cc:402] Task url-download completed
[2023-02-04 20:59:05.814257: I sling/task/job.cc:407] Task url-download done
[2023-02-04 20:59:05.814305: I sling/task/job.cc:419] Stage #0 done
[2023-02-04 20:59:05.816389: I workflow.py:821] sending final status to monitor
[2023-02-04 20:59:05.816896: I run.py:351] Done
(wikidata) mika@blackbrain:~/Programming/wikidata$ cat /etc/*release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
(wikidata) mika@server:~/Programming/wikidata$ python -V
Python 3.8.16

@ringgaard
Owner

So it seems to work on Python 3.8, but fails on Python 3.9, right?

@mika-data
Author

Yes, I can confirm the bug on a second machine:

mika@server:Downloads$ pip3 install https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl
Collecting sling==3.0.0
  Using cached https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl (7.4 MB)
Installing collected packages: sling
Successfully installed sling-3.0.0
mika@server:Downloads$ python3 -c "import sling; sling.which()"
SLING API version 3.0.0 (Sat Feb  4 21:48:31 2023) in /home/mika/.local/lib/python3.9/site-packages/sling
mika@server:Downloads$ sling fetch --dataset caspar
[2023-02-04 21:48:43.254199: I sling/pyapi/pytask.cc:525] Start HTTP server on port 6767
[2023-02-04 21:48:43.256684: I run.py:341] Execute command fetch
*** Signal 11 (Segmentation fault) at 0x0000019d0660 for 0x0000019d0660
  @ 0x0000019d0660 (unknown)
**Speicherzugriffsfehler** (German for "Segmentation fault")
mika@server:Downloads$ py -V
Python 3.9.2
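
A crash like the one above can also be detected from a script instead of reading the shell's message. The following is a minimal sketch and not part of SLING: run_and_report is a hypothetical helper that runs a command in a child process, turns on CPython's fault handler through the PYTHONFAULTHANDLER environment variable, and reports whether the child was killed by a signal. It assumes a POSIX system. Whether SLING's own signal handler leaves room for the fault handler's traceback is not something this thread establishes.

```python
import os
import signal
import subprocess
import sys


def run_and_report(argv):
    """Run argv as a child process and describe how it ended."""
    # PYTHONFAULTHANDLER=1 asks CPython to dump a Python traceback on fatal signals.
    env = dict(os.environ, PYTHONFAULTHANDLER="1")
    proc = subprocess.run(argv, env=env, capture_output=True, text=True)
    if proc.returncode < 0:
        # On POSIX, a negative return code is the number of the terminating signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exited with code %d" % proc.returncode


# Harmless demonstration with a child interpreter that does nothing.
print(run_and_report([sys.executable, "-c", "pass"]))

# To check the command from this issue (assumes sling is on PATH):
#   print(run_and_report(["sling", "fetch", "--dataset", "caspar"]))
```

Running the same helper under each installed interpreter would make it easier to compare the Python 3.8 and Python 3.9 results side by side.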

@ringgaard
Owner

Let me try to see if I can reproduce this on one of my own machines.

@ringgaard
Owner

I can now reproduce the crash. It seems to have something to do with Python type registration in the pysling C extension when running in Python 3.9.

@ringgaard
Owner

It seems like you need to build pysling.so using python3.9-dev for it to work with Python 3.9, so I have added support for building pysling.so for Python 3.9. You change DPYVER=36 to DPYVER=39 and rebuild using tools/buildall.sh. It seems like the 3.9 version can be used with earlier versions of Python, but I haven't changed the default yet because I don't have Python 3.9 on all the machines that build the code.
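
For anyone who wants a quick sanity check before running a build, here is a small sketch that compares the Python version a build targeted with the interpreter that is running. It is not part of SLING. Both helper names are hypothetical, and the parsing assumes that a DPYVER value is the major version digit followed by the minor version, which the values 36 and 39 suggest. The rule "warn when the interpreter is newer than the build target" is taken from the observation in this thread and is conservative: the default build also worked on Python 3.8 here.

```python
import sys


def parse_pyver(tag):
    """Turn a compact version tag such as "36" or "39" into (major, minor)."""
    if not tag.isdigit() or len(tag) < 2:
        raise ValueError("unexpected version tag: %r" % tag)
    # Assumption: the first digit is the major version, the rest is the minor.
    return int(tag[0]), int(tag[1:])


def check_interpreter(built_tag, running=None):
    """Return a warning if the interpreter is newer than the build target."""
    built = parse_pyver(built_tag)
    # Default to the interpreter executing this script.
    running = tuple(running or sys.version_info[:2])
    if running > built:
        return "extension built for %d.%d, running %d.%d" % (built + running)
    return None


# Example: a build targeting Python 3.6, checked against this interpreter.
print(check_interpreter("36"))
```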

@meerfrau

When you compile from source against Python 3.10 and don't dockerize (pip/venv/...), you might get:

Compiling sling/pyapi/pyapi.cc failed: (Exit 1): gcc failed: error executing command (from target //sling/pyapi:pyapi) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 25 arguments skipped)
In file included from ./sling/pyapi/pyarray.h:19,
                 from sling/pyapi/pyapi.cc:17:
./sling/pyapi/pybase.h: In static member function 'static sling::Text sling::PyBase::GetText(PyObject*)':
./sling/pyapi/pybase.h:130:37: error: invalid conversion from 'const char*' to 'char*' [-fpermissive]
  130 |       data = PyUnicode_AsUTF8AndSize(obj, &length);
      |              ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
      |                                     |
      |                                     const char*

Sadly I don't know much C++, but isn't this just a matter of permitting the type conversion?

@ringgaard
Owner

Are you using the newest version of the code? Line 130 of pybase.h does not match your error message.

Is there any reason that you cannot use the pre-built wheel?

@meerfrau

meerfrau commented Apr 20, 2023

I've changed the includes in pybase.h to:

#include <python3.10/Python.h>
#include <python3.10/structmember.h>

> Is there any reason that you cannot use the pre-built wheel?

To see the error ;)

@meerfrau

meerfrau commented Apr 20, 2023

I'm sorry, the current sources work perfectly against Python 3.10!

PS: I installed it via sudo ln -s ./sling/python /usr/lib/python3.10/site-packages/sling. Could you please add a setup.py for people like me?

@ringgaard
Owner

@meerfrau: I use wheels instead of setuptools, so you can install SLING with the following command:

sudo pip3 install https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl

I have updated the code to support Python 3.10 by changing DPYVER=36 to DPYVER=310.
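
One detail that may explain why pip installed the wheel on Python 3.9 without complaint: the compatibility tags in the filename. The sketch below splits a wheel filename according to the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). The function name parse_wheel_filename is hypothetical and the parsing is simplified. For the SLING wheel the tags come out as py3, none, and linux_x86_64, so the filename declares no dependency on a particular CPython minor version or ABI.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError("unexpected wheel filename: %r" % filename)
    name, version, python_tag, abi_tag, platform_tag = parts
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# The wheel filename from the install command in this thread.
tags = parse_wheel_filename("sling-3.0.0-py3-none-linux_x86_64.whl")
print(tags["python"], tags["abi"], tags["platform"])
```

With an ABI tag of none, pip has no basis for rejecting the wheel on an interpreter that the bundled extension was not built for, so a mismatch only shows up at run time.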
