Bus error upon exiting the program after using spaCy on Mac M1 #13204

Open
koder-ua opened this issue Dec 19, 2023 · 7 comments
Labels
osx Issues related to macOS / OSX

Comments

@koder-ua

How to reproduce the behaviour

On an M1 Mac, any code that uses spaCy to parse a doc fails on exit with the following error (I can only test on my laptop; the same code works fine on a Linux machine):

[1] 73089 bus error

This happens with both the sm and trf models.

Your Environment

Info about spaCy

  • spaCy version: 3.7.2
  • Platform: macOS-14.1.2-arm64-arm-64bit
  • Python version: 3.11.3
  • Pipelines: en_core_web_sm (3.7.0), en_core_web_md (3.7.1), en_core_web_trf (3.7.2), en_core_web_lg (3.7.0)
@koder-ua
Author

Here is some binary traceback info:

https://gist.github.com/koder-ua/8fd3e3fd795674b01d1ddbeda9400999

@adrianeboyd
Contributor

Thanks for the report!

The info provided makes this look specific to the trf model, in particular curated-tokenizers. If you have a minute, could you create a new venv without installing torch and with only the en_core_web_sm model and see if you still get the same error?
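
The isolation step suggested here could be scripted roughly as follows. This is a minimal sketch, not a maintainer-provided procedure: the `spacy_clean` directory name and the `clean_env_commands` helper are hypothetical, and the POSIX `bin/python` venv layout (macOS/Linux) is assumed. It only builds and prints the commands; actually running them needs network access.

```python
import shlex
import sys
from pathlib import Path

# One-liner equivalent to the reproduction used in this thread.
REPRO = 'import spacy; nlp = spacy.load("en_core_web_sm"); nlp("some text")'


def clean_env_commands(env_dir="spacy_clean"):
    """Return commands that build a venv with only spaCy and en_core_web_sm."""
    # Assumption: POSIX venv layout, where the interpreter lives in bin/python.
    env_python = str(Path(env_dir) / "bin" / "python")
    return [
        [sys.executable, "-m", "venv", env_dir],                   # fresh venv, no torch
        [env_python, "-m", "pip", "install", "spacy"],             # spaCy only
        [env_python, "-m", "spacy", "download", "en_core_web_sm"], # small model only
        [env_python, "-c", REPRO],                                 # run the reproduction
    ]


if __name__ == "__main__":
    for cmd in clean_env_commands():
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch free of side effects; each list can be passed to `subprocess.run` once the target directory is confirmed.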

@adrianeboyd added the osx (Issues related to macOS / OSX) label on Dec 20, 2023
@koder-ua
Author

koder-ua commented Dec 20, 2023

@adrianeboyd yep, it seems you're right.
On a clean Python 3.11 with only spaCy and en_core_web_sm installed, everything works fine.

python3.11 with only spacy and en_core_web_sm

~ python -c 'import spacy; npl = spacy.load("en_core_web_sm"); npl("some text")'
~

python3.11 with pytorch & co

✗ python -c 'import spacy; npl = spacy.load("en_core_web_sm"); npl("some text")'
[1]    54694 bus error  python -c

Yet just installing the trf model (which also installs torch & co) did not cause the issue to appear:

(python311_clean) ➜  ~ python -c 'import spacy; npl = spacy.load("en_core_web_sm"); npl("some text")'
(python311_clean) ➜  ~ python -c 'import spacy; npl = spacy.load("en_core_web_trf"); npl("some text")'
(python311_clean) ➜  ~
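
Because the crash only happens as the interpreter exits, it can be easier to compare environments by running the one-liner in a child process and inspecting how it terminated. The following is a minimal sketch under that assumption; `run_snippet` and `describe_exit` are hypothetical helper names, and the signal decoding relies on the POSIX convention that `subprocess` reports death by signal N as return code -N.

```python
import signal
import subprocess
import sys


def run_snippet(code):
    """Run `code` in a fresh interpreter and return the child's return code."""
    return subprocess.run([sys.executable, "-c", code]).returncode


def describe_exit(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode < 0:
        # On POSIX, a negative value -N means the child was killed by signal N.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    repro = 'import spacy; nlp = spacy.load("en_core_web_sm"); nlp("some text")'
    print(describe_exit(run_snippet(repro)))
```

Running this with the interpreter of each environment gives a result that can be compared directly, instead of relying on the shell's `bus error` message.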

@adrianeboyd
Contributor

What happens if you also install sentencepiece in the new venv?

@koder-ua
Author

All fine

(python311_clean) ➜  ~ pip install sentencepiece
Collecting sentencepiece
  Downloading sentencepiece-0.1.99-cp311-cp311-macosx_11_0_arm64.whl (1.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 19.0 MB/s eta 0:00:00
Installing collected packages: sentencepiece
Successfully installed sentencepiece-0.1.99
(python311_clean) ➜  ~ python -c 'import spacy; npl = spacy.load("en_core_web_trf"); npl("I have some text")'
(python311_clean) ➜  ~ python -c 'import spacy; npl = spacy.load("en_core_web_sm"); npl("I have some text")'
(python311_clean) ➜  ~

@adrianeboyd
Contributor

In general this seems to be a known issue related to sentencepiece, which is vendored in curated-tokenizers. I'm not currently sure exactly which conditions are necessary for you to run into it in practice, though.
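
Since the exact triggering conditions are unclear, one way to narrow them down is to record which of the packages mentioned in this thread are installed in the failing and the working environment. This is only a sketch: the package list is an assumption drawn from the discussion above, and `installed_versions` is a hypothetical helper built on the standard library's `importlib.metadata`.

```python
from importlib import metadata

# Distributions named in this thread; extend as needed.
PACKAGES = ["spacy", "torch", "sentencepiece", "curated-tokenizers"]


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Diffing the output from the two environments shows whether a standalone sentencepiece sits alongside the copy vendored in curated-tokenizers.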

@danieldk
Contributor

I think this is the same issue as google/sentencepiece#579. I am not sure, though, why the sentencepiece library is loaded, since we link sentencepiece statically.

At any rate, the error comes from destructing absl::Flag. absl::Flag is not needed for library use of sentencepiece, but it tends to creep back in as a dependency. I'll see if we can remove it in curated-tokenizers, which should avoid conflicts between different versions of sentencepiece.
