On MacOSX, Mac M Hardware (ARM), a segmentation fault happened with YDF when pyarrow is installed #79

Open
lusis-ai opened this issue Mar 7, 2024 · 8 comments

Comments

@lusis-ai

lusis-ai commented Mar 7, 2024

Setup: macOS 13 or 14, Apple silicon (Mac M-series, ARM) hardware

Prerequisite: install miniforge3

% conda create --name ydfpandasissue
% conda activate ydfpandasissue
% conda install python=3.10
% conda install pandas
% pip install ydf-0.2.0-cp310-cp310-macosx_13_0_arm64.whl

Running this program (ydf_test.py) works:

import ydf
import pandas as pd
import numpy as np

dataset = {
    "x1": np.array([0, 0, 0, 1, 1, 1]),
    "x2": np.array([1, 1, 0, 0, 1, 1]),
    "y": np.array([0, 0, 0, 0, 1, 1]),
}

model = ydf.CartLearner(label="y", min_examples=1, task=ydf.Task.CLASSIFICATION).train(dataset)
print(model.describe())

Now install pyarrow, from either conda or pip. The result is the same: the program crashes.
Only the error message differs.

% conda install pyarrow
% python ydf_test.py
zsh: segmentation fault  python ydf_test.py
% conda uninstall pyarrow
% pip install pyarrow
% python ydf_test.py
libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
zsh: abort      python ydf_test.py

Note that pyarrow is mandatory for us when we work with large tabular datasets stored in Parquet files.
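
Because both failures above kill the interpreter, it can help to run the reproduction in a child process and inspect how it exited. The following is a minimal sketch, not part of the original report: `run_isolated` is a hypothetical helper name, and it relies on the POSIX behavior of `subprocess`, where a child terminated by a signal is reported with a negative return code.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 120.0) -> str:
    """Run `code` in a fresh interpreter and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code is the number of the
        # signal that terminated the child (e.g. SIGSEGV, SIGABRT).
        return f"killed by {signal.Signals(-proc.returncode).name}"
    if proc.returncode != 0:
        return f"exited with status {proc.returncode}"
    return "ok"


# Hypothetical usage: probe the imports from the report without
# taking down the calling process.
#   print(run_isolated("import pandas, ydf"))
```

In an affected environment, the segmentation fault and the abort reported above would be expected to show up as `killed by SIGSEGV` and `killed by SIGABRT` respectively.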

@rstz
Collaborator

rstz commented Mar 7, 2024

Thank you for the detailed report. I will have a look.

@lusis-ai
Author

lusis-ai commented Mar 7, 2024

A similar issue happens with tensorflow_decision_forests.

After installing tensorflow and tensorflow_decision_forests from pip (TF-DF is not available for ARM on conda), in the same configuration as above, the following error occurs (here in a Python terminal):

Python 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:35:25) [Clang 16.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow_decision_forests as tfdf
>>> import ydf
[mutex.cc : 453] RAW: Lock blocking 0x600001892898   @

@mowoe

mowoe commented Mar 8, 2024

I had the same issue, but I failed to make the connection to ydf.
As a temporary workaround, I switched to fastparquet, which is the other library pandas supports for reading Parquet files. It works fine for me.
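
For reference, pandas selects its Parquet backend through the `engine` argument of `read_parquet`. The sketch below illustrates the workaround under stated assumptions: `parquet_engine` is a hypothetical helper, `data.parquet` is a placeholder path, and the rule it encodes (use fastparquet only when pyarrow is absent from the environment) reflects the setups described in this thread rather than any documented requirement.

```python
import importlib.util


def _installed(module_name: str) -> bool:
    """Return True if the module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def parquet_engine(installed=_installed) -> str:
    """Return "fastparquet" if it is installed and pyarrow is not."""
    if installed("pyarrow"):
        # Assumption taken from this thread: the crash was reported
        # whenever pyarrow was present in the environment.
        raise RuntimeError("pyarrow is installed in this environment")
    if not installed("fastparquet"):
        raise RuntimeError("fastparquet is not installed")
    return "fastparquet"


# Hypothetical usage:
#   import pandas as pd
#   df = pd.read_parquet("data.parquet", engine=parquet_engine())
```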

@lusis-ai
Author

lusis-ai commented Mar 8, 2024

But the issue is still there when importing tensorflow or tensorflow_decision_forests.

We have utility libraries that import tensorflow, so it crashes with:

libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
zsh: abort      python ydf_test.py

@mowoe

mowoe commented Mar 8, 2024

python -c 'import pandas;import tensorflow;import tensorflow_decision_forests'

works fine in my venv, which has only fastparquet installed and not pyarrow.

@rstz
Collaborator

rstz commented Mar 8, 2024

To give some preliminary findings from the crash logs:

  • A protobuf incompatibility is the source of the crash. Both (conda-installed) pyarrow and (pip-installed) YDF depend on protobuf.
  • AFAICT, pyarrow links dynamically against libprotobuf 25.2. Unfortunately, there's a symbol overlap, so libprotobuf ends up calling into ydf. ydf has protobuf 24.3 statically linked. Since the two versions don't match, there's a crash.
  • There has to be an easy way to prevent this mess during compilation (suggestions, anyone?). I imagine one would just have to instruct ydf not to expose the protobuf symbols that confuse libprotobuf.
  • Very dirty solution (attached, UNTESTED, very experimental): if you compile ydf with protobuf 25.2, it actually seems to work. But we obviously cannot keep ydf in sync with every protobuf that's out there. ydf-0.2.0-cp310-cp310-macosx_14_0_arm64.whl.zip
  • It might also be that this is a conda-specific issue. I'd strongly prefer not to maintain a conda package alongside a pip package at this point, though.
  • I'll have to look into TF-DF separately.
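
The version mismatch described in these findings can be expressed as a small check. This is a sketch under assumptions, not something YDF ships: it assumes that the Python `protobuf` package at version `4.X.Y` corresponds to protobuf release `X.Y`, it takes 24.3 as the statically linked release from the findings above, and `release_of` and `matches_static` are hypothetical helper names.

```python
# Release statically linked into ydf 0.2.0, per the findings above.
STATIC_RELEASE = (24, 3)


def release_of(package_version: str) -> tuple[int, int]:
    """Map a Python protobuf version such as "4.25.3" to release (25, 3)."""
    parts = package_version.split(".")
    if len(parts) < 3 or not all(p.isdigit() for p in parts[:3]):
        raise ValueError(f"unexpected version string: {package_version!r}")
    if parts[0] != "4":
        # The 4.X.Y -> X.Y mapping is only assumed for the 4.x series.
        raise ValueError(f"mapping not assumed for {package_version!r}")
    return int(parts[1]), int(parts[2])


def matches_static(package_version: str, static=STATIC_RELEASE) -> bool:
    """Return True if the package maps to the statically linked release."""
    return release_of(package_version) == static
```

Under these assumptions, `matches_static("4.25.3")` is `False` and `matches_static("4.24.3")` is `True`.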

@lusis-ai
Author

lusis-ai commented Mar 8, 2024

Nice. We manage package consistency with conda, but inside a conda env we can also install packages with pip when needed. Tomorrow I will try using pip only, to check.

Thanks for your help

@lusis-ai
Author

lusis-ai commented Mar 9, 2024

Hi,

Thanks to your pointers, the pyarrow issue is resolved just by forcing protobuf to 4.24.3. Installing protobuf with conda is fine too, and now it works.

The strange thing is that, even though ydf has protobuf 24.3 statically linked, pip installs the latest version, 4.25.3. Installing ydf from pip does not pin protobuf to 4.24.3; it only requires protobuf>=3.14. Maybe that should be changed?

Anyway, after pinning it manually, it works now.

This does not fix TF-DF, which still crashes, so I cannot use the model.to_tensorflow_saved_model(path) function.
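
To confirm what a manual pin actually left in the environment, the installed versions can be read back with the standard library. A minimal sketch: `installed_versions` is a hypothetical helper, and the distribution names in the usage comment are assumed to match the names used with pip in this thread.

```python
from importlib import metadata


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Hypothetical usage:
#   print(installed_versions(
#       ["ydf", "pyarrow", "protobuf", "tensorflow",
#        "tensorflow_decision_forests"]))
```

This only reads package metadata and imports none of the packages, so it should be safe to run even in an environment where the imports themselves crash.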

copybara-service bot pushed a commit that referenced this issue Mar 12, 2024
This might resolve some compatibility issues, e.g. #79

PiperOrigin-RevId: 615051068