ENH: meson: implement BLAS/LAPACK auto-detection and many CI jobs #24899

Merged
rgommers merged 6 commits into numpy:maintenance/1.26.x on Oct 11, 2023

charris
Member

@charris charris commented Oct 10, 2023

Backport of #24893.

This reimplements the auto-detection and many of the switches that `numpy.distutils` offered. Beyond that, it implements several new features:

  • Auto-detect the symbol suffix for ILP64 OpenBLAS (can be none or `64_`)

  • MKL ILP64 support, threading control, and use of the layered library model for MKL >=2023.0

  • FlexiBLAS support (LP64 and ILP64)

  • Support for the upcoming Reference LAPACK standard for the `_64` ILP64 symbol suffix convention.

  • A test suite for BLAS/LAPACK libraries, covering:

    • OpenBLAS: LP64, ILP64 detected via pkg-config and with a "system dependency" (i.e., custom code inside Meson)
    • MKL: LP64, ILP64 (layered) and LP64 (SDL)
    • Accelerate: LP64, ILP64 on macOS >=13.3
    • FlexiBLAS: LP64, ILP64 on Fedora
    • ATLAS (LP64, via pkg-config only)
    • BLIS (LP64, via pkg-config only)
    • plain libblas/liblapack (Netlib, LP64 only)
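
To illustrate the symbol-suffix item above, here is a minimal sketch of how a suffix could be picked from a set of exported symbol names. This is not the Meson implementation; the function name `detect_symbol_suffix`, the probe symbol, and the candidate order are assumptions made for illustration only.

```python
def detect_symbol_suffix(exported_symbols, probe="cblas_dgemm",
                         candidates=("64_", "_64", "")):
    """Return the first suffix for which `probe + suffix` is exported.

    Hypothetical helper: `exported_symbols` is any iterable of symbol
    names (how they are obtained is outside the scope of this sketch).
    Returns None if the probe symbol is not found with any candidate suffix.
    """
    exported = set(exported_symbols)
    for suffix in candidates:
        if probe + suffix in exported:
            return suffix
    return None


# A library exporting suffixed ILP64 symbols
print(repr(detect_symbol_suffix({"cblas_dgemm64_", "dgemm_64_"})))
# A library exporting unsuffixed symbols
print(repr(detect_symbol_suffix({"cblas_dgemm", "dgemm_"})))
```

Suffixed candidates are tried first so that a library exporting both suffixed and plain names is reported with its suffix. Note that an empty suffix alone cannot tell LP64 and ILP64 builds apart.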

The list of libraries tried with the default 'auto' setting excludes several libraries because they are either no longer developed (ATLAS), not mature (libflame), or can't easily be tested and may be re-added later (ArmPL, ssl2). Those libraries can still be used quite easily via pkg-config.
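
As a rough way to check whether pkg-config can see a given library before pointing the build at it, something like the following could be used. This is only a sketch: the helper names are hypothetical, and the pkg-config module names passed in (for example `openblas` or `blis`) depend on how the distribution packages them.

```python
import shutil
import subprocess


def pkg_config_exists(name):
    """Return True/False if pkg-config can/cannot find `name`.

    Returns None when the pkg-config executable itself is not installed.
    """
    exe = shutil.which("pkg-config")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--exists", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_available(candidates):
    """Return the first candidate that pkg-config reports as present, else None."""
    for name in candidates:
        if pkg_config_exists(name):
            return name
    return None


# The module names below are assumptions; adjust them to your system.
print(first_available(["openblas", "blis"]))
```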

The new CI jobs are running by default right now. Once things settle down, the plan is to disable them by default and allow triggering them via a `[blas ci]` command in the commit message (just like for wheel builds).
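
The idea of a commit-message trigger can be sketched as below. This is not the actual CI workflow code; the function name and the default list of triggers are assumptions for illustration.

```python
def ci_triggers(commit_message, known=("[blas ci]", "[wheel build]"),
                first_line_only=False):
    """Return the known trigger strings found in a commit message.

    Hypothetical helper: with first_line_only=True, only the subject
    line of the commit message is searched.
    """
    lines = commit_message.splitlines()
    text = lines[0] if (first_line_only and lines) else commit_message
    return [trigger for trigger in known if trigger in text]


message = "BLD: tweak BLAS detection\n\nRun the extra jobs. [blas ci]"
print(ci_triggers(message))
print(ci_triggers(message, first_line_only=True))
```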

Docs will be included in a separate PR with the pending rewrite of all the build/install docs. For now, the CI jobs and the `meson_options.txt` file serve as guidance for how to use this.

Note that the test suite contains a few hacks, because of packaging bugs for MKL on PyPI (broken .pc files) and BLIS (missing .pc file in Debian).

Closes gh-24808
Closes gh-24846
Addresses the most important remaining task from the introduction of Meson (see gh-23981).

@charris added the labels 01 - Enhancement, 08 - Backport (used to tag backport PRs), 36 - Build (build-related PR), and Meson (items related to the introduction of Meson as the new build system for NumPy) on Oct 10, 2023
@charris added this to the 1.26.1 release milestone on Oct 10, 2023
@rgommers
Member

@charris you may want to squash-merge the two commits and only keep the commit message of the first one - including [wheel build] on the first line.

@charris
Member Author

charris commented Oct 10, 2023

Oddly, none of the build_tests ran.

That was because the second commit disabled them.

@charris
Member Author

charris commented Oct 11, 2023

I think both of these fail because they use ILP64:

  • Build_Test / blas64 (pull_request)
  • Build_Test / relaxed_strides_debug (pull_request)

We could probably delete them, but it would be nice to know where things go wrong.

Not clear why travis is failing, I don't think it should be affected.
Not clear why linux_musl is failing, it doesn't differ from main.
The azure-pipeline failure is probably ILP64 again.

It would be good to keep travis until we figure out why #24814 is failing.

@rgommers
Member

Okay, all the wheel builds are happy, and the failures are happening in jobs that aren't present on main anymore. I think all of them are due to using ILP64 in the setup.py build, which needs a tweak for the changes in npy_cblas.h. I hope that's the last time I'm looking at that setup.py code :)

@rgommers
Member

> Not clear why linux_musl is failing, it doesn't differ from main.

This is because the change to _distributor_init.py to support scipy-openblas wheels was not backported yet.

@rgommers
Member

rgommers commented Oct 11, 2023

I don't know what's up with the test_warnings failures on Azure:

______________________________ test_warning_calls ______________________________

    @pytest.mark.slow
    def test_warning_calls():
        # combined "ignore" and stacklevel error
        base = Path(numpy.__file__).parent
    
        for path in base.rglob("*.py"):
            if base / "testing" in path.parents:
                continue
            if path == base / "__init__.py":
                continue
            if path == base / "random" / "__init__.py":
                continue
            # use tokenize to auto-detect encoding on systems where no
            # default encoding is defined (e.g. LANG='C')
            with tokenize.open(str(path)) as file:
                tree = ast.parse(file.read())
>               FindFuncs(path).visit(tree)

...

self = <numpy.tests.test_warnings.FindFuncs object at 0x131263340>
node = <ast.Call object at 0x1312a3e80>

    def visit_Call(self, node):
        p = ParseCall()
        p.visit(node.func)
        ast.NodeVisitor.generic_visit(self, node)
    
>       if p.ls[-1] == 'simplefilter' or p.ls[-1] == 'filterwarnings':
E       IndexError: list index out of range

node       = <ast.Call object at 0x1312a3e80>
p          = <numpy.tests.test_warnings.ParseCall object at 0x1312638b0>
self       = <numpy.tests.test_warnings.FindFuncs object at 0x131263340>

That does not seem related to the last two commits.

EDIT: oh wait, it is:

/home/runner/work/numpy/numpy/venv-for-wheel/lib/python3.9/site-packages/numpy/linalg/setup.py:54: SyntaxWarning: 'tuple' object is not callable; perhaps you missed a comma?
  ('BLAS_SYMBOL_SUFFIX', '64_')
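
For context, here is a small sketch of how a missing comma between two tuples can lead to an empty name list in an AST visitor like the one in the traceback. The source string and the `NameCollector` class are simplified, hypothetical stand-ins, not the actual `setup.py` line or the `ParseCall` class from `numpy/tests/test_warnings.py`.

```python
import ast

# Hypothetical source with a missing comma between two tuples, similar in
# shape to the line flagged by the SyntaxWarning above.
SOURCE = "macros = [('HAVE_CBLAS', None) ('BLAS_SYMBOL_SUFFIX', '64_')]"


class NameCollector(ast.NodeVisitor):
    """Simplified stand-in for a visitor that records the called name."""

    def __init__(self):
        self.names = []

    def visit_Attribute(self, node):
        self.generic_visit(node)
        self.names.append(node.attr)

    def visit_Name(self, node):
        self.names.append(node.id)


def called_names(source):
    """Return (callee node type, collected names) for each call in `source`."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            collector = NameCollector()
            collector.visit(node.func)
            results.append((type(node.func).__name__, collector.names))
    return results


# The "call" here has a tuple as its callee, so no names are collected and
# indexing the list with [-1] would raise IndexError.
print(called_names(SOURCE))
# A regular call yields the dotted name parts.
print(called_names("warnings.simplefilter('ignore')"))
```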

@rgommers removed the 36 - Build (build-related PR) label on Oct 11, 2023
@rgommers
Member

The Musl thing would be much easier to diagnose with a decent traceback. @stefanv here is another example of spin swallowing tracebacks, as you asked for before. All that is visible in the CI log is:

$ /__w/numpy/numpy/test_env/bin/python -m pytest --rootdir=/__w/numpy/numpy/build-install/usr/lib/python3.10/site-packages -n auto -m 'not slow' numpy
ImportError while loading conftest '/__w/numpy/numpy/build-install/usr/lib/python3.10/site-packages/numpy/conftest.py'.
numpy/__init__.py:135: in <module>
    raise ImportError(msg) from e
E   ImportError: Error importing numpy: you should not try to import numpy from
E           its source directory; please exit the numpy source tree, and relaunch
E           your python interpreter from there.
Error: Process completed with exit code 4.

If I reproduce the issue locally and cd into build-install/.../site-packages and run python -c "import numpy" I get the expected traceback:

$ cd build-install/usr/lib/python3.9/site-packages/
$ python -c "import numpy"
Traceback (most recent call last):
  File "/home/rgommers/code/numpy/build-install/usr/lib/python3.9/site-packages/numpy/core/__init__.py", line 24, in <module>
    from . import multiarray
  File "/home/rgommers/code/numpy/build-install/usr/lib/python3.9/site-packages/numpy/core/multiarray.py", line 10, in <module>
    from . import overrides
  File "/home/rgommers/code/numpy/build-install/usr/lib/python3.9/site-packages/numpy/core/overrides.py", line 8, in <module>
    from numpy.core._multiarray_umath import (
ImportError: /home/rgommers/code/numpy/build-install/usr/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so: undefined symbol: cblas_cdotc_sub

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/rgommers/code/numpy/build-install/usr/lib/python3.9/site-packages/numpy/__init__.py", line 130, in <module>
    from numpy.__config__ import show as show_config
  File "/home/rgommers/code/numpy/build-install/usr/lib/python3.9/site-packages/numpy/__config__.py", line 4, in <module>
    from numpy.core._multiarray_umath import (
  File "/home/rgommers/code/numpy/build-install/usr/lib/python3.9/site-packages/numpy/core/__init__.py", line 50, in <module>
    raise ImportError(msg)
ImportError: 

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

    https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  * The Python version is: Python3.9 from "/home/rgommers/mambaforge/envs/numpy-dev-noblas/bin/python"
  * The NumPy version is: "1.26.0"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: /home/rgommers/code/numpy/build-install/usr/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so: undefined symbol: cblas_cdotc_sub


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/rgommers/code/numpy/build-install/usr/lib/python3.9/site-packages/numpy/__init__.py", line 135, in <module>
    raise ImportError(msg) from e
ImportError: Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch
        your python interpreter from there.
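
The local traceback above contains three linked exceptions, with the useful one (the undefined symbol) at the start of the chain. As a minimal sketch of how such a chain can be walked programmatically to recover the root error, assuming nothing about how spin or pytest handle it (the helper names and the shortened messages are illustrative only):

```python
def exception_chain(exc):
    """Return the list of exceptions linked via __cause__ / __context__."""
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        chain.append(exc)
        # Prefer the explicit cause ("raise ... from e"), then the implicit context.
        exc = exc.__cause__ or exc.__context__
    return chain


def simulate_import():
    """Raise a three-level chain shaped like the traceback above."""
    try:
        try:
            raise ImportError("undefined symbol: cblas_cdotc_sub")
        except ImportError:
            # Implicit chaining: "During handling of the above exception..."
            raise ImportError("Importing the numpy C-extensions failed.")
    except ImportError as e:
        # Explicit chaining: "The above exception was the direct cause..."
        raise ImportError("Error importing numpy (stand-in message)") from e


try:
    simulate_import()
except ImportError as err:
    messages = [str(e) for e in exception_chain(err)]
    print(messages[-1])  # the root error carries the actual diagnosis
```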

This should fix the MUSL CI job failure after the Meson auto-BLAS/LAPACK
update.
@rgommers
Member

rgommers commented Oct 11, 2023

All green now and the commits are all backports that look fine, so in it goes. Thanks Chuck!

I'll propose content for the release notes as well. EDIT: see gh-24902

@rgommers rgommers merged commit ff3187a into numpy:maintenance/1.26.x Oct 11, 2023
58 checks passed