Test failures for gges and qz for float32 input in macOS CI #16949
Comments
The one change in that build is conda-forge/openblas-feedstock#146. Most of that is |
At first sight these tests are written correctly, but it's not clear whether this is a bug in OpenBLAS or in SciPy. We've had a lot of problems with |
Note that |
On
The |
How do I run a specific test locally from
|
Depends on how you've installed or built SciPy. If a dev build according to the docs with
If you have SciPy installed, e.g. from conda-forge, then you can use
|
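The exact commands in the reply above were lost, so here is a minimal sketch for the installed-package case, assuming pytest is available in the same environment. It relies on pytest's `--pyargs` and `-k` options; the helper names `build_pytest_args` and `run_selected_tests` are hypothetical, and the `gges` keyword is only an assumption about how the failing tests are named.

```python
def build_pytest_args(module, keyword=None):
    """Build a pytest argument list that selects tests from an installed package."""
    # --pyargs makes pytest treat the argument as an importable module name
    args = ["--pyargs", module]
    if keyword:
        # -k keeps only the tests whose names match the expression
        args += ["-k", keyword]
    return args


def run_selected_tests(module, keyword=None):
    """Run the selected tests in-process and return pytest's exit code."""
    import pytest  # imported lazily so that building the arguments does not need pytest

    return pytest.main(build_pytest_args(module, keyword))


# Assumed selection: the LAPACK wrapper tests whose names mention gges.
args = build_pytest_args("scipy.linalg.tests.test_lapack", "gges")
print("pytest", " ".join(args))
```

Calling `run_selected_tests("scipy.linalg.tests.test_lapack", "gges")` would then run only that subset against whichever SciPy and OpenBLAS are installed in the active environment.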
On arm64, with the newer builds of openblas (with only the automatic rerendering changes) from here: conda-forge/openblas-feedstock#147 I can reproduce this error by
but not the other.
Environment:
|
I am going to upload all the artifacts to my personal channel. I can then test them in a CI (I no longer have access to osx-64 machines easily at home). Done: https://anaconda.org/ngam/openblas/files ( |
Testing the osx-64 here: https://github.com/ngam/test_openblas_scipy |
It seems only some of the tests could be reproduced. Not sure what's going on. @rgommers feel free to copy my repo or I can add you to it to edit as you wish https://github.com/ngam/test_openblas_scipy |
The x86-64 build log has a bunch of
No idea if those are relevant though, such warnings are par for the course for old Fortran code. I think the last fix in this set of wrappers was gh-13397. @ilayn any thoughts here? |
If it were a wrapper issue, we would probably see a segfault rather than a precision problem, and the precision violation is only very slightly increased. So I am cautiously looking at OpenBLAS, but can't be sure. @martin-frbg did anything change regarding |
@ilayn nothing comes to mind except the general update to Reference-LAPACK 3.10.1. Do you see these problems only on macOS? |
@martin-frbg yes, they started appearing on macOS when gfortran was upgraded from version |
xref scipygh-16949. This doesn't fix the root cause of the problem, but stops the CI job from failing. [skip azp] [skip circle]
The issue in CI was fixed by |
Seems like everything being tested passes now, but perhaps with two |
The failing tests were skipped. You could try backing out this change and testing again |
Trying now on the M1 in the GCC compile farm and pytest/scipy/openblas installed via pip. Both pytest runs pass with either the original (0.3.18) libopenblas.0.dylib or my own build from current |
test_lapack failures now reproduced on my macbook with the conda 0.3.21, have not checked older or newer version yet. (as an aside, running just "conda install numpy" without additional qualifiers appears to select a mkl build on this platform) |
That macbook is sloooow, but current indication is that the issue is already fixed in current |
Thanks. So if I understand correctly, this may be solved by packaging an interim OpenBLAS 0.3.21+ off the |
Yes, and/or get conda-forge to rebuild their openblas package with -fno-tree-vectorize set for gfortran. |
Once #17950 goes in, it could be followed up by using the recently uploaded v0.3.20-571-g3dec11c6, which is the current HEAD of develop. It is hash |
Oops, I must have forgotten to do the mergeback from the release branch then, sorry. (Not sure if it is still safe to do now, or if it is just one more reason to get 0.3.22 out ASAP) |
I think changing it now has more potential to do damage than to help, I just wanted to explain why the new lib build has an old name. Thanks for digging into this. |
Now that #17950 is merged, it should be easier to try to upgrade OpenBLAS to |
I noticed that those OpenBLAS v0.3.20 tarballs were available. I don't understand what's going on with the versioning. Why are those tarballs available when 0.3.21 should be more recent? |
I tried to explain the versioning above. TL;DR: |
see mattip's post above #16949 (comment) - I messed up by forgetting to merge the 0.3.21 tag from the release-0.3 branch back onto |
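For readers puzzled by the naming, the string v0.3.20-571-g3dec11c6 quoted earlier has the shape of `git describe` output: nearest reachable tag, number of commits since that tag, and an abbreviated commit hash prefixed with `g`. The sketch below only splits such a string into its parts; the function name `parse_git_describe` is hypothetical, and reading the string this way is an assumption based on its format rather than something the thread states.

```python
import re

# <tag>-<commits since tag>-g<abbreviated hash>, as printed by `git describe`
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<commits>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_git_describe(version):
    """Split a git-describe style string into (tag, commits_since_tag, short_hash)."""
    match = DESCRIBE_RE.match(version)
    if match is None:
        raise ValueError(f"not a git-describe style string: {version!r}")
    return match.group("tag"), int(match.group("commits")), match.group("sha")


tag, commits, sha = parse_git_describe("v0.3.20-571-g3dec11c6")
print(f"nearest reachable tag: {tag}")
print(f"commits since that tag: {commits}")
print(f"abbreviated commit hash: {sha}")
```

Read this way, a build can be newer than the 0.3.21 release and still carry a 0.3.20-based name, because the name is derived from the most recent tag reachable from the commit that was built.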
Does numpy/scipy have a policy on using non release versions of OpenBLAS? I have a |
Any version that works is fine I think, no policy beyond that. A dev version with a fix we need is also not much different from picking a release version and applying some patches to it. |
I re-enabled the tests in A few things came to my attention during the build phase:
It would be nice if the latter were fixed upstream. I don't know about the former. I'll make a PR to update the OpenBLAS version. Hopefully there are no problems with any other platform/machine. |
The boost warnings will probably go away after the switch to
That's bad, perhaps something went wrong with code signing in that OpenBLAS build? |
The lexical_cast warnings are fixed in the next boost release. To install the OpenBLAS bundle I simply expanded the tarball and copied I don't know how anything is signed in scipy world. There is a whole bunch of stuff one can do to notarise/sign/etc macOS binaries. Do we do anything along those lines for the openblas libs, or indeed the wheels we make? |
#18012 was merged. Did that solve the issue? |
Yes, it was solved for 1.11, but won't be backported to the 1.10 branch. |
These started showing up consistently over the last day (example log). Only in the macOS tests (meson) job. That job uses conda-forge, the root cause is very likely the new build (0.23.1 build _3) of OpenBLAS pushed to https://anaconda.org/conda-forge/openblas about 16 hours ago.