polyfit and eig regression tests fail after Windows 10 update to 2004 #16744
Comments
Edit: You are obviously using pip. I have also had a strange result on Windows AMD64 in the recent past with linear algebra libraries and eigenvalue decompositions (in the context of running tests for statsmodels). If you have time, could you try 32-bit Python and pip and see if you get the same issues? I couldn't see them on 32-bit Windows, but they were repeatable on 64-bit Windows. If I use conda, which ships MKL, I don't have the errors. Edit: I also see them when using NumPy 1.18.5 on Windows AMD64. |
Were the tests failing before the update? |
No @charris , before the update the test suite just passes. |
@speixoto Do you know which update it was specifically? I'd be interested to see if it solves my issue with pip-installed wheels. |
Update 1809 (10.0.17763) was not causing any failed test @bashtage |
1909 wasn't causing any failures either. It only started happening with 2004. |
I'm not 100% convinced whether it is 2004 or something after it. I think 2004 worked. FWIW I can't reproduce these crashes on CI (Azure or AppVeyor), but I do see them locally on 2 machines that are both AMD64 and updated to 2004. Have either of you tried to see if you get them on 32-bit Python? |
Seems there have been a number of problems reported against the 2004 update. Maybe this should be reported also? |
I just ran the following on fresh install of 1909 and 2004:
On 1909 no failures. On 2004 30 failures, all related to linear algebra functions. When I run tests on 2004 in a debugger, I notice that the first call to a function often returns an incorrect result, but calling it again produces the correct result (which remains correct if called repeatedly). Not sure whether this is useful information for guessing a cause. |
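The "first call wrong, later calls right" observation can be checked with a short script. This is only a sketch: it borrows the 13x13 matrix that is quoted further down in this thread, and the helper names `build_matrix` and `first_and_second_call_agree` are made up for illustration. On an unaffected system both calls should agree and the function should return True.

```python
import numpy as np


def build_matrix():
    # Matrix taken from the snippet quoted later in this thread.
    # The "% 17" step is the part that goes through fmod.
    a = np.arange(13 * 13, dtype=np.float64).reshape(13, 13)
    return a % 17


def first_and_second_call_agree():
    """Return True if two consecutive eigenvalue calls give the same result.

    Hypothetical helper: a LinAlgError on either call is treated as a
    disagreement, since that is how the failure shows up in the test logs.
    """
    a = build_matrix()
    try:
        first = np.linalg.eigvals(a)
        second = np.linalg.eigvals(a)
    except np.linalg.LinAlgError:
        return False
    # Sort before comparing so that eigenvalue ordering does not matter.
    return bool(np.allclose(np.sort_complex(first), np.sort_complex(second)))


print(first_and_second_call_agree())
```

Run it in a fresh interpreter each time, since the reports above suggest that only the first call after the modulo operation misbehaves.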
Do earlier versions of NumPy also have problems? I assume you are running 1.19.0. |
Using pip + 1.18.4, and scipy 1.4.1, I get the same set of errors. These are really common:
|
When I run using 1.18.5 + MKL I get no errors. It is hard to say whether this is likely an OpenBLAS bug or a Windows bug. Probably the latter, but it will be really hard to get to and diagnosing is beyond my capabilities for low-level debugging. On the same physical machine, when I run in Ubuntu using pip packages I don't see any errors. |
One final update: if I test using a source build of NumPy without optimized BLAS I still see errors although they are not an identical set. |
Maybe worth pinging the OpenBLAS devs. Does it happen with float32 as often as float64? |
When I run a full test of NumPy 1.19.0,
|
I think if you run only that test on its own it should pass, if I remember correctly from poking around, which makes the behavior even stranger. |
I have filed with Microsoft the only way I know how: Posting in case others find this through search: Windows users should use Conda/MKL if on 2004 until this is resolved |
Here is a small reproducing example:
Produces
Does LAPACK count from 0 or 1? All of the illegal values appear to be integers: |
It looks more and more like an OpenBLAS issue (or something between 2004 and OpenBLAS). Here is my summary: NumPy only
statsmodels testing
|
It would be nice to learn what changed in 2004. Maybe we need a different flag when compiling/linking the library? |
If it is an OpenBLAS bug, it is unlikely they will have seen it since none of the Windows-based CI are using build 19041 (aka Windows 10 2004) or later. |
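To tell at a glance whether a machine is on build 19041 or later, the build number can be read from the OS version string. A minimal sketch, assuming the string has the `10.0.<build>` shape used earlier in this thread (1809 is quoted as 10.0.17763); the function names are hypothetical. It only narrows things down: it cannot tell whether a later fix has been installed, because the version string carries no update revision.

```python
import platform

# Build number quoted in this thread for Windows 10 2004.
WINDOWS_10_2004_BUILD = 19041


def build_number(version: str) -> int:
    """Extract the build number from a string such as '10.0.19041'."""
    parts = version.split(".")
    if len(parts) < 3:
        raise ValueError(f"unexpected version string: {version!r}")
    return int(parts[2])


def is_2004_or_later(version: str) -> bool:
    """True if the build number is 19041 or higher."""
    return build_number(version) >= WINDOWS_10_2004_BUILD


# On Windows, platform.version() returns a string of this shape.
if platform.system() == "Windows":
    print(platform.version(), is_2004_or_later(platform.version()))
```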
Just to be clear, is it true that none of these reports involve WSL? |
No. All with either conda-provided |
Does the test fail if the environment variable |
I tried Atom, SandyBridge, Haswell, Prescott and Nehalem, all with identical results. |
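A sweep over core types like the one above could be scripted roughly as follows. The variable name is elided in this thread; the sketch assumes it is OpenBLAS's `OPENBLAS_CORETYPE`, and the helper names `run_with_coretype` and `sweep` are hypothetical. Each run uses a fresh child interpreter because the variable has to be set before the library is loaded.

```python
import os
import subprocess
import sys

# ASSUMPTION: the thread does not show the variable name; OPENBLAS_CORETYPE
# is assumed here.
CORETYPE_VAR = "OPENBLAS_CORETYPE"

# Core types listed in the comment above, spelled as in the thread.
CORE_TYPES = ["Atom", "SandyBridge", "Haswell", "Prescott", "Nehalem"]


def run_with_coretype(core_type: str, code: str) -> int:
    """Run `code` in a child Python with the core type forced.

    Returns the child's exit status (0 means the snippet ran cleanly).
    """
    env = dict(os.environ)
    env[CORETYPE_VAR] = core_type
    result = subprocess.run([sys.executable, "-c", code], env=env)
    return result.returncode


def sweep(code: str) -> dict:
    """Map each core type to the exit status of `code` under that setting."""
    return {core: run_with_coretype(core, code) for core in CORE_TYPES}
```

Passing a snippet that imports NumPy and calls the failing function, for example the `eig` call quoted later in this thread, turns a non-zero exit status into a per-core-type failure report.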
The strangest thing is that if you run
the second (and any further) calls to |
There is a new comment from Steve Wishnousky, dated 5 Dec, on "fmod(), after an update to Windows 2004, is causing a strange interaction with other code": An Insider Preview Windows build with the fix is now available. Build 20270 is now in the Dev Channel: announcing-windows-10-insider-preview-build-20270. If you are affected by this issue and want to test against the fix that will come via Windows Update, you can use this build. This is an insider build with other changes included, of course, so it is not entirely representative of the version that will be serviced, but it can be used to test for specific cases where the bug used to reproduce. We are still on schedule for servicing Windows build 19041 (the current release version) at the end of January 2021. |
Indeed, after updating to the latest Win 10 Dev Channel track (incl "Windows 10 Insider Preview 20270.1 (fe_release)"), Today this fix is available only in Dev Channel track, not yet in Beta nor Release Preview. |
Hopefully it will be downstream next month and this can be closed. |
I just updated from Windows 2004 to Windows 20H2 (OS Build: 19042.685), and unexpectedly, the import error under
>>> import numpy as np
>>> a = np.arange(13 * 13, dtype=np.float64)
>>> a.shape = (13, 13)
>>> a = a % 17
>>> va, ve = np.linalg.eig(a)
>>> va
array([ 1.03221168e+02 +0.j , -1.91843603e+01 +0.j ,
       1.82126812e+01 +0.j , -6.04004526e-01+15.84422474j,
       -6.04004526e-01-15.84422474j, -1.13692929e+01 +0.j ,
       -6.57612485e-01+10.41755503j, -6.57612485e-01-10.41755503j,
       1.06011014e+01 +0.j , 7.80732773e+00 +0.j ,
       -7.65390898e-01 +0.j , 2.87796761e-15 +0.j ,
       -1.26447204e-15 +0.j ])
I then rechecked my Not sure what's happening, but thought I'd let people know... :)
PS.
>>> import sys, numpy; print(numpy.__version__, sys.version)
1.19.4 3.8.6 | packaged by conda-forge | (default, Dec 22 2020, 09:52:49) [MSC v.1916 64 bit (AMD64)] |
The date on the binary is probably the first time it was built after it was last modified (Windows uses a lot of caching between builds to try and keep the total build time under 24 hours). So it would seem like the fix has made its way out now. |
It isn't out yet. Only in development builds. Using
|
conda-forge uses a newer OpenBLAS, which is why @h-vetinari's example passes. The standard pip install continues to fail because the fmod patch hasn't been distributed yet. |
According to the above mention, the NumPy 1.19.5 release uses a workaround for the Windows 2004 bug, instead of waiting for Microsoft to fix it.
|
The final fix is nearing release and is now in beta and release channels. Hopefully out next month. Windows 10 build 19042.782
|
The fix is now available in KB4598291. |
Thanks @bashtage for the first concise reproducer for this, that made it relatively simple to pinpoint the problem. Closing. |
OpenBLAS issue on Windows 10 2004 has been fixed (see numpy/numpy#16744 and OpenMathLib/OpenBLAS#2709)
Tests are failing:
FAILED ....\lib\tests\test_regression.py::TestRegression::test_polyfit_build - numpy.linalg.LinAlgError: SVD did not...
FAILED ....\linalg\tests\test_regression.py::TestRegression::test_eig_build - numpy.linalg.LinAlgError: Eigenvalues ...
FAILED ....\ma\tests\test_extras.py::TestPolynomial::test_polyfit - numpy.linalg.LinAlgError: SVD did not converge i...
with exceptions:
and
Steps taken:
pip install pytest
pip install numpy
pip install hypothesis
The same issue happens when running the tests in the repository.
Version 1.19.0 of numpy
Am I missing any dependencies? Or is it just Windows going bonkers?