test_Template_with_model_2D fails on aarch64 #902

Open
ggardet opened this issue Jun 20, 2023 · 10 comments

@ggardet

ggardet commented Jun 20, 2023

test_Template_with_model_2D fails on openSUSE Tumbleweed aarch64 with:

[  162s] =================================== FAILURES ===================================
[  162s] _________________________ test_Template_with_model_2D __________________________
[  162s] 
[  162s]     @pytest.mark.skipif(not scipy_available, reason="scipy.stats is needed")
[  162s]     def test_Template_with_model_2D():
[  162s]         truth1 = (1.0, 0.1, 0.2, 0.3, 0.4, 0.5)
[  162s]         x1, y1 = mvnorm(*truth1[1:]).rvs(size=int(truth1[0] * 1000), random_state=1).T
[  162s]         truth2 = (1.0, 0.2, 0.1, 0.4, 0.3, 0.0)
[  162s]         x2, y2 = mvnorm(*truth2[1:]).rvs(size=int(truth2[0] * 1000), random_state=1).T
[  162s]     
[  162s]         x = np.append(x1, x2)
[  162s]         y = np.append(y1, y2)
[  162s]         w, xe, ye = np.histogram2d(x, y, bins=(3, 5))
[  162s]     
[  162s]         def model(xy, n, mux, muy, sx, sy, rho):
[  162s]             return n * 1000 * mvnorm(mux, muy, sx, sy, rho).cdf(np.transpose(xy))
[  162s]     
[  162s]         x3, y3 = mvnorm(*truth2[1:]).rvs(size=int(truth2[0] * 10000), random_state=2).T
[  162s]         template = np.histogram2d(x3, y3, bins=(xe, ye))[0]
[  162s]     
[  162s]         cost = Template(w, (xe, ye), (model, template))
[  162s]         assert cost.ndata == np.prod(w.shape)
[  162s]         m = Minuit(cost, *truth1, 1)
[  162s]         m.limits["x0_n", "x0_sx", "x0_sy"] = (0, None)
[  162s]         m.limits["x0_rho"] = (-1, 1)
[  162s]         m.migrad()
[  162s]         assert m.valid
[  162s] >       assert_allclose(m.values, truth1 + (1e3,), rtol=0.1)
[  162s] 
[  162s] tests/test_cost.py:1362: 
[  162s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[  162s] 
[  162s] args = (<function assert_allclose.<locals>.compare at 0xffffa2185af0>, array([1.99892738, 0.15563798, 0.14635371, 0.36308616, 0.36053843,
[  162s]        0.18792633, 2.69653611]), array([1.e+00, 1.e-01, 2.e-01, 3.e-01, 4.e-01, 5.e-01, 1.e+03]))
[  162s] kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=0.1, atol=0', 'verbose': True}
[  162s] 
[  162s]     @wraps(func)
[  162s]     def inner(*args, **kwds):
[  162s]         with self._recreate_cm():
[  162s] >           return func(*args, **kwds)
[  162s] E           AssertionError: 
[  162s] E           Not equal to tolerance rtol=0.1, atol=0
[  162s] E           
[  162s] E           Mismatched elements: 6 / 7 (85.7%)
[  162s] E           Max absolute difference: 997.30346389
[  162s] E           Max relative difference: 0.99892738
[  162s] E            x: array([1.998927, 0.155638, 0.146354, 0.363086, 0.360538, 0.187926,
[  162s] E                  2.696536])
[  162s] E            y: array([1.e+00, 1.e-01, 2.e-01, 3.e-01, 4.e-01, 5.e-01, 1.e+03])
[  162s] 
[  162s] /usr/lib64/python3.9/contextlib.py:79: AssertionError
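The summary numbers in this log can be reproduced by hand. The sketch below assumes the element-wise criterion `|x - y| <= atol + rtol * |y|` that `assert_allclose` applies, and uses the values printed above; the parameter labels are taken from the model signature, with "template" added here only for readability.

```python
# Values copied from the failure log above.
fitted = [1.99892738, 0.15563798, 0.14635371, 0.36308616,
          0.36053843, 0.18792633, 2.69653611]
expected = [1.0, 0.1, 0.2, 0.3, 0.4, 0.5, 1e3]
# Labels follow the model signature; "template" is an assumed label for the last value.
names = ["n", "mux", "muy", "sx", "sy", "rho", "template"]
rtol, atol = 0.1, 0.0

abs_diff = [abs(f - e) for f, e in zip(fitted, expected)]
rel_diff = [d / abs(e) for d, e in zip(abs_diff, expected)]
# An element mismatches when it violates |x - y| <= atol + rtol * |y|.
mismatched = [d > atol + rtol * abs(e) for d, e in zip(abs_diff, expected)]

for name, f, e, bad in zip(names, fitted, expected, mismatched):
    status = "MISMATCH" if bad else "ok"
    print(f"{name:>8}: fitted={f:<12.8f} expected={e:<8g} {status}")

print("mismatched elements:", sum(mismatched), "/", len(fitted))
print("max absolute difference:", max(abs_diff))
print("max relative difference:", max(rel_diff))
```

Only `sy` stays inside the 10 % tolerance, which accounts for the 6 / 7 mismatches reported. The largest absolute difference comes from the last value (2.7 instead of 1000) and the largest relative difference from `n` (2.0 instead of 1.0).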
@HDembinski
Member

Thanks for the report. This is quite annoying, because there is no easy fix in sight.

@stephanlachnit
Contributor

This also fails on PowerPC (ppc64el) in Debian: https://buildd.debian.org/status/fetch.php?pkg=iminuit&arch=ppc64el&ver=2.22.0-2&stamp=1688553620&raw=0

E           AssertionError: 
E           Not equal to tolerance rtol=0.1, atol=0
E           
E           Mismatched elements: 6 / 7 (85.7%)
E           Max absolute difference: 997.26724243
E           Max relative difference: 0.99903804
E            x: array([1.999038, 0.155588, 0.146298, 0.363122, 0.360575, 0.187985,
E                  2.732758])
E            y: array([1.e+00, 1.e-01, 2.e-01, 3.e-01, 4.e-01, 5.e-01, 1.e+03])

Interestingly, the numbers are nearly identical to those from aarch64.
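To put a number on how close the two results are, the following sketch compares the fitted values printed in the aarch64 and ppc64el logs. It uses only the six-digit values shown in this thread, so the comparison is limited to that precision.

```python
# Fitted values as printed in the two failure logs (six significant digits).
aarch64 = [1.998927, 0.155638, 0.146354, 0.363086, 0.360538, 0.187926, 2.696536]
ppc64el = [1.999038, 0.155588, 0.146298, 0.363122, 0.360575, 0.187985, 2.732758]

# Relative difference of each ppc64el value with respect to the aarch64 value.
rel = [abs(a - p) / abs(a) for a, p in zip(aarch64, ppc64el)]
for i, r in enumerate(rel):
    print(f"parameter {i}: relative difference {r:.1e}")

max_model = max(rel[:6])  # the six parameters of the parametric model
template = rel[6]         # the last value, the template yield
print(f"largest difference among the first six: {max_model:.1e}")
print(f"difference in the last value: {template:.1e}")
```

The first six values agree to better than 0.1 %, and the last one differs by about 1.3 %. That is consistent with both platforms ending up in the same wrong minimum, although it does not explain why.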

HDembinski added a commit that referenced this issue Jul 27, 2023
Closes #913; see that issue for a detailed analysis by @stephanlachnit
of why the original naive code did not work everywhere.

Maybe this is related to #902, too.
@HDembinski
Member

@ggardet Please try again with the latest version.

@ggardet
Author

ggardet commented Aug 4, 2023

> @ggardet Please try again with the latest version.

Still failing with version 2.23.0:

[  526s] =================================== FAILURES ===================================
[  526s] _________________________ test_Template_with_model_2D __________________________
[  526s] 
[  526s]     def test_Template_with_model_2D():
[  526s]         truth1 = (1.0, 0.1, 0.2, 0.3, 0.4, 0.5)
[  526s]         x1, y1 = mvnorm(*truth1[1:]).rvs(size=int(truth1[0] * 1000), random_state=1).T
[  526s]         truth2 = (1.0, 0.2, 0.1, 0.4, 0.3, 0.0)
[  526s]         x2, y2 = mvnorm(*truth2[1:]).rvs(size=int(truth2[0] * 1000), random_state=1).T
[  526s]     
[  526s]         x = np.append(x1, x2)
[  526s]         y = np.append(y1, y2)
[  526s]         w, xe, ye = np.histogram2d(x, y, bins=(3, 5))
[  526s]     
[  526s]         def model(xy, n, mux, muy, sx, sy, rho):
[  526s]             return n * 1000 * mvnorm(mux, muy, sx, sy, rho).cdf(np.transpose(xy))
[  526s]     
[  526s]         x3, y3 = mvnorm(*truth2[1:]).rvs(size=int(truth2[0] * 10000), random_state=2).T
[  526s]         template = np.histogram2d(x3, y3, bins=(xe, ye))[0]
[  526s]     
[  526s]         cost = Template(w, (xe, ye), (model, template))
[  526s]         assert cost.ndata == np.prod(w.shape)
[  526s]         m = Minuit(cost, *truth1, 1)
[  526s]         m.limits["x0_n", "x0_sx", "x0_sy"] = (0, None)
[  526s]         m.limits["x0_rho"] = (-1, 1)
[  526s]         m.migrad()
[  526s]         assert m.valid
[  526s] >       assert_allclose(m.values, truth1 + (1e3,), rtol=0.1)
[  526s] 
[  526s] tests/test_cost.py:1397: 
[  526s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[  526s] 
[  526s] args = (<function assert_allclose.<locals>.compare at 0xffff8120fd30>, array([1.99892738, 0.15563798, 0.14635371, 0.36308616, 0.36053843,
[  526s]        0.18792633, 2.69653611]), array([1.e+00, 1.e-01, 2.e-01, 3.e-01, 4.e-01, 5.e-01, 1.e+03]))
[  526s] kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=0.1, atol=0', 'verbose': True}
[  526s] 
[  526s]     @wraps(func)
[  526s]     def inner(*args, **kwds):
[  526s]         with self._recreate_cm():
[  526s] >           return func(*args, **kwds)
[  526s] E           AssertionError: 
[  526s] E           Not equal to tolerance rtol=0.1, atol=0
[  526s] E           
[  526s] E           Mismatched elements: 6 / 7 (85.7%)
[  526s] E           Max absolute difference: 997.30346389
[  526s] E           Max relative difference: 0.99892738
[  526s] E            x: array([1.998927, 0.155638, 0.146354, 0.363086, 0.360538, 0.187926,
[  526s] E                  2.696536])
[  526s] E            y: array([1.e+00, 1.e-01, 2.e-01, 3.e-01, 4.e-01, 5.e-01, 1.e+03])
[  526s] 
[  526s] /usr/lib64/python3.9/contextlib.py:79: AssertionError
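The fitted values are the same as in the original report. One possible reading, offered as an interpretation of the printed numbers rather than a diagnosis: the parametric component has absorbed nearly all events and the template yield has dropped to almost zero, while the total yield still matches the data. The sketch below checks this arithmetic, assuming the 3 x 5 histogram contains all 2000 generated events (the default `np.histogram2d` range spans the full sample).

```python
# Fitted values copied from the log above.
n_fit = 1.99892738         # first parameter, scales the parametric model
template_fit = 2.69653611  # last parameter, yield of the template component

# The model in the test returns n * 1000 * cdf(...), so n = 1 means 1000 events.
events_per_unit_n = 1000
model_yield = n_fit * events_per_unit_n

total_fit = model_yield + template_fit
total_data = 1000 + 1000   # sizes of the two rvs samples filled into the histogram

print(f"model yield:    {model_yield:.1f}")
print(f"template yield: {template_fit:.1f}")
print(f"total fitted:   {total_fit:.1f} (data: {total_data})")
print(f"template share: {template_fit / total_fit:.2%}")
```

If this reading is right, the fit is valid in the sense of `m.valid` but sits in a different minimum from the one the test expects, where the two components share the events roughly equally.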

@HDembinski
Member

@ggardet Is there a way for me to get terminal access to an aarch64 machine to debug this? I cannot fix it without more information.

I suspect that this problem occurs on any Linux aarch64 platform. We run the iminuit tests for each release on Ubuntu aarch64 emulated via QEMU, but we skip many tests that require optional dependencies such as scipy. This test is one of those we skip, so the issue probably went unnoticed until now.
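For readers unfamiliar with the mechanism: the failing test carries a `skipif(not scipy_available, ...)` marker, so it is silently skipped wherever scipy is missing. How iminuit actually defines `scipy_available` is not shown in this thread; the sketch below is one common way to compute such a flag with the standard library, and the helper name `module_available` is made up for illustration.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be located.

    Hypothetical helper: it only looks the module up and does not import it.
    """
    return importlib.util.find_spec(name) is not None


# Flag of the kind used in the skipif marker of the failing test.
scipy_available = module_available("scipy")
print("scipy available:", scipy_available)
```

Running pytest with `-rs` lists the skipped tests and their reasons, which makes it easier to see which tests an emulated CI job never exercises.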

@ggardet
Author

ggardet commented Aug 4, 2023

> @ggardet Is there a way for me to get terminal access to an aarch64 machine to debug this? I cannot fix it without more information.
>
> I suspect that this problem occurs on any Linux aarch64 platform. We run the iminuit tests for each release on Ubuntu aarch64 emulated via QEMU, but we skip many tests that require optional dependencies such as scipy. This test is one of those we skip, so the issue probably went unnoticed until now.

Unfortunately, I cannot provide remote access to the build machines.
However, you can use QEMU to emulate an aarch64 system. openSUSE Tumbleweed aarch64 provides all the required dependencies.

@HDembinski
Member

Ok, I will try that.

@stephanlachnit
Contributor

> @ggardet Is there a way for me to get terminal access to an aarch64 machine to debug this? I cannot fix it without more information.

@HDembinski I can ask whether you can get a Debian guest account with machine access for a limited time to inspect this issue (https://dsa.debian.org/doc/guest-account/). We also have arm64 machines (search for the "porterbox" purpose on https://db.debian.org/machines.cgi). Just drop me an email (stephanlachnit@debian.org) if you are interested.

@ggardet
Author

ggardet commented Feb 7, 2024

Any update on this issue?

@HDembinski
Member

HDembinski commented Feb 8, 2024

No update, because I cannot properly test this with my own setup. I am grateful for the friendly offer by @stephanlachnit, but I don't have the time and energy right now to go through the process of learning how to test this on a Debian machine.

As a workaround, I could disable the test on Linux aarch64 to make the error go away.
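A minimal sketch of what such a platform check could look like, using only the standard library. The function name `is_affected_platform` and its parameters are hypothetical and not part of iminuit; the default covers Linux aarch64 only, as proposed above.

```python
import platform


def is_affected_platform(system=None, machine=None, machines=("aarch64",)):
    """Return True on Linux systems whose machine type is listed in `machines`.

    Hypothetical helper: `system` and `machine` default to the values reported
    by the running interpreter and can be overridden for testing.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Linux" and machine in machines


print("current platform:", platform.system(), platform.machine())
print("affected:", is_affected_platform())
```

The test could then be decorated with `@pytest.mark.skipif(is_affected_platform(), reason="fails on Linux aarch64, see #902")`. Using `pytest.mark.xfail` with the same condition would keep the test running and flag it once it starts passing. Since the Debian ppc64el build fails in the same way, `machines=("aarch64", "ppc64le")` might be the more complete choice (`ppc64le` is what `platform.machine()` reports there).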


3 participants