Potential bug in WCR #1435

Open
BenWeber42 opened this issue Nov 20, 2023 · 2 comments

Comments

@BenWeber42
Contributor

A test failed during a regular CI run for a push to master (we run CI for every push to master, which includes merging a PR):
https://github.com/spcl/dace/actions/runs/6906483658/job/18791617536#step:6:2301

The affected test runs this DaCe program:

import dace
import numpy as np

@dace.program
def augassign_wcr4():
    a = np.zeros((10,))
    for i in dace.map[1:9]:
        a[i-1] += 1
        a[i] += 2
        a[i+1] += 3
    return a

Which gets tested here:

def test_augassign_wcr4():
    with dace.config.set_temporary('frontend', 'avoid_wcr', value=False):
        val = augassign_wcr4()
        ref = augassign_wcr4.f()
        assert np.allclose(val, ref)

The failure is summarized by this log excerpt:

FAILED tests/python_frontend/augassign_wcr_test.py::test_augassign_wcr4 - assert False
 +  where False = <function allclose at 0x7fe7a396e4f0>(array([1., 3., 6., 6., 5., 6., 6., 6., 5., 3.]), array([1., 3., 6., 6., 6., 6., 6., 6., 5., 3.]))
 +    where <function allclose at 0x7fe7a396e4f0> = np.allclose

Here, the 5th value of the left array (val, produced by the DaCe-generated program) is 5, but it should be 6, as in the reference array on the right.
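To make the expected value concrete, here is a minimal plain-Python sketch (no DaCe, no NumPy) that recomputes the reference sequentially and lists which iterations write to each element. The helper names contributions, reference and missing are made up for illustration and are not part of the test suite.

```python
def contributions(n=10):
    """Map each index of `a` to the (iteration, increment) pairs that write to it."""
    contrib = {k: [] for k in range(n)}
    for i in range(1, n - 1):          # same iteration space as dace.map[1:9]
        contrib[i - 1].append((i, 1))  # a[i-1] += 1
        contrib[i].append((i, 2))      # a[i]   += 2
        contrib[i + 1].append((i, 3))  # a[i+1] += 3
    return contrib


def reference(n=10):
    """Sequential result: sum of all increments targeting each element."""
    contrib = contributions(n)
    return [float(sum(inc for _, inc in contrib[k])) for k in range(n)]


def missing(observed, n=10):
    """Return {index: shortfall} for every element that differs from the reference."""
    ref = reference(n)
    return {k: ref[k] - observed[k] for k in range(n) if ref[k] != observed[k]}


# `val` as printed in the failing CI log above.
observed = [1., 3., 6., 6., 5., 6., 6., 6., 5., 3.]

print(reference())         # sequential reference
print(missing(observed))   # which element is short, and by how much
print(contributions()[4])  # the three updates that target a[4]
```

Element 4 receives updates from iterations 3, 4 and 5 (increments 3, 2 and 1), and the observed shortfall is exactly 1, which is the size of the a[i-1] += 1 increment. That is consistent with one update going missing, but it does not prove that this is what happened.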

This test passed in previous runs and also in the following run on master, so it seems unlikely that a bug was temporarily introduced and later fixed. More likely, we have a bug that only rarely causes test failures (when a race condition gets triggered the right way).
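As a rough illustration of how such a race could produce a 5, the sketch below enumerates every interleaving of three unprotected read-modify-write updates (+1, +2, +3) on a single cell. This is a hypothetical model only: it assumes each += is split into a separate read and write with no atomicity, and it says nothing about what the DaCe-generated code actually does. The function name final_values is made up for this example.

```python
from itertools import permutations


def final_values(increments=(1, 2, 3)):
    """Collect the final cell value for every interleaving of non-atomic updates."""
    # Each update contributes two steps: a read of x, then a write of (value read + inc).
    steps = [t for t in range(len(increments)) for _ in range(2)]
    results = set()
    for order in set(permutations(steps)):
        x = 0
        seen = {}  # update id -> value it read
        for t in order:
            if t not in seen:
                seen[t] = x                  # first step of update t: read
            else:
                x = seen[t] + increments[t]  # second step of update t: write
        results.add(x)
    return results


print(sorted(final_values()))
```

With atomic updates the only possible result is 6. In this model a final value of 5 arises when the +1 update is overwritten by a write that was computed from a stale read, which would match both failing logs in this thread.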

A possible next step would be to look at the generated source for this DaCe program and verify whether it properly protects against a potential race condition.

(I'm currently working on other things, but wanted to create this issue for tracking...)

@mcopik
Contributor

mcopik commented Nov 27, 2023

@BenWeber42 My CI is also failing for no apparent reason: https://github.com/spcl/dace/actions/runs/6985515742/job/19009857416?pr=1444


@edopao
Collaborator

edopao commented Dec 12, 2023

@BenWeber42 Same issue observed in my CI job:
https://github.com/spcl/dace/actions/runs/7171060109/job/19525191544?pr=1471

=========================== short test summary info ============================
FAILED tests/python_frontend/augassign_wcr_test.py::test_augassign_wcr4 - assert False
 +  where False = <function allclose at 0x7fa53797ee70>(array([1., 3., 5., 6., 6., 6., 6., 6., 5., 3.]), array([1., 3., 6., 6., 6., 6., 6., 6., 5., 3.]))
 +    where <function allclose at 0x7fa53797ee70> = np.allclose
==== 1 failed, 2139 passed, 55 skipped, 7097 warnings in 1827.42s (0:30:27) ====

I retriggered the job and it passed.
