EDIT - to clarify, Soeren is of course correct. The below is what seems to be happening, even though it shouldn't.
If you look at the `prange` semantics here and search for "allocation hoisting", you can see that the array allocated by `np.zeros` appears to be the same object across loop iterations (maybe only across iterations that run on the same CPU, which would explain why entries 0 and 1 are the only ones affected?). Rather than being reallocated, it is only zeroed out again by each call to `np.zeros`.
Replacing np.zeros(...) with np.zeros(...).copy() removes the issue for me.
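To illustrate why a `.copy()` would help if the buffer really is reused, here is a minimal plain-NumPy sketch. It does not run Numba and does not reproduce the bug. It only emulates the suspected behaviour by hand, and the function names `shared_buffer` and `copied_buffer` are made up for this example.

```python
import numpy as np

def shared_buffer(n):
    # Emulates the suspected behaviour: one buffer is allocated once,
    # re-zeroed in every iteration, and stored in the list by reference.
    buf = np.empty(1)
    out = []
    for i in range(n):
        buf[:] = 0.0     # stands in for "np.zeros only zeroes the buffer"
        buf[0] = i
        out.append(buf)  # every list entry is the same object
    return out

def copied_buffer(n):
    # Same loop, but each entry gets its own copy of the buffer.
    buf = np.empty(1)
    out = []
    for i in range(n):
        buf[:] = 0.0
        buf[0] = i
        out.append(buf.copy())
    return out

aliased = shared_buffer(3)
independent = copied_buffer(3)
print([a.tolist() for a in aliased])      # [[2.0], [2.0], [2.0]]
print([a.tolist() for a in independent])  # [[0.0], [1.0], [2.0]]
```

In the aliased case every entry shows the value written last, which is the same kind of symptom as one list entry showing another entry's index.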
To fix your issue, you should avoid treating objects returned by np.zeros (and similar) as genuinely new objects when created within a prange. If you create all the new objects before the prange starts, that might work better?
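A rough sketch of the "allocate before the `prange`" idea, assuming the per-iteration arrays all have the same shape so they can live in one 2D array. The function name `fill_preallocated` is hypothetical, the original reproducer is not shown here, and the `ImportError` fallback is only there so the snippet also runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python.
    prange = range

    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap

@njit(parallel=True)
def fill_preallocated(n):
    # Single allocation, made before the parallel loop starts.
    values = np.zeros((n, 1))
    for i in prange(n):
        # Each iteration writes only to its own row, so nothing
        # is allocated inside the loop body.
        values[i, 0] = i
    return values

result = fill_preallocated(9)
print(result[:, 0])
```

The rows of `values` can be turned into separate arrays after the loop if a list is needed, for example with `[row.copy() for row in result]`.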
Not sure if this is still a bug in numba or not though - the behaviour does seem pretty unexpected and not-transparent.
From my understanding of the documentation, allocation hoisting is supposed to only be done when it doesn't change the semantics of the code, i.e., the link you posted is not a description of how semantics change in the presence of prange, but a description of techniques that numba uses to make it more efficient.
See the context of the paragraph regarding allocation hoisting:
> Loop invariant code motion is an optimization technique that analyses a loop to look for statements that can be moved outside the loop body without changing the result of executing the loop, these statements are then “hoisted” out of the loop to save repeated computation.
> ...
> Allocation hoisting is a specialized case of loop invariant code motion that is possible due to the design of some common NumPy allocation methods
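
As a hand-written illustration of what a semantics-preserving allocation hoist looks like, the following plain-NumPy sketch splits `np.zeros` into an allocation plus a zero fill and moves only the allocation out of the loop. This is not Numba's actual generated code, and the function names are made up for the example. The hoist is safe here because the temporary array never escapes the iteration that uses it.

```python
import numpy as np

def row_sums_naive(data):
    n, m = data.shape
    result = np.empty(n)
    for i in range(n):
        tmp = np.zeros(m)      # fresh allocation in every iteration
        tmp += data[i] * 2.0
        result[i] = tmp.sum()
    return result

def row_sums_hoisted(data):
    n, m = data.shape
    result = np.empty(n)
    tmp = np.empty(m)          # allocation hoisted out of the loop
    for i in range(n):
        tmp[:] = 0.0           # the zero fill stays inside the loop
        tmp += data[i] * 2.0
        result[i] = tmp.sum()
    return result

data = np.arange(6.0).reshape(2, 3)
print(row_sums_naive(data))    # [ 6. 24.]
print(row_sums_hoisted(data))  # [ 6. 24.]
```

If `tmp` were stored in a list or returned from the loop body, the two versions would no longer be equivalent, so a hoist in that situation would change the result.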
soerenwolfers changed the title from "Wrong list result with parallel=True despite no resizing and no cross-access" to "Wrong list result with parallel=True despite no resizing and no cross-thread access of the list" on Apr 15, 2024
Reporting a bug
visible in the release notes (https://numba.readthedocs.io/en/stable/release-notes-overview.html).
i.e. it's possible to run as 'python bug.py'.
prints
The output of the `weird` function is a `List[(Dict, array)]`, and each array is supposed to just hold the corresponding loop index.

In the non-parallel execution, the zeroth entry of the output list is `(somedict, [0.0])`, as it should be.

In the parallel execution, the zeroth entry of the output list is incorrectly equal to the first entry of the output list, `(somedict, [1.0])`.

This stops happening if
- [...] `9` to an `8` (on my laptop with 8 logical cores)
- [...] `Dict` component of the `outputs` tuples.

PS: I'd appreciate it if you know of a way to circumvent this in the meantime. In my real problem, I have quite the messy datatype in the list.
Python version: '3.12.2 (main, Feb 25 2024, 16:36:57) [GCC 9.4.0]'
numba version: 0.59.1
numpy version: 1.26.4
OS: NAME="Ubuntu" VERSION="20.04.6 LTS (Focal Fossa)"