
Can we have short-circuit equivalents to allclose, array_equal, isfinite and the like in numpy? #6909

Closed
congma opened this issue Dec 30, 2015 · 11 comments

Comments

@congma
Contributor

congma commented Dec 30, 2015

I did some (admittedly very dirty) benchmark of scipy.linalg.cho_solve(), and what surprised me was the magnitude of overhead due to check_finite=True, which is on by default. My notes can be read here.

The finiteness-check is useful, if not plain necessary for some scenarios, but can we do better? In many user cases such as the scipy.linalg functions, all that matters is that some element of an array is NaN or infinity. Now, what it has to do, is to call numpy.asarray_chkfinite, which in turn does something amounting to numpy.isfinite(a).all(). But this does not short-circuit; isfinite() checks each single element and returns a Boolean array. This whole test can be expensive, but can it be made less expensive in the best case?
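For illustration, the kind of short-circuiting check being asked for can be approximated in user code by scanning in blocks (a hypothetical helper, not a NumPy API; the name and chunk size are made up):

```python
import numpy as np

def isfinite_all(a, chunk=4096):
    """Hypothetical helper: finiteness check that can bail out early.

    np.isfinite still vectorizes within each chunk, but the scan
    stops at the first chunk containing a NaN or infinity, instead
    of building a boolean array for the whole input.
    """
    flat = np.asarray(a).ravel()
    for start in range(0, flat.size, chunk):
        if not np.isfinite(flat[start:start + chunk]).all():
            return False
    return True
```

In the best case (a non-finite value near the front of a large array) only one chunk is examined; in the worst case (all finite) it still scans everything.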

This issue is also present in the test of closeness and equality, in core/numeric.py. Neither allclose() nor array_equal() actually short-circuits when doing the real check. They only short-circuit in the all() function/method call, which is already too late. These two functions can be especially deceiving. Spoiled by Python's built-in all(), the user may think these functions do the short-circuit evaluation. The docs never mention they don't.

So the request is... Do you think it's worthwhile to add some short-circuit Boolean functions for the above tests (finiteness/equality/closeness)?

@njsmith
Member

njsmith commented Dec 30, 2015

IIRC there are some cases where np.all does abort early as an optimization, but in general it's somewhat difficult (consider calls that use a non-trivial axis= argument). And it's largely irrelevant to chkfinite in any case, because here the case where speed matters is the one where all values are finite, so we have to check the whole array. (If there are non-finite values, we will throw an error, and it doesn't matter much how quickly we throw an error.)

In general though if there's some code in numpy that can be made faster then we do like to receive patches to make things more efficient, and especially if this is creating a bottleneck for you (I can't tell from your post whether allclose is a practical problem, or just a theoretical one).

@congma
Contributor Author

congma commented Dec 30, 2015

The example in my notebook linked in the OP wasn't the most relevant one, although it's how I discovered the (shocking to me) fact that most of the time cho_solve is simply testing for NaNs. A more relevant example is testing arrays for exact equality for use in dictionaries.

Sometimes I need to use numpy arrays as dictionary keys (for example, caching). Of course, there are complications. For one thing, arrays must not change value during their lifetime as keys. We then need a hash function (to be applied on array buffers, again with complications) and a comparison function that tests exact equality. The latter function is used by Python for open addressing in case of hash collision.

If hash collision is fairly common, the comparison function's performance could be pretty critical. It needs to bail out as early as possible during probing, especially when the key is large.

Currently, numpy.array_equal evaluates too eagerly for this. The desirable behavior, for this particular case, is to short-circuit the comparison.
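A short-circuiting exact-equality test of the kind described here can be sketched as follows (illustrative only; the name and block size are made up):

```python
import numpy as np

def array_equal_early_exit(a, b, chunk=4096):
    """Hypothetical short-circuiting variant of np.array_equal.

    Compares the arrays block by block and returns at the first
    mismatching block, which is what the dictionary-probing use
    case wants when keys are large.
    """
    a, b = np.asarray(a), np.asarray(b)
    if a.shape != b.shape:
        return False
    fa, fb = a.ravel(), b.ravel()
    for start in range(0, fa.size, chunk):
        stop = start + chunk
        if not np.array_equal(fa[start:stop], fb[start:stop]):
            return False
    return True
```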

@seberg
Member

seberg commented Dec 30, 2015

I think Nathaniel is right: at least for your case, short-circuiting does not help at all. Possibly there is some vectorization/low-level optimization that can be done.
Your example seems to me mostly memory-bandwidth bound (since cfac is too large to fit into cache). From that point of view, any combination of isfinite + all or isfinite + any, even a non-short-circuiting one, might save a good bit on out-of-cache-sized arrays.
(On my CPU the factor is more like 1.5 I think, but it still shows how well optimized BLAS is.)

The most general way of implementing it might be adding .all and .any attributes to ufuncs, returning a single boolean (so that np.isfinite.all(arr) would work). It would probably be quite a chunk of work though, and "short-circuiting" is actually really tricky with the axis keyword argument, as Nathaniel said (it goes against the memory-order optimization NumPy usually does). Anyway, it could be a nice new feature, standalone or as part of the ufuncs.

@rgommers
Member

rgommers commented Dec 31, 2015

> I discovered the (shocking to me) fact that most of the time cho_solve is simply testing for NaNs. A more relevant example is testing array exact equality for use in dictionaries.

It can be expensive; this is why check_finite was added as a keyword, so you can disable it if you're sure that your input cannot contain NaNs. The keyword is ugly though, which is a bit of an annoyance. If the check were cheaper, the keyword could be removed.
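For reference, the keyword in question is used like this (the matrix values here are made up for illustration):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])     # symmetric positive definite
b = np.array([1.0, 2.0])

c, low = cho_factor(A)         # default: scans A for NaN/inf first

# If the caller can guarantee finite input, the scan is skipped:
x = cho_solve((c, low), b, check_finite=False)

assert np.allclose(A @ x, b)
```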

@nouiz
Contributor

nouiz commented Jan 8, 2016

In Theano we did a faster version of the NaN check.

Old NaN check (from memory): np.isnan(arr).any()

Faster version: np.isnan(np.min(arr))

Executing the reduction first is faster because we do the isnan check on fewer elements. This uses the property of NaN that it propagates.

For the inf check we use the following (I didn't write it myself, so I don't remember how fast it is):

np.isinf(np.nanmax(arr)) or np.isinf(np.nanmin(arr))

We didn't try combining both, but I think a version like the fast NaN check would do it. Operations between infs give inf or NaN, and operations between inf and NaN give NaN. I didn't test it, but I think this would do it:

numpy.isfinite(numpy.min(arr))
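These tricks can be verified directly. One caveat (the combined check above is flagged as untested): +inf does not propagate through np.min, so np.isfinite(np.min(arr)) catches NaN and -inf but misses a lone +inf.

```python
import numpy as np

a = np.ones(10)
a[3] = np.nan
assert np.isnan(np.min(a))            # NaN propagates through min
assert not np.isfinite(np.min(a))     # so the combined check sees it

b = np.ones(10)
b[3] = np.inf
assert np.isinf(np.nanmax(b))         # the dedicated inf check works
assert np.isfinite(np.min(b))         # but min() hides a lone +inf

c = np.ones(10)
c[3] = -np.inf
assert not np.isfinite(np.min(c))     # -inf, by contrast, is caught
```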


@juliantaylor
Contributor

Something like hasnan could be useful, as it skips creating a bool array, but it's not going to be faster than @nouiz's np.min trick, which also only gives you about 30% better speed.

Short-circuiting is probably not very useful in practice: in the common case of no NaN you can't short-circuit, and in the other case you have to add expensive operations to deal with the NaNs anyway.

The best you can do is make use of the fact that operations raise exceptions when using NaNs, so catch that and handle it. But that only works in cases where something is done before going into LAPACK that would otherwise deadlock or crash (which is the reason for the check's existence in the first place).

@juliantaylor
Contributor

That said, np.isfinite is not vectorized, so there could be a 1.5x-2x speedup available, at least for cached data.

@juliantaylor
Contributor

Vectorized isfinite in gh-6980.
It's not very significant for the above test case, as that one is mostly memory-bandwidth bound, but it's a nice speedup for cached arrays.

@congma
Contributor Author

congma commented Jan 9, 2016

Thanks for the updates. The min trick is interesting; it's even documented! ;)

Right now, for caching, I'm resorting to directly hashing the data attribute and, in case of collision, comparing the data attribute with Python's == operator. This seems to work fast enough (the bottleneck being the hash computation), but only for very restricted collections of arrays under comparison: for my usage, all the arrays being cached have the same type and shape, and there's no overlap of underlying memory.
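The scheme described (hash the buffer once, compare bytes on collision) might be wrapped up as follows; the class name is made up, and it assumes the caller never mutates an array while it is in use as a key:

```python
import numpy as np

class ArrayKey:
    """Hypothetical wrapper making a NumPy array usable as a dict key.

    Hashes the raw bytes once up front; on hash collision Python
    falls back to __eq__ during probing, which compares the bytes
    exactly.
    """
    def __init__(self, arr):
        self._arr = np.ascontiguousarray(arr)
        self._hash = hash((self._arr.shape,
                           self._arr.dtype.str,
                           self._arr.tobytes()))

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return (isinstance(other, ArrayKey)
                and self._arr.shape == other._arr.shape
                and self._arr.dtype == other._arr.dtype
                and self._arr.tobytes() == other._arr.tobytes())
```

Usage would look like `cache[ArrayKey(a)] = result`. Note that the byte comparison here is eager, not short-circuiting, which is exactly the gap this issue is about.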

@jakirkham
Contributor

So it looks like PR (#6980) went in, but was then reverted. It's unclear whether it was re-added later or not. Does anyone know?

@seberg
Member

seberg commented Feb 24, 2021

I am going to close this. There are two possibilities:

  1. Specialized gufuncs for performance optimization to replace some chained operations: ENH: Create a place for "optimization" ufuncs #18483
  2. A map_reduce-style chaining addition to the ufunc machinery (or an addition built on top of it): ENH: Create the ability for fused operations (fused ufuncs or map_reduce) style #11622 (and other issues like it)

Both probably make sense for certain functions.
