Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gufunc behavior on incorrect out shape #16484

Closed
anirudh2290 opened this issue Jun 2, 2020 · 5 comments · Fixed by #16485
Closed

gufunc behavior on incorrect out shape #16484

anirudh2290 opened this issue Jun 2, 2020 · 5 comments · Fixed by #16485

Comments

@anirudh2290
Copy link
Member

From the discussion here : #15162 (comment)

The following example doesnt seem to raise an error when out shape is incorrect or broadcast the inputs to the output, but silently populates the out with result + garbage values.

import numpy as np
from numpy.core import _umath_tests as umt
from numpy.testing import assert_raises
a = np.arange(6).reshape(3, 2)
b = np.ones(2)
out = np.empty((5, 3))
umt.inner1d(a, b, out)
print(out)

Result:

[[ 1.00000000e+000  5.00000000e+000  9.00000000e+000]
 [ 6.91217735e-310 -1.45681599e+144 -1.45681599e+144]
 [-1.45681599e+144 -1.45681599e+144 -1.45681599e+144]
 [-1.45681599e+144 -1.45681599e+144 -1.45681599e+144]
 [-1.45681599e+144 -1.45681599e+144  6.32404027e-322]]
@seberg
Copy link
Member

seberg commented Jun 2, 2020

Hehe, was thinking about this for a while….Also ping @mhvk just in case.

The NPY_ITER_NO_BROADCAST flag, oddly enough seems almost always used in a way where you could simply take that operand fully out for shape discovery (e.g. as an output op). But, it would be an incompatible break if we would just use it to address that issue.

So I am considering adding a new NPY_ITER_OUTPUT_OPERAND. Or make an NPY_ITER_OUTPUT_OPERAND and another flag NPY_ITER_DOES_NOT_AFFECT_SHAPE so that the output operand flag can include both the "allocate" and the "no broadcast" flags.

It feels like a combination of allocate and no-broadcast, probably already somewhat implies this. But the question is if we want to make such a change, in theory it can affect external libraries using NpyIter. (Unless we make it a VisibleDeprecationWarning when it kicks in, but not sure that is super neat to implement.)

I think that could address the issue. There are interesting side cases, such as an axes which is only used by one (or more) output arrays. One could try to support that (at some point?), but it probably adds too much complexity for no real gain.
I simply don't really see much of a use-case for a (),()->(k) gufunc. Something like not (),()->(3,3) or ->(3,3) makes sense of course, but is no issue.

@mhvk
Copy link
Contributor

mhvk commented Jun 2, 2020

Bit confused about all the flags, I must admit. But a gufunc moments with signature ()->(k) could perhaps calculate moments up to k in the output. Of course, incredibly far-fetched, but my sense would be not to explicitly exclude it (nor code explicitly to allow it!).

@mhvk
Copy link
Contributor

mhvk commented Jun 2, 2020

A bit more directly relevant to the issue here, where the output has an inconsistent outer shape, is that there is precedent from the regular ufuncs:

np.add(1, 1, out=np.empty((3,)))
# array([2., 2., 2.])

This suggests also for gufuncs, the outer shape should be determined from all inputs and outputs. Obviously, will cause needless calculations, but so be it...

@seberg
Copy link
Member

seberg commented Jun 2, 2020

Oh, I somehow did not expect that to actually work. I suppose in that case we can simply allow the output to cause broadcasting in the inputs (and thus adjust the shape). Basically simply retaining the current behaviour.

so, just do nothing here, except for making the sure the (g)ufunc gets called multiple times (it will a bit of a slow way to get the result in many cases, but that is the callers problem). While I do not think it is important to support this, I also see no problems with doing it.

But we still need to fix this issue in gufuncs obviously!

@seberg
Copy link
Member

seberg commented Jun 2, 2020

Ah, its very simple:

diff --git a/numpy/core/src/umath/ufunc_object.c b/numpy/core/src/umath/ufunc_object.c
index 19876d641..85820e3a0 100644
--- a/numpy/core/src/umath/ufunc_object.c
+++ b/numpy/core/src/umath/ufunc_object.c
@@ -2614,7 +2614,7 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
      * dimensions.
      */
     broadcast_ndim = 0;
-    for (i = 0; i < nin; ++i) {
+    for (i = 0; i < nop; ++i) {
         int n = PyArray_NDIM(op[i]) - op_core_num_dims[i];
         if (n > broadcast_ndim) {
             broadcast_ndim = n;

is probably all there is to it. Then we get correct broadcasting of inputs to outputs, and any thoughts of deprecating this can just be deferred for future selves if it ever comes up.

EDIT: Plus the null check of op...

seberg added a commit to seberg/numpy that referenced this issue Jun 2, 2020
In this case no error was given, but additional dimensions in the
output simply ignored. Thus the returned array was only partially
set.
Now it will be fully set, with the calculation being repeated as
often as necessary (typical broadcasting logic).  This is consistent
with how normal ufuncs work.

Closes numpygh-16484
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants