Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request: argmax2d #9283

Open
mortonjt opened this issue Jun 21, 2017 · 14 comments
Open

Request: argmax2d #9283

mortonjt opened this issue Jun 21, 2017 · 14 comments

Comments

@mortonjt
Copy link

mortonjt commented Jun 21, 2017

It would be really nice if there was a convenience function to perform an argmax over a 2D array and return the row and column indexes of the maximum.

I'm often finding myself reusing the following code

def argmax2d(X):
    n, m = X.shape
    x_ = np.ravel(X)
    k = np.argmax(x_)
    i, j = k // m, k % m
    return i, j

While it is simple, its opaque enough where I have to consistently look this function up.
It would be nice if this was already available in numpy.

Would there be any objections to opening up a PR in numpy to have this functionality readily available?

@krooijers
Copy link

Why not use np.unravel_index?

e.g.

np.unravel_index(X.argmax(), X.shape)

As a bonus, it works when X has more than 2 dimensions, as well.

@eric-wieser
Copy link
Member

Not a fan of argmax2d, but I could be persuaded by i, j = a.argmax(axis=(0, 1))

@adeak
Copy link
Contributor

adeak commented Jul 9, 2017

@eric-wieser how would you say that should behave? Currently a.argmax() returns a scalar, and a.argmax(axis=0) returns an array with the shape of the remaining dimensions. I'd expect a.argmax(axis=(0,1)) for a 3d array to return an array of shape a.shape[2].

This issue aside, it would be a bit weird to use a.argmax(axis=range(n)) in the general case to obtain an n-length index tuple instead of the default integer linear index. Perhaps a new keyword could switch between output representations?

Then again, how would this shape-aware result option work together with the existing axis keyword already resulting in a non-scalar return value?

@thoth291
Copy link

thoth291 commented Sep 7, 2018

What if one of the rows in your 2D array don't have a maximum (it's simply constant array) - how argmax can report that?
For example:

# setup the problem
import numpy as np
x=np.arange(10).reshape([2,5])
x[1,3]=2
# x is this:
# [[0, 1, 2, 3, 4],
#  [5, 6, 7, 2, 9]]

# this will behave normally
np.argmax(x==2, axis=1)
# Out: [2,3]

# but this one returns 0 instead of NaN
np.argmax(x==3, axis=1)
# Out: [3,0]
# Expected: [3, nan]

It would be nice to have an extra argument for example to let user control what to do in the case with no max available: np.argmax(x,axis,no_ind=0) (0 is default to preserve backward compatibility).

@adeak
Copy link
Contributor

adeak commented Sep 7, 2018

@thoth291 isn't this a more generic question regarding the behaviour of argmax? The same thing happens with no axis keyword argument on a 1d array:

>>> np.argmax([2,2,2])
0

And this is the customary behaviour in case of ties: choose the first among tied values. Similarly how the max of that array is not nan: it's 2. And if you have a max, you have to have a corresponding index. Or did I miss your point?

@thoth291
Copy link

thoth291 commented Sep 8, 2018

@adeak, now you said that - I'm starting thinking that this indeed a more general question.
I can agree that max([2,2,2]) is equal to 2.
But think about argmax([2,2,2]) - according to definition in numpy documentation it returns the indices corresponding to the first occurrence. This seems to be OK from the implementation point of view - but it's really just an archaism of the algorithm and has nothing to do with what actually should happen. Effectively ANY index can be returned and should be treated as correct. Or one can say that argmax is ambiguous in the case of constant-valued array.

That all being said, I would rather avoid such a decision in the first place and return inf to signal that it's up to the user to decide how to treat such case.

@eric-wieser

This comment has been minimized.

@eric-wieser
Copy link
Member

eric-wieser commented Sep 8, 2018

@adeak: Sorry for never replying

Then again, how would this shape-aware result option work together with the existing axis keyword already resulting in a non-scalar return value?

I think there's one obvious way to deal with this. As an example:

>>> ret = argmax(np.empty((A, B, C, D, E)), axes=(0, 2))
>>> type(ret)
tuple
>>> len(ret)  # == len(axes)
2
>>> ret[0].shape
(B, D, E)
>>> ret[1].shape
(B, D, e)

With that and keepdims, you'd get arr[argmax(arr, axes, keepdims=True)] == max(arr, keepdims=True) for any dimensionality, which seems super-desirable to me

In pseudocode, I'd expect:

def argmax(arr, axes, keepdims)
    was_tuple = isinstance(axes, tuple):
    if not was_tuple:
        axes = (axes,)

    shape = np.array(arr.shape)
    axis_mask = np.array([i in axes for i in range(arr.ndim)])
    shape[axis_mask] = 1
    ret = tuple(np.empty(shape) for _ in axes)

    # do the actual work

    if not keepdims:
        ret = tuple(r.reshape(shape[~axis_mask]) for r in ret)

    if not was_tuple:
        return ret[0]

@thoth291
Copy link

thoth291 commented Sep 8, 2018

@eric-wieser - nice catch - I typed it faster than was able to translate properly. Will be careful later. Updated the comment to "should happen" instead of "happens". Hope that helps, otherwise - open for suggestions for reformulation.
The point is that argmax([2,2,2])==0 is as good as argmax([2,2,2])==1 or anything else and it should be chosen by the user rather than library. Otherwise a fallback mechanism should be provided in the form of (for example) extra keyword default= or initial= which would instruct what to return in this case.

@eric-wieser
Copy link
Member

@thoth291: since what you're discussing isn't specific to 2D argmax, I suggest you create a new issue

@adeak
Copy link
Contributor

adeak commented Sep 9, 2018

@eric-wieser no worries, thanks for getting back to me. I think I understand your suggestion, and it does seem unambiguous and practical. The fact that the type of the axis argument determines whether the return value is an array or a tuple of arrays is a bit surprising to me (especially axes=0 vs axes=[0]), but I'm sure there are plenty of existing examples for the same behaviour (also, practicality beats purity).

@eric-wieser
Copy link
Member

Just to be sure

Both corrected, good catch

The fact that the type of the axis argument determines whether the return value is an array or a tuple of arrays is a bit surprising to me

Note that the actual array is the same. I think this hits both practicality and purity - the rule is that axis=(a,)res = (arr,) and axis=ares = arr. Apis that special case tuples of size 1 strike me as a a bad idea, as that special casing has to become contagious up the entire call stack

@adeak
Copy link
Contributor

adeak commented Sep 9, 2018

I never would've thought to special-case length-1 tuples, that sounds wrong. I was wondering about the scalar vs tuple case. It did vaguely occur to me that the scalar and length-1 tuple cases correspond to each other in the way you mentioned, but now that you spelled it out it's obvious that scalar -> 1-tuple -> general tuple cases are straightforward generalizations. So thanks, I fully support your suggestion.

@Jacob-Stevens-Haas
Copy link

It's an old issue to be sure, but this came up for me today. Given that 1.7 (released after this discussion) added the ability to pass a tuple axis parameter to np.max, it would be nice to see argmax accept the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants