Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexing behavior with ellipses is non-intuitive. #16956

Closed
davidlibland opened this issue Jul 27, 2020 · 7 comments
Closed

Indexing behavior with ellipses is non-intuitive. #16956

davidlibland opened this issue Jul 27, 2020 · 7 comments
Labels
04 - Documentation 33 - Question Question about NumPy usage or development 57 - Close? Issues which may be closable unless discussion continued

Comments

@davidlibland
Copy link

Array indexing with ellipses can be extremely counter-intuitive. For instance, in the following example the introduction of a redundant ellipses alters the shape of the output:

a = np.array([[[False]]])
a[0:0, 0, ..., [0]]
Out[23]: array([], shape=(1, 0), dtype=bool)
a[0:0, 0,  [0]]
Out[24]: array([], shape=(0, 1), dtype=bool)

I don't think this is desired behavior, but it does seem to stem directly from design decisions with regards to how complex indexing is handled.

Numpy/Python version information:

1.17.3 3.7.5 (default, Oct 25 2019, 10:52:18)
[Clang 4.0.1 (tags/RELEASE_401/final)]

@eric-wieser
Copy link
Member

This looks like a particularly nasty corner case - I guess we consider the ... as forcing the two advanced indices, 0 and [0], to be considered non-adjacent - even though the axes they index are adjacent.

@antonl
Copy link

antonl commented Jul 27, 2020

We ran into this issue when trying to replicate numpy indexing. I think this is a weird interaction between two rules:

  • Ellipsis expands to a tuple of full slices, each of which is considered a basic index
  • Mixed order advanced and basic indices trigger a transpose operation.

This special case is triggered because a 0-d tuple (when ... is used with all indices present) is still considered a basic index block, when it should really be invisible.

@seberg
Copy link
Member

seberg commented Jul 27, 2020

I agree that this is all confusing, and I would not even have been certain without checking what happens here. But, it does seem like the correct choice to me.

Why should it be truly "invisible"? ... should behave identically for any number of dimensions. To do that, it has to trigger the transpose in all cases. I.e. imagine something organized as series, ..., color where ... is user-defined and can be 0-d. If you were to write a program to handle that data, you would require the indexing to be predictably transposed no-matter what ... expands (or does not expand) to.

In the end, this is simply confusing, and we would have to pick up .oindex, and .vindex etc. more seriously to resolve it: https://numpy.org/neps/nep-0021-advanced-indexing.html

@eric-wieser
Copy link
Member

I'm with @seberg's argument that the current behavior is the best generalization. We could certainly make sure the docs are clearer though.

@antonl
Copy link

antonl commented Jul 28, 2020

I didn't know about that proposal, thanks for the reference @seberg!

Let me rephrase just to make sure I understand your argument. Say we label advanced indices A and basic indices B and as above I'm calling the reordering op a (generalized) transpose. In your example, we have four cases:

  • [A1, ..., A2] and [B1, ... A1]: these cases trigger the transpose to [A1, A2, ....] and [A1, B1, ...] regardless of how ... expands.
  • [A1, ..., B1] and [B1, ..., B2]: these cases do not.

These rules are consistent once you know what class (A or B) series and color belong to, independent of how ... expands. This is equivalent to treating the ... as a (potentially 0-d) block of basic indices. Treating the 0-d ... as a special case would be bad because the transpose would be conditional on whether the user passed in a 2-D array or 3-or-more-D. I agree, this is a bad place to be.

My intuition that 0-d blocks should be no-op was driven by how tuples behave, where for any i,

i: int
M: Tuple[int, ...]
N = ()
assert M[:i] + N + M[i:] == M

This conflicts with the numpy indexing convention which results in the above four cases depending on what i is. This has more to do with the transpose operation itself, rather than how ... is treated, and is an argument for the NEP21 proposal.

During development we would switch how we did the indexing and would insert ... to future-proof the code and be really confused when the shapes magically transposed. It was just exacerbated by the empty ... case being particularly counterintuitive.

Thanks for taking a look! I can submit a doc PR if you'd like.

@seberg
Copy link
Member

seberg commented Aug 3, 2020

@antonl a doc PR is always welcome, there are many improvements possible here.

@seberg seberg added 04 - Documentation 33 - Question Question about NumPy usage or development 57 - Close? Issues which may be closable unless discussion continued labels Aug 3, 2020
@seberg
Copy link
Member

seberg commented Nov 22, 2022

Closing, docs can probably be improved, but they always can and there is nothing concrete here (beyond implementing NEP 21 one day).

@seberg seberg closed this as completed Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
04 - Documentation 33 - Question Question about NumPy usage or development 57 - Close? Issues which may be closable unless discussion continued
Projects
None yet
Development

No branches or pull requests

4 participants