Consider opening a NEP to allow override of ndarray.__getitem__ dispatch #97

Open
telamonian opened this issue Oct 25, 2020 · 4 comments

@telamonian
Contributor

telamonian commented Oct 25, 2020

When I first picked up and tried to start using ndindex to do cool things, the biggest pain point/stumbling block for me was the fact that you can't do

arr[ix]

and instead have to do

arr[ix.raw]

So let's change numpy to make the nicer syntax a reality! There's already one NEP under consideration that will make significant improvements/simplifications to how complex indexing works:

https://numpy.org/neps/nep-0021-advanced-indexing.html

We could maybe include a proposal for improving the __getitem__ machinery (making it more flexible/overridable wrt the object being used as an index) as part of the ongoing implementation work for NEP 21, or possibly as a new NEP.

@asmeurer
Member

I've thought about bringing this up.

There are really two different things that NumPy could do. One would be to have an API that allows any object to turn itself into an index type. That's more or less what we need for ndindex. This API could be relatively straightforward: either allow __index__ to work with more than just integers, or create a new method like __numpy_index__ that would return the corresponding index object. Having just __index__ work would be nice, but it might go against PEP 357.

A more advanced idea would be an API that allows objects to define indexing that isn't even possible with the current indexing types. This would be much more powerful, and would make it possible to do things like "outer indexing" via a special object (like a[oindex(...)] rather than a.oindex[...]). This goes beyond what ndindex needs in its current form, though it would open up some interesting possibilities. I don't know what such an API would look like.

@telamonian
Contributor Author

Your first suggestion is exactly what I was thinking. I don't think it would involve any breaking changes to the numpy API, or cause the introduction of unexpected new behavior (a user would effectively have to "opt in" by trying to index an ndarray with some random object in the first place), so it seems feasible.

I don't entirely get the drift of your second suggestion, but it sounds cool. As for a[oix(...)], I think your first suggestion would cover that as well. What NEP 21 calls "vectorized" indexing can be used to build outer indexing (I think for all cases?), and pretty much anything else you could think of.

@asmeurer
Member

asmeurer commented Oct 27, 2020

The first idea would basically be an API for custom objects that works like

def __numpy_index__(self):
    """Return a tuple, integer, slice, ellipsis, newaxis, or integer or boolean array."""
    ...
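To make the first idea concrete, here is a minimal pure-Python sketch of how such a hook might be dispatched. Nothing here is NumPy API: `ToyArray` is a hypothetical list-backed stand-in for ndarray, `WrappedSlice` is a hypothetical index wrapper with a `.raw` attribute, and `__numpy_index__` is only the method name proposed in this thread.

```python
class ToyArray:
    """Hypothetical 1-D stand-in for ndarray, backed by a plain list."""

    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, index):
        # Proposed hook (an assumption, not real NumPy behavior): if the
        # index object's type defines __numpy_index__, ask it for a
        # standard index before doing the actual lookup.
        convert = getattr(type(index), "__numpy_index__", None)
        if convert is not None:
            index = convert(index)
        return self.data[index]


class WrappedSlice:
    """Hypothetical index wrapper exposing the underlying slice as .raw."""

    def __init__(self, start, stop):
        self.raw = slice(start, stop)

    def __numpy_index__(self):
        # Hand back a standard index object (here, a built-in slice).
        return self.raw


arr = ToyArray([10, 20, 30, 40])
ix = WrappedSlice(1, 3)
print(arr[ix])      # same elements as arr[ix.raw]
print(arr[ix.raw])
```

Because the hook is looked up only on the index object, plain integers and slices take the existing path unchanged, which matches the "opt in" point made above.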

The second would be something like

def __index_self__(self, array):
    """Represents array[self]: return whatever array[self] should return."""
    ...

That would allow indexing anything, even things that aren't representable (or easily representable) by the current standard NumPy indices. I'm not sure exactly how that API should look. For example, would there be separate endpoints for getitem vs. setitem?

There are some interesting things you could do with an API like that. For example, you could create a fancy nonzero object so that a[nonzero] is equivalent to a[a!=0]. I agree that almost everything can already be done with integer arrays, but it may not be the most efficient, as it requires explicitly listing the index of every output. The API could also, potentially, be allowed to return something that isn't the same class as ndarray. One could use this to implement something like lazy indexing, for example.
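As a rough illustration of the second idea and the nonzero example, the sketch below lets the index object compute the result itself. It is pure Python under stated assumptions: `ToyArray` is a hypothetical list-backed stand-in for ndarray, `NonZero` is a hypothetical index class, and `__index_self__` is only the name floated in this thread, not an existing NumPy method.

```python
class ToyArray:
    """Hypothetical 1-D stand-in for ndarray, backed by a plain list."""

    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, index):
        # Proposed hook (an assumption): let the index object decide what
        # array[index] returns, bypassing the standard index types.
        hook = getattr(type(index), "__index_self__", None)
        if hook is not None:
            return hook(index, self)
        return self.data[index]


class NonZero:
    """Hypothetical index object: a[nonzero] selects the nonzero entries."""

    def __index_self__(self, array):
        # Computes the result directly instead of first building an
        # explicit integer or boolean index.
        return [x for x in array.data if x != 0]


nonzero = NonZero()
a = ToyArray([0, 3, 0, 5])
print(a[nonzero])
```

Note that the hook here returns a plain list rather than a `ToyArray`, which loosely mirrors the point that the result need not be the same class as the array. A setitem counterpart is left out, since the thread leaves that question open.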

But for ndindex, as I said, it isn't something we need. If it existed, it might be interesting to build on top of it. Certainly many ndindex operations could be much easier with a generic API where I could just return the result directly, rather than trying to translate things into indices. But I'm also happy for ndindex to just be a library for manipulating the existing NumPy index types.

@asmeurer
Member

I started a discussion on the NumPy mailing list: https://mail.python.org/pipermail/numpy-discussion/2020-October/081103.html
