New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ndarray should derive from collections.abc.Sequence? #2776
Comments
Python 3.x simply performs a more strict check in |
Fair enough, but that said, we could easily make ndarray a Sequence,
|
I would have said the same thing, if it weren't for the 2to3 example which makes this more subtle. The problem is that |
don't forget 0-d arrays though, I am not certain what happens with them right now, but they are not really a Sequence are they? |
Hmm, |
It looks to me like there's really no reason for |
It makes sense of you interpret the interface as "a sequence should have all these, but some may edit: Though I'm sure that was not a careless decision. |
ndarray could just be a Sequence, that does not mean it must be immutable. Btw. the CPython list implementation does also not support efficient insert, pop and extend either but its still a MutableSequence |
ndarray cannot support in-place insert/pop/extend (ndarray's have a fixed size), so while it is a mutable sequence, it simply is not a Python It would be nice if The point about 0-d arrays is a good one, though; 0-d arrays don't support the |
Note that MaskedArray has a 0-D arrays are better thought of as scalars with a few handy methods (at least that's how I think of them); besides not being iterable they're also not indexable which is even more weird if you think of them as arrays. So making 0-D arrays inconsistent in yet another way is not a big issue imho. |
@njsmith where do you see multiple implementations? After the isinstance(Sequence) check I only see |
Pandas Series and DataFrame types also have incompatible count methods, and an index attribute. |
@rgommers: Hmm, you're right, I got misled by the error message, and thinking that it also accepted integers as a shorthand for
That's also a good point about the existing ndarray subclasses. It doesn't look like there's any way to say that ndarray is a
class multidim_ndarray(ndarray, Sequence):
pass and make multi-dim arrays instances of this class instead. Subclasses aren't affected because they continue to inherit from
[The |
Maybe it would be possible to convince the python folks to create a new class like |
[Whoops, the "missing option" was there as option 4, just somehow the markdown parser decided to fold it into the previous bullet in a confusing and unreadable way. I've edited the comment to fix the formatting.] |
Half of the types that are registered with |
@rkern: Good catch. So maybe the solution is just to add a call to We should probably implement |
@rkern you're right, this is mentioned as an open issue in the PEP: http://www.python.org/dev/peps/pep-3119/#sequences. Odd that PEPs with status Final can even have open issues. |
I think the title of this bug is a bit misleading, as the bug does not exist only under python3. Sure, |
I've just encountered this bug in Python 2.7 using the autopep8 module. By default, it converted some of the operator.isSequenceType() calls into isinstance(x, collections.Sequence). The test would become False when I pass in a numpy.ndarray. This can be a very sneaky bug. |
Just encountered it as well with Python 2.7, using the python-pillow module. Image.point(lut, mode) calls isinstance(lut, collections.Sequence), the previous version used operator.isSequenceType() |
Now might be a good time to revisit this since the numpy numeric scalar types were registered (#4547), |
Yep, that's a good compromise. |
Yes, please simply add |
@jsbueno my problem is that I don't really see what additional, or in between definition would actually be helpful for users of I think SymPy actually got it more right. It iterates through all elements of its matrices, which at least makes it a With the risk of repeating a lot from above, aside from 1-D arrays, what are numpy arrays?:
So even some of the most fundamental properties clash. NumPy could be a Personally, my expectation is that 1-D arrays aside, array-likes are simply very different beasts compared to Python Collections. When considering the iteration behaviour, you would need a When looking beyond what is currently defined by |
The only thing would be to mark one dimensional arrays, and only one dimensional arrays as sequences, since they do not have the mismatch of subarray vs. element. At which point, yes, |
Thank you for the response, and I apologise again for jumping in the 8 year long wagon here. With your comment, couple hours after the last e-mail exchange, and chatting with another friend in the middle, I concur most of these things are better left as they are. Sometime in the future Python can opt to have a more formal definition of Sequence than "whatever collections.abc.Sequence implements now". I will just add, after reading your comments above, that I think that the characteristics you listed as "what makes a Python Sequence" is lacking the most important feature that makes ndarrays resemble sequences like lists and tuples for me: having a contiguous index-space that can address all individual elements. But I don't think formalizing an abc for that would be of any practical value, either in coding or in static-type hinting. |
@seberg That's a great synopsis. This issue seems to be about using class ndarray(Generic[T]):
def as_container(self) -> Container[T]:
if self.ndim == 0:
raise ValueError
return ContainerView(self) # correctly answers __len__, __iter__ etc.
def as_subarray_iterable(self) -> Iterable[np.ndarray[T]]:
if self.ndim <= 1:
raise ValueError
return SubarrayIterableView(self)
def as_scalar_sequence(self) -> Sequence[T]:
if self.ndim != 1:
raise ValueError
return ScalarView(self)
def as_subarray_sequence(self) -> Sequence[np.ndarray[T]]:
if self.ndim <= 1:
raise ValueError
return SubarraySequenceView(self) # this view has to reinterpret __contains__ to do the expected thing. Instead of |
Should that last annotation be |
@eric-wieser Yup! Thanks. What are your thoughts? |
Well, |
@eric-wieser Yeah, I thought it would be cheaper to provide a view, but I have no idea. |
Well, |
I still think we are focusing too much about what can be done and not enough on what the problems are at this time. In particular, all of the methods you give above are very easy to implement if you know that you have an ndarray-like (I disagree that 0-D arrays are not containers). So they would only be useful if there was a standardized ABC for them, and in that case it would also be sufficient to define that basic indexing is numpy compatible and maybe include the The original issue ( I am sure we do break some duck-typing code. Some problems probably occur with serialization (I don't have examples at hand). And many of such code will have no issue with using Methods such as the above may be reasonable as a consumer protocol to work with any array-like, but is that the problem we are trying to solve :)? They seem not like things typical Python sequences would want to expose. |
You're right of course, but you may not iterate over the whole sequence. You might only pick out a few elements.
I agree with you. What kind of problems are you imagining? I'm imagining that when numpy 1.10 comes out with types, I will sometimes want to use a one-dimensional numpy array as a sequence. If I want to do that currently, I need to:
That's why I want to provide a method to automatically do that. I hate big interfaces too, but it seems to me like these kinds of methods or bare functions are going to be more and more prevalent as type annotations catch on. What do you think?
I have no idea, but currently you are raising on
I didn't want to add that into my suggestion, but I think that's where you're headed anyway. I'm an avid user of the wonderfully-designed JAX library. I imagine that in the future, A few years ago, I searched for this issue in order to suggest that
Now that you mention that, maybe all of my proposed methods should have returned views. That way, they can correctly answer isinstance checks.
I definitely agree that the answer to this depends on the problems we are trying to solve. Having drunk the type annotation kool aid, I'm interested in writing succinct numpy code that passes mypy without littering code with |
Well, type hints and interop with other array-like objects are a good motivation probably. I might suggest opening a new issue or mailing list thread. Right now, I am not sure what best to think about here, typing is forming, so maybe that will end up clarifying some things. |
numpy arrays don't (and likely won't) fulfill the Sequence interface numpy/numpy#2776 numpy/numpy#7315 The array gets converted down to a tuple any way, and each element must be an integer too. Unit test modified to allow this behavior.
Without being an instance match [[1, 2], [3, 4]]:
case [a, b]:
print(a, b)
match np.array([[1, 2], [3, 4]]):
case [a, b]:
print(a, b) |
Right, but match np.zeros(()):
case []:
pass # Can't be iterated. |
Not sure what you tried to demonstrate here. Structural pattern matches seem to only care about unpacking, IIRC |
The concensus that has emerged in this discussion is that I don't know how to fix pattern matching to support NumPy arrays, but it would need to use a different approach. |
Alright, I forwarded this request to https://discuss.python.org/t/relax-requirement-for-sequence-pattern-matches-in-pep-622/22258 |
There's nothing wrong with only 1D arrays implementing This issue is still relevant with NumPy's recent typing. |
Can you think of any example of the value of an object affects its parent types? Have you read the thread linked from the comment above yours? |
@juliantaylor raised this in a pandas issue.
The example from the ticket:
This occurs on 3.3 with 1.7.0rc1.dev-3a52aa0, and on 3.2 with 1.6.2.
2.7 is unaffected of course.
The relavent code from cpython/Lib/random.py:297
I couldn't grep another location in the stdlib with a similar test, but lib2to3
did show an assumed equivalence:
in 2.7
but in 3.3/3.2
The text was updated successfully, but these errors were encountered: