Enh: Object array creation function #5933

Open
toddrjen opened this issue Jun 2, 2015 · 4 comments

@toddrjen

toddrjen commented Jun 2, 2015

As discussed in issue #5303, it is currently not possible to directly create an object-dtype array whose elements are equal-length sequences, because the array function automatically unpacks the nested sequences into additional array dimensions. One suggestion there is to do this unpacking only for lists, but that would be a major backwards-compatibility break and would require a long deprecation period.
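
To make the problem concrete, here is a small illustration of the current behaviour. The variable names are only for this example.

```python
import numpy as np

pairs = [(1, 2, 3), (4, 5, 6)]

# Even with dtype=object, equal-length sequences are unpacked into an
# extra dimension instead of being stored as two tuple elements.
unpacked = np.array(pairs, dtype=object)

print(unpacked.shape)         # (2, 3)
print(type(unpacked[0, 0]))   # <class 'int'>
```

The tuples themselves are lost: each cell holds one of the integers, not one of the original sequences.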

Another approach would be to have a function explicitly for creating arrays with an object dtype. Perhaps this could be called "objectarray". The default for this function would be to take in a sequence, and consider each element of the sequence as an element in a 1D object array.

The function could, however, take an optional "ndim" or "depth" argument specifying how many additional levels of nesting should be treated as array dimensions. This would default to 0 (only the outermost level becomes a dimension). The function would raise an exception if the sequences at a requested level differ in length.

Note that this approach is not mutually exclusive with the alternative, but has the advantage that it wouldn't break backwards-compatibility.

So for example:

>>> arr = objectarray([((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))])
>>> arr
array([((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))], dtype=object)
>>> arr.shape
(2,)

>>> arr = objectarray([((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))], depth=1)
>>> arr
array([[(1, 2, 3), (4, 5, 6)],
       [(7, 8, 9), (10, 11, 12)]], dtype=object)
>>> arr.shape
(2, 2)

>>> arr = objectarray([((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))], depth=2)
>>> arr
array([[[1, 2, 3],
        [4, 5, 6]],

       [[7, 8, 9],
        [10, 11, 12]]], dtype=object)
>>> arr.shape
(2, 2, 3)
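
The examples above could be reproduced today with a pure-Python helper along the following lines. This is only a sketch of the proposed semantics: `objectarray` and `depth` are the hypothetical names from this proposal, `_shape_and_leaves` is a helper invented for the illustration, and nothing here is an existing NumPy API.

```python
import numpy as np


def _shape_and_leaves(seq, depth):
    """Walk `depth` extra levels of nesting; return (shape, flat leaves)."""
    level = list(seq)
    shape = [len(level)]
    for _ in range(depth):
        lengths = {len(item) for item in level}
        if len(lengths) != 1:
            # Covers ragged input (and an empty outer sequence with depth > 0).
            raise ValueError("sequences at the same depth differ in length")
        shape.append(lengths.pop())
        # Flatten one level, preserving row-major order.
        level = [child for item in level for child in item]
    return tuple(shape), level


def objectarray(seq, depth=0):
    """Hypothetical: build an object array, unpacking only `depth` extra levels."""
    shape, leaves = _shape_and_leaves(seq, depth)
    flat = np.empty(len(leaves), dtype=object)
    for i, leaf in enumerate(leaves):
        # Scalar-index assignment stores the object itself, never unpacking it.
        flat[i] = leaf
    return flat.reshape(shape)


data = [((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))]
print(objectarray(data).shape)           # (2,)
print(objectarray(data, depth=1).shape)  # (2, 2)
print(objectarray(data, depth=2).shape)  # (2, 2, 3)
```

A real implementation would presumably live in C next to the array function's own shape discovery; the element-by-element loop here only pins down the intended behaviour.
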
@ahaldane
Member

I think the easiest way to get equal-sized lists into an object array is in two steps:

>>> a = np.empty((2,), dtype=object)
>>> a[:] = [[1,2,3],[4,5,6]]

>>> b = np.empty((2,3), dtype=object)
>>> b[:] = [[1,2,3],[4,5,6]]

Probably an implementation of objectarray would work like this.

@toddrjen
Author

Yes, that is currently the best way, but it is needlessly verbose. Hence this idea.

I would hope that an implementation of this idea could simply bypass the automatic conversion used in the array function and supply its own conversion to the ndarray constructor.

@toobaz

toobaz commented Dec 4, 2017

Hope I'm not missing anything, but it seems to me that an ndmax argument would not only solve the problem reported ("create arrays of object dtype containing equal-length sequences"), but also bring performance gains in cases where, e.g., the last object in the input is not a list (or is a list of a different length). Also see this question.

@bergkvist

Any progress on, or plans to implement, ndmax? What I'm doing right now:

np.array([*data, None], dtype=object)[:-1]

# This would look a lot cleaner:
np.array(data, ndmax=1)
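
For comparison, here is a small sketch of two workarounds that give a 1-D object array of lists today. It assumes `data` is a list of equal-length lists; the variable names are illustrative. The sentinel version passes `dtype=object` explicitly, since NumPy 1.24 and later refuse ragged input without it.

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]

# Sentinel trick: the trailing None makes the input ragged, so only the
# outermost level becomes a dimension; the sentinel is then sliced off.
via_sentinel = np.array([*data, None], dtype=object)[:-1]

# Preallocate-and-assign: the 1-D object target limits unpacking to one level.
via_empty = np.empty(len(data), dtype=object)
via_empty[:] = data

print(via_sentinel.shape, via_empty.shape)  # (2,) (2,)
```

Both keep the inner lists intact as elements, which is the behaviour an ndmax=1 argument would presumably give directly.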
