
BUG: dimension discovery fails when mixing scalars and shape==(1,) arrays #15075

Closed · mattip opened this issue Dec 9, 2019 · 5 comments

@mattip (Member) commented Dec 9, 2019

np.array([0.25, np.array([0.3])]) will fail to create a float array; the dtype will be object instead.

This seems wrong to me. Is it intentional?
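A minimal reproduction of the report (a sketch; the exact fallback depends on the NumPy version being discussed):

import numpy as np

# Mixing a Python float with a shape-(1,) array falls back to an object array.
a = np.array([0.25, np.array([0.3])])
print(a.dtype)   # object

# Mixing a float with a 0-d array still gives a plain float array.
b = np.array([0.25, np.array(0.3)])
print(b.dtype)   # float64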

@eric-wieser (Member):

Cc @nschloe, who is against this type of thing and can link to their issue.

@nschloe (Contributor) commented Dec 9, 2019

Thanks for CCing. Yep, this really looks like something I've been going on about. 😺 #10404

I would argue that creating an object array is the correct thing here. After all, you're putting a float and an array together. If you created a Python list,

[0.25, np.array([0.3])]

you'd expect the same thing: The first entry is a float, the second an array of length 1. It would be confusing if lists and np.arrays behaved differently here.

Also, implicitly creating a float-dtype array here would make it impossible to ever create a [float, vector[1]] array, even if I wanted to.

Most of the time, an expression like np.array([0.25, np.array([0.3])]) is written by mistake and can easily be fixed (a sketch of that kind of fix is below); see, e.g., https://github.com/scipy/scipy/pull/11147/files#diff-21a6a0b0d89357857304bfba2da5a971L321. After all,

Explicit is better than implicit.
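A sketch of the kind of fix meant above (illustrative only, not the actual scipy change; the variable names are made up):

import numpy as np

scalar = 0.25
vec = np.array([0.3])

# Accidental mix of a scalar and a length-1 array -> object dtype
mixed = np.array([scalar, vec])

# Being explicit about the intent gives a plain float array instead:
fixed = np.array([scalar, vec[0]])             # dtype float64
# or, if the pieces are really meant to be joined:
also_fixed = np.concatenate(([scalar], vec))   # dtype float64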

@mattip (Member, Author) commented Dec 9, 2019

OK, closing. That PR would have made the recent NEP 34 changes (since reverted) less disruptive to scipy.

mattip closed this as completed Dec 9, 2019
@mattip (Member, Author) commented Dec 10, 2019

@nschloe

implicitly creating a float-dtype array here would make it impossible to ever create a [float, vector[1]] array, even if I wanted to.

For the record, you could do np.array([0.3, np.array([0.3])], dtype=object).
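For example (a sketch, assuming the explicit dtype=object path just described):

import numpy as np

# Explicitly requesting dtype=object keeps the [float, vector-of-length-1] structure.
a = np.array([0.3, np.array([0.3])], dtype=object)
print(a.dtype)     # object
print(type(a[0]))  # <class 'float'>
print(a[1].shape)  # (1,)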

If you created a Python list ...

NumPy ndarrays are very different from Python lists. I would have no expectation that np.array([0.2, 0.3, 0.4]) would create an object array, even though I did not specify np.float64 for the dtype. So we agree we are comfortable with some level of automatic, value-based dtype discovery. The question is which should take precedence: numeric types or object types.

@nschloe (Contributor) commented Dec 10, 2019

So we agree we are comfortable with some level of automatic, value-based dtype discovery.

The "automation" is quite clear here I think: Always get the "lowest" data type that can capture all input values:

numpy.array([1, 2]).dtype               # int64
numpy.array([1, numpy.array(2)]).dtype  # int64; arrays of rank 0 are basically scalars
numpy.array([1.0, 2]).dtype             # float64
numpy.array([1, [2]]).dtype             # object
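These can be checked directly (a sketch; int64 assumes a typical 64-bit platform, and the ragged case producing object is the behavior discussed in this thread, before NEP 34 changes it):

import numpy as np

assert np.array([1, 2]).dtype == np.dtype('int64')
assert np.array([1, np.array(2)]).dtype == np.dtype('int64')   # 0-d array behaves like a scalar
assert np.array([1.0, 2]).dtype == np.dtype('float64')
assert np.array([1, [2]]).dtype == np.dtype('O')               # ragged input -> object array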
