
[WIP] fix some dimensionality inconsistencies #11147

Closed
wants to merge 23 commits

Conversation

nschloe
Contributor

@nschloe nschloe commented Nov 29, 2019

While working on numpy/numpy#10615, I noticed that scipy is partly inconsistent about whether it requires floats or arrays that may only contain one element. The numpy work led me to discover some bugs, e.g., object-dtype arrays that look like

np.array([1.0, 4.0, np.array([2.5])])

which really wants to be the float array

np.array([1.0, 4.0, 2.5])

I've fixed a bunch of those here.

Comments requested.
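For context, here is a small sketch (not part of the PR diff) of what goes wrong: in the numpy versions of that era, mixing floats with length-1 arrays silently produced an object-dtype array, while newer numpy releases warn about or reject such ragged input.

```python
import numpy as np

good = np.array([1.0, 4.0, 2.5])
print(good.dtype)            # float64

try:
    bad = np.array([1.0, 4.0, np.array([2.5])])
    print(bad.dtype)         # object on older numpy
except ValueError:
    # recent numpy rejects the ragged input outright
    print("inhomogeneous input rejected")
```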

Resolved review threads:
scipy/io/_fortran.py
scipy/optimize/_basinhopping.py
scipy/optimize/_basinhopping.py
scipy/integrate/tests/test_bvp.py
scipy/optimize/_shgo.py
scipy/optimize/tests/test__numdiff.py
@@ -185,6 +185,8 @@ def fun_non_numpy(self, x):
         return math.exp(x)
 
     def jac_non_numpy(self, x):
+        # Input can be float x or numpy.array([x]). Cast it all into numpy.array(x).
Contributor

Again, this feels like a bug in the caller - it really ought to be certain what shape it calls this with.

Contributor Author

@nschloe nschloe Nov 30, 2019


This is from x0 in scipy/optimize/_numdiff.py. The doc says

    x0 : array_like of shape (n,) or float
        Point at which to estimate the derivatives. Float will be converted
        to a 1-d array.

The "Float will be converted to a 1-d array." part is weird.
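For illustration, this is roughly how the documented behavior plays out (using the private helper that docstring belongs to; details may differ across scipy versions):

```python
from scipy.optimize._numdiff import approx_derivative

def fun(x):
    print(type(x), x.shape)   # x arrives as a 1-d array, even for a float x0
    return x[0] ** 2

approx_derivative(fun, 3.0)   # the float x0 is promoted to a shape-(1,) array
```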

Contributor

Honestly, it feels like what should actually happen is something like:

x0_flat = np.asarray(x0).ravel()

x0_revised = do_linalgy_stuff_that_needs_x_1d(x0_flat)

f(x0_revised.reshape(x0.shape))
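As a rough, runnable version of that pattern (the helper names below are purely illustrative, not scipy internals): flatten x0 once at the boundary, do the 1-d work, and restore the caller's original shape before calling f again.

```python
import numpy as np

def do_linalgy_stuff_that_needs_x_1d(x_flat):
    # stand-in for internals that genuinely need a 1-d array
    return x_flat * 0.5

def with_original_shape(f, x0):
    x0 = np.asarray(x0, dtype=float)
    x0_flat = x0.ravel()                       # always work on 1-d internally
    x0_revised = do_linalgy_stuff_that_needs_x_1d(x0_flat)
    return f(x0_revised.reshape(x0.shape))     # hand f back the caller's shape

print(with_original_shape(np.sum, 3.0))            # float in, 0-d result out
print(with_original_shape(np.sum, [[1.0, 2.0]]))   # (1, 2) in, reshaped back to (1, 2)
```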

Contributor

I'll perhaps try a patch for that in the next few weeks

Contributor

The purpose behind that docstring is to let the user know that f is going to be passed an array regardless of whether x is an array or a float.

@nschloe nschloe changed the title fix some dimensionality inconsistencies [WIP] fix some dimensionality inconsistencies Nov 30, 2019
@andyfaff
Contributor

andyfaff commented Dec 1, 2019

Hmm, this PR is touching a lot of files and it's currently not clear to me that all the changes are required. Is backward compatibility preserved for all the changes (assuming there are no bugs), and have tests been added to that effect?

Because it touches so many files, I don't think it's going to be possible to give a deep enough review of all the changes. Can you take a more graded approach, with each specific set of code files addressed one by one and the necessary tests added? In this situation it's better to write the tests first, so a real issue is demonstrated, and then apply the fix.

Let's take the differential_evolution code as an example. For the changes you made, I'd like to see a test where the objective function returns either a float or an array with shape (1,). Only if the test shows an issue would it be time to do something.

>>> import numpy as np
>>> from scipy.optimize import differential_evolution, rosen
>>> def wr(x):
...     return np.array([rosen(x)])

>>> differential_evolution(rosen, [(0, 2), (0, 1)])
     fun: 0.0
 message: 'Optimization terminated successfully.'
    nfev: 4863
     nit: 161
 success: True
       x: array([1., 1.])
>>> differential_evolution(wr, [(0, 2), (0, 1)])
     fun: 0.0
 message: 'Optimization terminated successfully.'
    nfev: 4653
     nit: 154
 success: True
       x: array([1., 1.])

Here, there don't seem to be any issues, and it's not worth the code churn.
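For reference, the kind of regression test being asked for here could look roughly like this (a hypothetical test, not part of the PR): the objective returns a shape-(1,) array instead of a float, and the result should still match the scalar case.

```python
import numpy as np
from numpy.testing import assert_allclose
from scipy.optimize import differential_evolution, rosen


def test_objective_returning_length_one_array():
    # objective that returns a shape-(1,) array instead of a plain float
    def wr(x):
        return np.array([rosen(x)])

    bounds = [(0, 2), (0, 1)]
    res_scalar = differential_evolution(rosen, bounds, seed=1)
    res_array = differential_evolution(wr, bounds, seed=1)

    # same seed and same objective values, so the results should agree
    assert_allclose(res_array.x, res_scalar.x, atol=1e-6)
```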

@eric-wieser
Contributor

I would argue it is a bug that the second output does not say fun: [0.0]

scipy/optimize/_shgo.py (resolved review thread)
@@ -56,7 +56,8 @@ class StructTest2(StructTestFunction):
     """
 
     def f(self, x):
-        return (x - 30) * numpy.sin(x)
+        # f must return a scalar
+        return (x[0] - 30) * numpy.sin(x[0])
Contributor

There's no point in making these kinds of changes. That is, unless there is a test showing that the minimisation goes wrong before vs. after the change.

Contributor Author

One cannot optimize functions that return vectors. The change makes the expression mathematically sound and relieves numpy of the burden of having to convert your vector into a scalar in the background (which is what's happening now).
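A small illustration of the point (shapes only; not from the PR): with a length-1 input, the original expression returns a length-1 array, and it is the downstream code that quietly turns it back into a scalar.

```python
import numpy as np

x = np.array([2.0])

vector_result = (x - 30) * np.sin(x)        # shape (1,), not a scalar
scalar_result = (x[0] - 30) * np.sin(x[0])  # a genuine scalar

print(vector_result.shape)     # (1,)
print(np.ndim(scalar_result))  # 0
print(float(vector_result))    # relies on implicit size-1 conversion, deprecated in newer numpy
```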

@pv
Member

pv commented Dec 1, 2019 via email

@eric-wieser
Contributor

eric-wieser commented Dec 1, 2019

> I would argue it is a bug that the second output does not say fun: [0.0]

Actually, I've changed my mind here - optimize doesn't make sense on non-scalar results, given it's documented to "[minimize a] scalar function of one or more variables", and a 1-vector is not a scalar.

So the result of the second case is ok, but should be accompanied by a warning that the result is not a scalar.
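A rough sketch of what such a warning could look like (illustrative only; this is not how scipy.optimize is actually implemented): wrap the user's objective once at entry and warn if it returns something non-scalar, while still passing the value through.

```python
import warnings

import numpy as np


def warn_if_not_scalar(func):
    """Wrap an objective and warn when it returns something non-scalar."""
    def wrapper(x, *args):
        fx = func(x, *args)
        if np.ndim(fx) != 0:
            warnings.warn(
                "objective function returned an array of shape "
                f"{np.shape(fx)}; a scalar was expected",
                RuntimeWarning, stacklevel=2)
        return fx
    return wrapper
```

That keeps size-1 returns working while making the mismatch visible to the user.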

@nschloe
Contributor Author

nschloe commented Dec 1, 2019

> Actually, I've changed my mind here - optimize doesn't make sense on non-scalar results, given it's documented to "[minimize a] scalar function of one or more variables", and a 1-vector is not a scalar.

👍

> They are probably representative of the user code that exists out there, and we'd like to keep that working.

It makes no sense to provide a vector-returning function to an optimization method. Optimization can only happen on scalars. If your function returns a vector, then probably something is wrong. It's a good thing to have a warning about that.

@pv
Member

pv commented Dec 1, 2019

> It makes no sense to provide a vector-returning function to an optimization method

So far in numpy, size-1 arrays have been duck-scalars by definition in many contexts. Whether it makes sense or not does not matter here: this has been how it has worked for 15 years, and we want to be sure the changes in this PR are not accidentally breaking API compatibility.
This is why I would suggest keeping the tests as they are; we can silence warnings as necessary for the first pass of edits.
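As a minimal sketch of "silence warnings as necessary" in a test (the test name is hypothetical, and it assumes the legacy call path emits a DeprecationWarning for the size-1-array-as-scalar case):

```python
import warnings

import numpy as np


def test_size1_array_still_accepted():
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        # legacy-style usage: a size-1 array where a scalar is documented
        value = float(np.array([0.25]))
    assert value == 0.25
```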

@nschloe
Contributor Author

nschloe commented Dec 1, 2019

Okay, let me split out some of the smaller changes.

@mattip
Contributor

mattip commented Dec 9, 2019

Some of these fixes are needed for the upcoming change in numpy that will warn when creating an object array from np.array([0.25, np.array([0.3])]). That change was reverted for now, but will be reinstated.

@nschloe
Contributor Author

nschloe commented Jun 22, 2021

Closing in favor of #13864, #14274, and #14278.

@nschloe nschloe closed this Jun 22, 2021
Labels: maintenance (Items related to regular maintenance tasks)

6 participants