Unexpected output from arange with dtype=int #16159

davrot · 2020-05-05T14:10:20Z

In [3]: np.arange(-3, 0, 0.5, dtype=int)
Out[3]: array([-3, -2, -1, 0, 1, 2])

Seeing a "1" and a "2" in the output was unexpected for us, since both numbers are greater than the stop value of 0.

Normally, this is the result without dtype=int:

In [2]: np.arange(-3, 0, 0.5)
Out[2]: array([-3. , -2.5, -2. , -1.5, -1. , -0.5])

and we should get this with dtype=int:

In [4]: np.arange(-3, 0, 0.5).astype(int)
Out[4]: array([-3, -2, -2, -1, -1,  0])

The numpy manual states:
dtype : dtype
The type of the output array. If dtype is not given, infer the data type from the other input arguments.

Thus it should only affect the output array, right?

import numpy as np
print(np.arange(-3, 0, 0.5))
print(np.arange(-3, 0, 0.5, dtype=int))
print(np.arange(-3, 0, 0.5).astype(int))

Error message:

No error message...

Numpy/Python version information:

We tested it under numpy '1.18.4' (pure Python 3.7.6) as well as '1.18.1' (Anaconda 3.7 with the latest update applied). Same result.

1.18.4 3.7.6 (default, Feb 28 2020, 15:25:38)
[Clang 11.0.0 (https://github.com/llvm/llvm-project.git eefbff0082c5228e01611f7

1.18.1 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0]

eric-wieser · 2020-05-05T14:40:54Z

Bugs like this are reported over and over again. For reasons lost to time, I'm fairly confident the implementation of arange is something like:

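import math
import numpy as np

def arange(start, stop, step, dtype):
    dtype = np.dtype(dtype)  # accept int, "int64", etc.

    # the length comes from the uncast arguments
    n = math.ceil((stop - start) / step)

    # dtype.type is a cast
    step = dtype.type(start + step) - dtype.type(start)

    # now do what you expect
    return [start + step*i for i in range(n)]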
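
The arithmetic behind the reported output can be traced by hand for the call from the original report. This is only a sketch: it assumes the cast truncates toward zero, with Python's int() standing in for it, and the variable names are illustrative rather than anything taken from NumPy.

```python
import math

start, stop, step = -3, 0, 0.5

# The length is computed from the original float step: 3 / 0.5 gives 6 elements.
n = math.ceil((stop - start) / step)

# The step is re-derived after casting: int(-2.5) - int(-3) = -2 - (-3) = 1.
int_step = int(start + step) - int(start)

# Six elements with a step of 1 overshoot the stop value of 0.
values = [int(start) + int_step * i for i in range(n)]
print(n, int_step, values)  # 6 1 [-3, -2, -1, 0, 1, 2]
```

The length is fixed before the step is rounded, which is why the sequence runs past stop.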

mattip · 2020-05-05T15:31:40Z

Perhaps we should add that pseudo-code to the documentation?

seberg · 2020-05-05T16:00:10Z

Yeah, that code is correct (not 100% sure about the n calculation though). This specific example is pretty extreme and obviously broken; maybe we can actually get rid of it somehow?

arange is repeatedly hated for its arguably broken definition, but I cannot think of a really good proposal to address it (although maybe one came up before).
It's not like we can change arange's behaviour for floats well (maybe precision fixups, but end-point changes are no good IMO). So we would need to create a new function... But then in most cases it seems to me that linspace is better than a "correct" float arange; I am not sure that a corrected float arange actually has too many use cases.

In the end, I guess I would like a well thought out proposal :/...
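
For the range in the original report, the linspace route might look like the sketch below. The point count is chosen explicitly by the caller here (an assumption of this example, not something the thread prescribes), and endpoint=False is used to mirror arange's half-open interval.

```python
import numpy as np

start, stop, step = -3.0, 0.0, 0.5

# The number of points is decided up front instead of being implied by the step.
num = int(round((stop - start) / step))

# endpoint=False keeps the interval half-open, like arange's [start, stop).
grid = np.linspace(start, stop, num, endpoint=False)

print(grid)
print(grid.astype(int))  # cast only after the float grid has been built
```

Because the cast happens after the grid is built, no value can end up beyond stop.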

davrot · 2020-05-05T17:30:44Z

Getting values that are bigger than "stop" is really not nice and a bit unexpected. If arange is not for float, you could check for the floaty numpy types and raise an exception.

Also, the manual entry for dtype really leads the user to expect something like an astype(dtype) conversion of only the output.

How about:
1.) Exception for non-integer arguments (i.e. start, stop, step).
2.) Check if stop >= start, otherwise raise an exception
3.) Cast start, stop, step to int64 in the beginning of the function.
4.) astype(dtype) the output

Instead of 1.) you can redirect to linspace inside of arange if a non-integer input is found.
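
The four steps proposed above could be sketched as a wrapper around np.arange. The name strict_arange and its signature are hypothetical and exist only for illustration; this is not an existing or planned NumPy function, and the linspace redirect variant is not covered.

```python
import numpy as np

def strict_arange(start, stop, step=1, dtype=np.int64):
    """Hypothetical wrapper illustrating the four proposed checks."""
    # 1.) Reject non-integer arguments (bool is excluded as well).
    for name, value in (("start", start), ("stop", stop), ("step", step)):
        if isinstance(value, bool) or not isinstance(value, (int, np.integer)):
            raise TypeError(f"{name} must be an integer, got {value!r}")

    # 2.) Require stop >= start.
    if stop < start:
        raise ValueError("stop must be >= start")

    # 3.) Cast the arguments to int64 at the beginning.
    start, stop, step = np.int64(start), np.int64(stop), np.int64(step)

    # 4.) Convert only the finished output to the requested dtype.
    return np.arange(start, stop, step).astype(dtype)

print(strict_arange(-3, 0))
print(strict_arange(0, 6, 2, dtype=float))
```

With this sketch, the call from the original report, strict_arange(-3, 0, 0.5), raises a TypeError instead of returning values beyond stop.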

aryanxk02 · 2020-11-05T19:55:19Z

Hey I'm a complete beginner to open source contribution. Thought of giving it a try. How about this snippet? @eric-wieser

x = []
for i in range(start, stop):
    x.append(i)
    x.append(i+step)
print(np.array(x, dtype))

InessaPawson · 2023-03-08T17:57:00Z

@aryanxk02 We reviewed the solution you proposed at today's triage meeting. It wouldn't solve the issue. Thank you for giving it a go!

WarrenWeckesser added 00 - Bug component: numpy._core labels May 5, 2020

eric-wieser added this to Issues in np.arange May 5, 2020

eric-wieser mentioned this issue May 15, 2020

rounding errors in np.arange #16251

Closed

royjacobson mentioned this issue Jul 11, 2020

DOC: Added a warning about fractional steps in np.arange #16796

Merged

InessaPawson added the triage review Issue/PR to be discussed at the next triage meeting label Feb 21, 2023

InessaPawson added triaged Issue/PR that was discussed in a triage meeting and removed triage review Issue/PR to be discussed at the next triage meeting labels Mar 8, 2023
