
Default int type is platform dependent #9464

Open
eric-wieser opened this issue Jul 26, 2017 · 28 comments

Comments
@eric-wieser
Member

eric-wieser commented Jul 26, 2017

np.array([1]).dtype is platform-dependent, presumably because it defaults to np.int_

  1. Is this by design?
  2. If not, can we force it to int64?
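
The platform dependence is easy to observe; a minimal sketch (the int32 result applies to 64-bit Windows under NumPy 1.x):

```python
import numpy as np

# NumPy infers the default integer dtype for Python ints, which follows C long:
a = np.array([1])
print(a.dtype)  # int64 on 64-bit Linux/macOS, int32 on 64-bit Windows (NumPy 1.x)

# On every platform the inferred dtype is np.int_; only its width differs:
print(a.dtype == np.dtype(np.int_))  # True
```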
@njsmith
Member

njsmith commented Jul 26, 2017

It is by design – the idea is that numpy's default int type matches the range of python 2's int, which in turn matches the platform C compiler's long.

Whether this is a good design is another question, especially since python 3 has eliminated this. There have been intermittent discussions about changing it before that you can probably dig up – especially about the confusing and error-prone way the default is 32 bits on win64.

I suppose one way to move that discussion forward would be to test whether any major packages break if you do make that change.

@pv
Member

pv commented Jul 26, 2017

One thing that may break is if someone is using dtype=int and assumes this is somehow related to C long type...

@juliantaylor
Contributor

juliantaylor commented Sep 1, 2017

Changing the default int type on 64-bit Windows to 64 bits would IMO be an important enough change to warrant breaking software.
The current behavior just causes too many bugs.

That the default int type on 32-bit platforms is a 32-bit int is probably not so bad, as it does at least cover the full addressable range, and changing it could have a performance impact.

@shoyer
Member

shoyer commented Mar 20, 2018

We should seriously consider changing this.

In my experience, if a Python library of moderate complexity that uses NumPy does not run Windows-specific tests, it is probably broken for this reason.

@KelSolaar
Contributor

@shoyer : We ran into this exact problem on Windows with @MichaelMauderer on colour-science/colour#431.

I was assuming incorrectly that np.int_ was platform independent.

@eric-wieser
Member Author

eric-wieser commented Sep 15, 2018

Perhaps we should drop this default at the same time as python 2, since the sole reason for defaulting to np.int_ was that it matched the size of builtins.int, which in python 3 is not even true.
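
A quick illustration of that mismatch (a sketch; the exact np.int_ bound depends on the platform's C long):

```python
import numpy as np

# Python 3 ints are arbitrary precision:
x = 2**100  # fine as a builtin int

# np.int_ is a fixed-width C long, so it no longer mirrors builtins.int:
print(np.iinfo(np.int_).max)  # 2**63 - 1, or 2**31 - 1 where C long is 32 bits
```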

@lawrence858

Ideally numpy should behave the same way across platforms. A colleague of mine uses Windows and recently had to spend some time trying to figure out why a program was yielding different results on his machine than on my Mac. IMO performance considerations pale in comparison to getting correct and consistent results.

@gojomo

gojomo commented Jul 7, 2020

Is there any runtime workaround a user could execute, before their other code, to force numpy-on-Windows default types to the same widths as elsewhere? (Perhaps, a data-driven, tamperable mapping of Python types to numpy types?)

As a fresh example of some of the resulting craziness, specifically asking for an array of a type compatible with type(2**32) results in an array that can't store 2**32:

    def testTiny(self):
        a = np.empty(1, dtype=type(2**32))
>       a[0] = 2**32
E       OverflowError: Python int too large to convert to C long

@adeak
Contributor

adeak commented Jul 7, 2020

@gojomo I'm not sure that's the right approach anyway. On python 3, type(2**32) is guaranteed to be int, so that's just a more complicated way of saying dtype=int. If you're using a literal like that anyway, you could of course use an explicit dtype=np.int64.

To make it more dynamic, does dtype=np.array(2**32).dtype work? (Odds are there are even more idiomatic ways to do this.)
EDIT: np.empty_like(2**32, shape=...) is probably it, assuming that works.
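
The value-driven variant can be sketched like this (a minimal example; on platforms where the default int is 32 bits, NumPy's value-based inference still promotes 2**32 to a 64-bit dtype):

```python
import numpy as np

# Infer the dtype from the value itself rather than from type(value):
dt = np.array(2**32).dtype
print(dt)  # int64, even where the default int is only 32 bits

a = np.empty(1, dtype=dt)
a[0] = 2**32  # no OverflowError now
```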

@seberg
Member

seberg commented Jul 7, 2020

No, I had a PR to add one, maybe I can open that again now that we decided to start the deprecation on some of the aliases: #16535

So either use dtype=np.intp which gives you 32bit on 32bit systems and 64bit on 64bit systems, or use dtype=np.int64 to begin with. That PR made dtype=np.intp the default, which is the simpler change, because intp is fairly common in NumPy already.
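
A small check of the two suggestions (a sketch; the exact widths depend on the build):

```python
import numpy as np

# np.intp matches the pointer size: 64 bits on any 64-bit build, including
# 64-bit Windows, where the default np.int_ is only 32 bits under NumPy 1.x.
print(np.dtype(np.intp).itemsize * 8)  # 64 on a 64-bit build, 32 on a 32-bit one

# np.int64 is explicit and identical everywhere:
a = np.arange(5, dtype=np.int64)
print(a.dtype)  # int64
```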

@adeak
Contributor

adeak commented Jul 7, 2020

I was thinking that if NEP 31 ever happens, it would also make this kind of replacing defaults easily opt-in.

@gojomo

gojomo commented Jul 7, 2020

@adeak My snippet's not a literal example; my actual issue is that I've got a list of many ints, which eventually reach 2**32, but a numpy array typed based on the first int breaks on Windows when it reaches 2**32, but works everywhere else.

(I was hoping the snippet highlighted some of the on-the-face absurdity of the Python-to-numpy interaction: shouldn't a reported type for a specific number specifically-communicate a corresponding type wide enough to store it? But I suppose Python is an equal contributor to the problem, as 2**65 & 2**129 have the same problem of reporting as simple int. So it's more a brain-teaser than a guide to better behavior.)

I'd answer the "is numpy's choice a good design?" question in @njsmith's 2017 comment as: "Reasonable way back when, but not anymore, with Python 3, the primacy of 64-bit systems, and Microsoft's own phasing-out of Windows 10's support for 32-bit systems."

Traffic on this issue since then has referenced many places where this has caused problems for people, but not yet any extant examples of code that'd break with a changed default. (There's probably some, somewhere.)

If the plunge of changing the default in one swoop is too risky, a call that opts-in to some minimum-width default (or user-chosen default) for all subsequent mappings of Python's int might help. (And then at some later date with warning to Windows users, change the default, but give laggards an option to change it back for a while.)
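
In the absence of such an opt-in switch, a workaround along those lines can only live at the user level; a hypothetical sketch (array64 is not a NumPy API, just an illustration of the pattern):

```python
import numpy as np

# Hypothetical helper (not a NumPy API): map Python ints to a fixed-width
# dtype regardless of the platform default.
def array64(obj, dtype=None, **kwargs):
    arr = np.asarray(obj, dtype=dtype, **kwargs)
    # Promote only when NumPy silently chose its (possibly 32-bit) default int:
    if dtype is None and arr.dtype == np.dtype(np.int_) and arr.dtype != np.dtype(np.int64):
        arr = arr.astype(np.int64)
    return arr

print(array64([1, 2, 3]).dtype)  # int64 on every platform
```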

jarulraj added a commit to georgia-tech-db/evadb that referenced this issue Jan 2, 2023
> Updated code to support both linux and windows (mainly signals and numpy array default type)
Default int type is platform dependent numpy/numpy#9464
> Removed lsof that is not available on windows -- using psutil to detect and kill eva server
> OCR and Facenet UDFs do not work on Windows.
@nicolascofrer

Just realized this is the reason I'm getting different results on different machines. This is a very important issue. As others said, getting the same results seems way more important than "breaking something".

@eric-wieser
Member Author

As others said, getting the same results seems way more important than "breaking something".

"breaking something" is just another type of "not getting the same results" (between "now" and "then", as opposed to "windows" and "Linux")

@nicolascofrer

I think there is a difference between breaking something with an error or warning as opposed to silently giving different results. Consider numpy random problems, for instance, which might be hard to debug because you don't know whether the source of the difference is randomness or not.

@seberg
Member

seberg commented Feb 15, 2023

I think there is a difference between breaking something with an error or warning as opposed to silently give different results.

Yes, which is exactly why this is not so simple. You must expect that such a change will break a decent number of users' code without any error.

@njh219

This comment was marked as off-topic.

@fmaussion

I agree that changing this behavior can only be done silently (all of a sudden, all Windows users will get int64 instead of int32, and it will break an undetermined amount of code X).

At the same time, it will probably uncover painful bugs in an undetermined amount of code Y and greatly simplify the behavior for learners and library developers.

What is the typical way to decide on this, when there is no way to estimate X or Y? The path of least resistance is definitely to do nothing in this case...

@seberg
Member

seberg commented Feb 15, 2023

@fmaussion writing a brief NEP would be good. At this point, we should probably consider including such change in a major release (but it might still be good to summarize things in a NEP!), since I hope that isn't too far off.

I also suspect that the sane choice is probably (unfortunately) to switch to intp as default (i.e. 64bit on 64bit windows and not attempting any change e.g. on 32bit linux). But NEP would be the place to summarize that.

I added a switch for "use NumPy 2 behavior" very recently, so once there is some general consensus to push for this, there is also a path to start implementing it as planned for the major release.

@fmaussion

Thanks @seberg, this sounds reasonable - I'll see if I can find the bandwidth to get things started, but I'll certainly need help.

@seberg
Member

seberg commented Feb 15, 2023

Don't hesitate to get in touch with me. There are too many things for the core team to push, so someone helping champion such a change makes it much more likely to happen!

@njh219

This comment was marked as off-topic.

@xor2k
Contributor

xor2k commented Feb 24, 2023

Hi everybody! I just experienced this problem in my project npy-append-array, compare

xor2k/npy-append-array#6

The problem was that while on macOS and Linux int64 is the default, on Windows it is int32, even on a 64-bit operating system. My solution was basically to replace all NumPy functions with their corresponding Python functions, like numpy.multiply.reduce with math.prod and numpy.ceil with math.ceil. I could also have specified dtype=np.int64 explicitly, but that would be quite verbose and hopefully not necessary anymore in the future.
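
The pattern can be sketched like this (a minimal example; the overflow claim applies to 64-bit Windows under NumPy 1.x, where the default int is int32):

```python
import math
import numpy as np

shape = (50000, 50000)

# Platform-dependent: reducing a tuple of Python ints goes through the default
# int dtype, which is int32 on 64-bit Windows (NumPy 1.x) and overflows there.
n_numpy = np.multiply.reduce(shape)

# Platform-independent: Python ints are arbitrary precision and never overflow.
n_python = math.prod(shape)

print(n_python)  # 2500000000 everywhere; n_numpy matches only where ints are 64-bit
```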

If this is API breaking or so, maybe it would be something for Numpy 2.0, wouldn't it?

@xor2k
Contributor

xor2k commented Feb 24, 2023

@fmaussion writing a brief NEP would be good. At this point, we should probably consider including such change in a major release (but it might still be good to summarize things in a NEP!), since I hope that isn't too far off.

I also suspect that the sane choice is probably (unfortunately) to switch to intp as default (i.e. 64bit on 64bit windows and not attempting any change e.g. on 32bit linux). But NEP would be the place to summarize that.

I added a switch for "use NumPy 2 behavior" very recently, so once there is some general consensus to push for this, there is also a path to start implementing it as planned for the major release.

I have no bandwidth either, but if you need someone to write that NEP, I can give it a shot.

@joaoe

joaoe commented Apr 17, 2023

hi.
This issue causes serious compatibility problems between Windows 64 and Linux/Mac. E.g., another one unionai-oss/pandera#726

@albertopasqualetto

I think that this issue should be mentioned on the numpy.array documentation page, because the phrase "[...] NumPy will try to use a default dtype that can represent the values (by applying promotion rules when necessary.)" is misleading.

leroyvn added a commit to eradiate/eradiate that referenced this issue Feb 27, 2024
This commit fixes a platform-specific bug detected on Windows. One major
issue is that Numpy's default integer type is platform-dependent, and it
is not the same on Linux/macOS as on Windows. This has been a subject of
discussion for a while, and the Numpy community does not seem to agree
to make the default integer type the same on all platforms
(see e.g. numpy/numpy#9464).

Consequently, one must be very careful when manipulating ints on
Windows. It turned out that wavelength definitions in mono modes were
not cast to a floating-point type and remained as int32, leading to
incorrect calculations when passed to the air scattering coefficient
computation routine (which computes wavelength ** 4 and overflows int32
value bounds).
leroyvn added a commit to eradiate/eradiate that referenced this issue Feb 27, 2024
leroyvn added a commit to eradiate/eradiate that referenced this issue Feb 27, 2024
westonpace added a commit to lancedb/lance that referenced this issue Apr 11, 2024
The test assumes that ray will infer the range as int64. However, it
uses numpy to do the inference and numpy's integer inference is platform
dependent: numpy/numpy#9464