Default int type is platform dependent #9464
Comments
It is by design – the idea is that numpy's default int type matches the range of Python 2's `int`. Whether this is a good design is another question, especially since Python 3 has eliminated this. There have been intermittent discussions about changing it before that you can probably dig up – especially the confusing and error-prone way the default is 32 bits on win64. I suppose one way to move that discussion forward would be to test whether any major packages break if you do make that change.
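The design described above can be checked in a couple of lines. (The annotated values are what one would expect on 64-bit Linux/macOS; on 64-bit Windows with NumPy older than 2.0, `np.int_` is 32-bit because the C `long` type is 32 bits there.)

```python
import numpy as np

# The default integer dtype tracks np.int_ (historically C long):
print(np.dtype(np.int_).itemsize)   # 8 on 64-bit Linux/macOS, 4 on win64 (NumPy < 2.0)

# Passing the Python builtin `int` as a dtype resolves to the same type:
print(np.dtype(int) == np.dtype(np.int_))
```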
One thing that may break is if someone is using dtype=int and assumes this is somehow related to the C long type...
Changing the default int type on 64-bit Windows to a 64-bit int would, IMO, be an important enough change to warrant breaking software. That the default int type on 32-bit platforms is a 32-bit int is probably not so bad, as it does at least cover the full addressable range, and changing it could have a performance impact.
We should seriously consider changing this. In my experience, if a Python library of moderate complexity that uses NumPy does not run Windows-specific tests, it is probably broken for this reason.
@shoyer: We ran into this exact problem on Windows with @MichaelMauderer on colour-science/colour#431. I was assuming incorrectly that
Perhaps we should drop this default at the same time as Python 2, since the sole reason for defaulting to the C `long` range was Python 2 compatibility.
Ideally numpy should behave the same way across platforms. A colleague of mine uses Windows and recently had to spend some time figuring out why a program was yielding different results on his machine than on my Mac. IMO, performance considerations pale in comparison to getting correct and consistent results.
Is there any runtime workaround a user could execute, before their other code, to force numpy-on-Windows default types to the same widths as elsewhere? (Perhaps a data-driven, tamperable mapping of Python types to numpy types?) As a fresh example of some of the resulting craziness, specifically asking for an array of a type compatible with
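As a sketch of the kind of workaround being asked about: as far as I know there is no supported global switch, so the portable option is to request a fixed width explicitly (`np.int64` here is an illustrative choice, not something the thread settles on):

```python
import numpy as np

# Relying on the default dtype is platform-dependent
# (int32 on win64 with NumPy < 2.0, int64 elsewhere):
ambiguous = np.array([1, 2, 3])

# Requesting a fixed-width dtype behaves identically on every platform:
portable = np.array([1, 2, 3], dtype=np.int64)
print(portable.dtype)   # int64, everywhere
```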
@gojomo I'm not sure that's the right approach anyway. On Python 3 To make it more dynamic, does
No, I had a PR to add one; maybe I can open that again now that we decided to start the deprecation of some of the aliases: #16535. So either use
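Whatever the elided recommendation above was, the relevant distinction is between NumPy's fixed-width aliases and its C-derived ones; a minimal sketch (the annotated widths assume mainstream platforms):

```python
import numpy as np

# Fixed-width aliases behave identically everywhere:
print(np.dtype(np.int64).itemsize)   # always 8

# C-derived aliases follow the platform ABI:
print(np.dtype(np.int_).itemsize)    # C long: 4 on win64 (NumPy < 2.0), 8 on linux64
print(np.dtype(np.intc).itemsize)    # C int: 4 on mainstream platforms
print(np.dtype(np.intp).itemsize)    # pointer-sized: 8 on any 64-bit OS
```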
I was thinking that if NEP 31 ever happens, it would also make this kind of replacing of defaults easy to opt into.
@adeak My snippet's not a literal example; my actual issue is that I've got a list of many ints, which eventually reach (I was hoping the snippet highlighted some of the on-its-face absurdity of the Python-to-numpy interaction: shouldn't a reported type for a specific number specifically communicate a corresponding type wide enough to store it?) But I suppose Python is an equal contributor to the problem, as I'd answer the "is numpy's choice a good design?" question in @njsmith's 2017 comment as: "Reasonable way back when, but not anymore, with Python 3, the primacy of 64-bit systems, and Microsoft's own phasing-out of Windows 10's support for 32-bit systems." Traffic on this issue since then has referenced many places where this has caused problems for people, but not yet any extant examples of code that would break with a changed default. (There's probably some, somewhere.) If taking the plunge of changing the default in one swoop is too risky, a call that opts in to some minimum-width default (or a user-chosen default) for all subsequent mappings of Python's
> Updated code to support both linux and windows (mainly signals and numpy array default type) Default int type is platform dependent numpy/numpy#9464
> Removed lsof that is not available on windows -- using psutil to detect and kill eva server
> OCR and Facenet UDFs do not work on Windows.
Just realized this is the cause of the different results I'm getting on different machines. This is a very important issue. As others said, getting the same results seems far more important than "breaking something".
"Breaking something" is just another type of "not getting the same results" (between "now" and "then", as opposed to between "Windows" and "Linux").
I think there is a difference between breaking something with an error or warning as opposed to silently giving different results. Consider numpy random problems, for instance, which might be hard to debug because you don't know whether the source of the difference is randomness or not.
Yes, which is exactly why this is not so simple. If you change it, you must expect that such a change will break a decent number of users without any error.
I agree that changing this behavior can only be done silently (all of a sudden, all Windows users will get int64 instead of int32, and it will break an undetermined amount of code X). At the same time, it will probably uncover painful bugs in an undetermined amount of code Y and greatly simplify the behavior for learners and library developers. What is the typical way to decide on this, when there is no way to estimate X or Y? The path of least resistance is definitely to do nothing in this case...
@fmaussion writing a brief NEP would be good. At this point, we should probably consider including such a change in a major release (but it might still be good to summarize things in a NEP!), since I hope that isn't too far off. I also suspect that the sane choice is probably (unfortunately) to switch to I added a switch for "use NumPy 2 behavior" very recently, so once there is some general consensus to push for this, there is also a path to start implementing it as planned for the major release.
Thanks @seberg, this sounds reasonable - I'll see if I can find the bandwidth to get things started, but I'll certainly need help.
Don't hesitate to get in touch with me. There are too many things for the core team to push, so someone helping champion such a change makes it much more likely to happen!
Hi everybody! I just experienced this problem in my project. The problem was that while on macOS and Linux, If this is API-breaking, maybe it would be something for Numpy 2.0, wouldn't it?
I have no bandwidth either, but if you need someone to write that NEP, I can give it a shot. |
hi. |
I think that this issue should be mentioned on the numpy.array documentation page, because the phrase "[...] NumPy will try to use a default dtype that can represent the values (by applying promotion rules when necessary)." is misleading.
This commit fixes a platform-specific bug detected on Windows. One major issue is that Numpy's default integer type is platform-dependent: it is not the same on Linux/macOS as on Windows. This has been a subject of discussion for a while, and the Numpy community does not seem to agree on making the default integer type the same on all platforms (see e.g. numpy/numpy#9464). Consequently, one must be very careful when manipulating ints on Windows. It turned out that wavelength definitions in mono modes were not cast to a floating-point type and remained int32, leading to incorrect calculations when passed to the air scattering coefficient computation routine (which computes wavelength ** 4 and overflows the int32 value bounds).
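The overflow described in that commit message can be reproduced in a few lines (500 is an arbitrary illustrative wavelength in nm, not a value taken from the project's code):

```python
import numpy as np

# On win64 with NumPy < 2.0, np.array([500]) defaults to int32; forcing
# the dtype here reproduces that behavior on any platform.
wl = np.array([500], dtype=np.int32)

# 500**4 = 62_500_000_000 does not fit in int32, so the result wraps
# around silently to a negative value.
print(wl ** 4)

# The fix from the commit: cast to a floating-point type before computing.
print(wl.astype(np.float64) ** 4)   # [6.25e+10], correct
```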
The test assumes that ray will infer the range as int64. However, it uses numpy to do the inference, and numpy's integer inference is platform-dependent: numpy/numpy#9464
`np.array([1]).dtype` is platform-dependent, presumably because it defaults to `np.int_`. Shouldn't it be `int64`?
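The behavior the issue reports can be observed directly; the result below depends on the platform (and, since NumPy 2.0, on the NumPy version):

```python
import numpy as np

a = np.array([1])
# int64 on 64-bit Linux/macOS; int32 on 64-bit Windows with NumPy < 2.0.
print(a.dtype)
```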