
How to set float32 as default #6860

Closed
fayeshine opened this issue Dec 19, 2015 · 21 comments

@fayeshine

I use cuBLAS with NumPy; cuBLAS runs very fast on float32, about 10 times faster than on the CPU.
However, I need to set dtype=float32 by hand every time, which is tedious. random.rand() doesn't even support creating a float32 array.
Is there any way to set the default precision to float32 in NumPy?

@njsmith
Member

njsmith commented Dec 19, 2015

There isn't, sorry. And I'm afraid we're unlikely to add such a thing because it would necessarily have to be global state, and this kind of global state tends to create all kinds of problems (e.g. people will try changing the default inside a library, and then unrelated code that happens to use this library will start seeing weird problems when the unrelated code tries to use numpy).

You could make your own utility functions and use those, e.g.:

def array(*args, **kwargs):
    kwargs.setdefault("dtype", np.float32)
    return np.array(*args, **kwargs)
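In the same spirit, and since the original question mentions that random.rand() has no dtype parameter, a wrapper can cast the result instead. A minimal sketch (the names array32 and rand32 are just examples, not part of NumPy):

```python
import numpy as np

def array32(*args, **kwargs):
    # default to float32 unless the caller asks for something else
    kwargs.setdefault("dtype", np.float32)
    return np.array(*args, **kwargs)

def rand32(*shape):
    # np.random.rand has no dtype argument, so cast its float64 result
    return np.random.rand(*shape).astype(np.float32)

a = array32([1, 2, 3])
r = rand32(2, 3)
print(a.dtype, r.dtype)  # float32 float32
```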

@njsmith njsmith closed this as completed Dec 19, 2015
@JulesGM

JulesGM commented Mar 3, 2018

This would be quite useful.

@javidcf
Contributor

javidcf commented Sep 20, 2018

This is old, but it would still be useful (you can find a handful of questions on Stack Overflow asking about it). May I add, global library state is not really the only option for this: you could have an environment variable, a configuration file, or even just a context manager. For example, Theano offers both a configuration file and an environment variable. I imagine you could have a default float size (like Theano's floatX) and maybe a default integer size (and even a default complex size, if you want to push it?). Also, it is not nearly as significant, but there is already at least some global state in NumPy, e.g. set_printoptions (which you could in principle mess up from a library, or from different threads); maybe having a uniform way of configuring the library is not such a bad idea.

I'm not saying it is straightforward, as it probably affects a great portion of the code, and there are surely a lot of corner cases, but I think it may be worth considering, even if only as a potential roadmap item.

@JulesGM

JulesGM commented Sep 20, 2018

Especially since, with deep learning (TensorFlow, PyTorch, etc.), people are manipulating arrays of precision smaller than 64 bits pretty much 100% of the time (mainly 32 bits, but mixed precision and quantized models are gaining a lot of ground, with official support from all top vendors).

@asm95

asm95 commented Oct 19, 2018

I have exactly the same problem. I'm having trouble with very large matrices in a very long module that makes many calls to np.array. I can't change all the calls to specify the optional argument (dtype=np.float32). I just want to tell NumPy to use float32 instead of float64. The OS is swapping now. Please help.

@soulslicer

I hate that I have to do this every time.

@mattip
Member

mattip commented Jan 19, 2019

@soulslicer this issue is closed; we will not be changing this in the foreseeable future. Perhaps monkey-patching np.array to add a default dtype would solve your problem. You can arrange for this to be called at Python startup via PYTHONSTARTUP for interactive work, or put it in a file and import it at project startup.

import numpy as np
_oldarray = np.array
def array32(*args, **kwargs):
    # default to float32 unless the caller specifies a dtype
    kwargs.setdefault('dtype', np.float32)
    return _oldarray(*args, **kwargs)
np.array = array32

@godaygo
Contributor

godaygo commented Jan 19, 2019

heh, another way ;)

from functools import partial
import numpy as np
array32 = partial(np.array, dtype=np.float32)
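The same one-liner extends naturally to other array constructors that take a dtype argument. A sketch (the *32 names are just illustrative):

```python
from functools import partial
import numpy as np

# each wrapper behaves like the original, with dtype pre-bound to float32
array32 = partial(np.array, dtype=np.float32)
zeros32 = partial(np.zeros, dtype=np.float32)
ones32 = partial(np.ones, dtype=np.float32)

x = array32([1.5, 2.5])
z = zeros32((2, 2))
print(x.dtype, z.dtype)  # float32 float32
```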

@JulesGM

JulesGM commented Jan 19, 2019

FYI, with deep neural networks becoming so huge, more and more people will be after this feature.

@JadBatmobile

Lol @ numpy

@JadBatmobile

hey i want each number occupying 38 gigs on your computer

@JulesGM

JulesGM commented Feb 22, 2019

That's not what's at stake here @JadBatmobile

@adeak
Contributor

adeak commented Feb 24, 2019

njsmith explained in clear terms 3 years ago why this "feature" would very easily (read: in one line of code) lead to a lot of latent and non-local bugs. Such a "feature" could only ever be used responsibly, and I don't think implementing features that need to be used responsibly is a good idea. If you know you're using it and are going to use it responsibly: choose one of the several suggestions mentioned in this thread (and even more elsewhere), and make your own code explicitly behave this way.

@dankal444

@adeak I am not sure if this is a good idea, but maybe some context manager would be a good compromise?

Pseudocode:

@contextmanager
def default_dtype(dtype):
    # read the current default dtype and switch to the one provided
    original_dtype = read_current_default_dtype()
    change_default_dtype(dtype)
    try:
        yield
    finally:
        # restore the original default even if an error occurs
        change_default_dtype(original_dtype)

Usage:

with np.default_dtype(np.float32):
    # do float32 stuff
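For comparison, the pseudocode above can be approximated today by temporarily monkey-patching np.array rather than relying on a global default that NumPy does not expose. A sketch only, subject to all the caveats raised in this thread (np.default_dtype does not exist; default_dtype here is a user-defined name):

```python
from contextlib import contextmanager
import numpy as np

@contextmanager
def default_dtype(dtype):
    original = np.array
    def patched(*args, **kwargs):
        # apply the requested dtype unless the caller passes one explicitly
        kwargs.setdefault("dtype", dtype)
        return original(*args, **kwargs)
    np.array = patched
    try:
        yield
    finally:
        np.array = original  # always restore, even if an error occurs

with default_dtype(np.float32):
    inside = np.array([1.0, 2.0])
outside = np.array([1.0, 2.0])
print(inside.dtype, outside.dtype)  # float32 float64
```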

@adeak
Contributor

adeak commented Feb 25, 2019

@dankal444 if I understand correctly, nothing would stop people from being lazy and calling the ominous change_default_dtype(dtype) manually, with no guarantee of cleanup.

@dankal444

@adeak I thought that this "ominous" method could be hidden from the user's perspective, with only the context manager made available.

@adeak
Contributor

adeak commented Feb 25, 2019

I suspect that the people demanding this feature wouldn't be happy with a context manager; that would be even more cumbersome than a single custom configuration step to be done once. People could just start using the non-public function that has global state to get it over with, defeating the purpose.

@seberg
Member

seberg commented Feb 25, 2019

I do not think context managers help much. You will always run into the issue that you may call downstream functions that use the larger precision for a good reason, and you just break them. Heck, you may even cause a segfault, because C-interfacing code hardly has a reason to double-check whether a freshly created array has the wrong datatype.

@zhezh

zhezh commented Mar 1, 2019

I found that at a low level there is a NPY_DEFAULT_TYPE; maybe NumPy could provide a function to modify this variable's value to float32?

It is really a pain to declare the np.float32 dtype when creating a new array.

https://docs.scipy.org/doc/numpy-1.15.1/reference/c-api.dtype.html?highlight=default_type#c.NPY_DEFAULT_TYPE

@lu4

lu4 commented Apr 27, 2019

[Bob] How can I create a random float32 array that consumes 90% of available RAM?
[Numpy] Just double the RAM...

Everyone has an opinion these days that lurks for expression; mine is that this is probably one of the most "insane and ruthless"* design decisions I've ever seen, and it deserves a rightful nomination to my private hall of fame.

*"insane and ruthless" — an idiomatic expression originating from Russian

[Aphorism 1] If it's limiting, then it doesn't matter how slim your architecture is.
[Aphorism 2] In many cases "pythonic" is just a label, the last one that covers shame.

@numpy numpy locked and limited conversation to collaborators Apr 27, 2019
@njsmith
Member

njsmith commented Apr 27, 2019

Again: the reason for not implementing this is not that we like wasting your memory, it's that it will break all kinds of stuff and cause you to silently get wrong answers. The fact that so many people think it's an "obvious" thing to do confirms that most people don't understand the full consequences here, and wouldn't be prepared to judge when this feature is safe to use and when it isn't.

I hear the pain you all are experiencing; that's totally valid, and we would like to help if we can. But to do that someone has to come up with a plan that doesn't break everything.

Locking this issue since it's clearly a magnet for unproductive comments. If you have a new idea, please open a new issue.
