numpy.random.binomial(n) is O(n) with numba instead of O(1) #9534

Open
nh2 opened this issue Apr 21, 2024 · 1 comment
Labels
numpy performance - run time Performance issue occurring at run time.

Comments

@nh2

nh2 commented Apr 21, 2024

I found that binomial(n) gets linearly slower as n increases.

This makes no sense: binomial() should be constant-time, independent of n.

Minimal repro: without @njit, this finishes in 1 second on my laptop; with @njit, it never finishes, and each progress line takes longer to print than the one before.

import numpy
from numba import njit

@njit  # with this on, `binomial(n)` gets slower the bigger `n` gets
def main():
  for n in range(1000000):
    if n % 1000 == 0:
      print(n)
    numpy.random.binomial(n*n//100, 0.2)  # using n*n to make the effect even more pronounced


if __name__ == '__main__':
  main()

numba 0.59.1 on NixOS.

@stuartarchibald
Contributor

Thanks for the report, the above reproduces locally. I think the issue is that in NumPy the algorithm branches to use the BTPE algorithm under conditions where n * p is larger (there's a note about it in the Numba implementation):

(Numpy uses BTPE for n*p >= 30, though)

Numba's numpy.random.Generator support for binomial uses the BTPE algorithm

def random_binomial_btpe(bitgen, n, p):

and also has the branching implemented:

def random_binomial(bitgen, n, p):
    if ((n == 0) or (p == 0.0)):
        return 0
    if (p <= 0.5):
        if (p * n <= 30.0):
            return random_binomial_inversion(bitgen, n, p)
        else:
            return random_binomial_btpe(bitgen, n, p)
    else:
        q = 1.0 - p
        if (q * n <= 30.0):
            return n - random_binomial_inversion(bitgen, n, q)
        else:
            return n - random_binomial_btpe(bitgen, n, q)
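
To illustrate why the branch matters, here is a rough sketch of an inversion-style binomial sampler. It is a simplified illustration only, not the code from Numba or NumPy, and the name `binomial_inversion_sketch` is made up for this example. The loop runs once per unit of the returned value, so the expected number of iterations is about n * p, which is why inversion is normally reserved for small n * p.

```python
import random


def binomial_inversion_sketch(rng, n, p):
    """Draw one Binomial(n, p) sample by walking the CDF upward from 0.

    Hypothetical helper for illustration; assumes 0 < p <= 0.5 and a
    small n * p, because q ** n underflows to 0.0 once n gets large.
    """
    q = 1.0 - p
    px = q ** n              # P(X = 0)
    u = rng.random()         # uniform draw in [0, 1)
    x = 0
    # Each pass moves to the next outcome, so the loop count equals the
    # value that is eventually returned (about n * p on average).
    while u > px and x < n:
        x += 1
        u -= px
        px *= (n - x + 1) * p / (x * q)   # P(X = x) from P(X = x - 1)
    return x


rng = random.Random(0)
for n in (50, 100, 150):
    draws = [binomial_inversion_sketch(rng, n, 0.2) for _ in range(2000)]
    # The average draw (and thus the average loop count) grows with n.
    print(n, sum(draws) / len(draws))
```

BTPE, by contrast, is a rejection method whose expected cost does not grow with n in this way, which matches the constant-time behaviour seen without @njit.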

Perhaps one option would be to switch to using numpy.random.Generator().binomial(...) for now (note that to do this you'd have to pass a Generator instance into the Numba compiled function though)?
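
A minimal sketch of that workaround, assuming a Numba version that supports `Generator.binomial` in compiled code. The function name `draw_all`, the seed, and the loop bound are illustrative choices, and the `ImportError` fallback is only there so the snippet still runs as plain NumPy where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, run as plain Python/NumPy.
    def njit(func):
        return func


@njit
def draw_all(rng, count):
    # The Generator is created outside and passed in as an argument.
    total = 0
    for n in range(count):
        total += rng.binomial(n * n // 100, 0.2)
    return total


rng = np.random.default_rng(12345)   # seed chosen arbitrarily
total = draw_all(rng, 10_000)
print(total)
```

The Generator has to be constructed in regular Python and handed to the compiled function, as noted above.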

Fixing this probably isn't too hard; it just requires updating Numba's numpy.random.binomial implementation to follow the one in NumPy more closely.

@stuartarchibald stuartarchibald added numpy performance - run time Performance issue occurring at run time. labels Apr 22, 2024