
ENH: reduce the overhead of the checks in the multinomial function #20636

Closed
zephyr111 opened this issue Dec 21, 2021 · 8 comments

Comments

@zephyr111
Contributor

zephyr111 commented Dec 21, 2021

Proposed new feature or change:

Hello,

We have discovered a performance regression in the numpy.random.multinomial function for small arrays, apparently caused by commit 0f3dd06, which added new checks. The checks are expensive relative to their purpose (especially for small arrays). This can be reproduced with Python 3.9.9 on an x86-64 machine under both Windows and Linux. Is there a way to reduce the overhead of such checks?

For more information, please read this Stack Overflow post.

Thank you.

@mattip
Member

mattip commented Dec 21, 2021

Saving a click: this microbenchmark

import numpy as np
from timeit import timeit
res = timeit('np.random.multinomial(1, [0.1, 0.2, 0.3, 0.4])', 
             number=1000000, globals=globals())
print(f'numpy version {np.__version__}: {res:.1f} seconds')

gives these results:

numpy version 1.16.6:  1.5 seconds # 10x faster
numpy version 1.18.1: 15.5 seconds
numpy version 1.19.0: 17.4 seconds
numpy version 1.21.4: 15.1 seconds

The commit was part of the enormous PR #13163 to integrate randomgen into numpy.

@mattip
Member

mattip commented Dec 21, 2021

A very good and extensive analysis of the overhead is given in the Stack Overflow post linked above. It points to this line in mtrand.pyx as causing 75% of the overhead:

check_array_constraint(parr, 'pvals', CONS_BOUNDED_0_1)

and to this line as responsible for another 20%:

check_constraint(ni, 'n', CONS_NON_NEGATIVE)

The analysis also points out that calling multinomial 1,000,000 times for a single value in a tight loop is not a good design pattern.
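For callers who do control the loop, the per-call check overhead can be avoided entirely by drawing all the samples in one batched call via the size argument; a minimal sketch of the batched equivalent of the tight loop above:

```python
import numpy as np

# Instead of calling multinomial() a million times in a loop, draw all the
# samples at once: the input checks then run once rather than a million times.
rng = np.random.default_rng(0)
samples = rng.multinomial(1, [0.1, 0.2, 0.3, 0.4], size=1_000_000)
print(samples.shape)  # (1000000, 4)
```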

@mattip
Member

mattip commented Dec 21, 2021

Changing the code to use the new Generator

import numpy as np
from timeit import timeit

rng = np.random.default_rng()
res = timeit('rng.multinomial(1, [0.1, 0.2, 0.3, 0.4])', 
             number=1000000, globals=globals())

produces the same result, since similar code exists in _generator.pyx. The function check_array_constraint contains calls to np.any, np.all, and more that are not nicely optimized by Cython: they go back through the Python C-API to look up np.any and np.all in the numpy module and call them with PyObject*.

We could make check_array_constraint faster by refactoring it to avoid the C-API. I think using a 1-d memoryview and looping explicitly might work, but it would make the code much more unwieldy.
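As a rough sketch of that idea in plain Python (the helper name check_bounded_0_1 is hypothetical; the real code would be a typed Cython loop, not this):

```python
import numpy as np

def check_bounded_0_1(arr, name):
    # Sketch: loop explicitly instead of dispatching to np.any / np.all
    # through the Python C-API. In Cython, the same loop over a typed
    # 1-d memoryview compiles down to a plain C loop.
    for v in arr.ravel():
        if not (0.0 <= v <= 1.0):  # comparison is False for NaN, so NaN fails too
            raise ValueError(f"{name} must be in [0, 1] and free of NaNs")

probs = np.array([0.1, 0.2, 0.3, 0.4])
check_bounded_0_1(probs, "pvals")  # valid input: no exception
```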

@bashtage
Contributor

The check_constraint function can probably be improved, since it operates on scalars:

if cons == CONS_NON_NEGATIVE:
    if not np.isnan(val) and np.signbit(val):
        raise ValueError(name + " < 0")

I suspect the np. calls are not optimized by Cython.

@bashtage
Contributor

bashtage commented Dec 22, 2021

Annotated Cython shows this is exactly the case:

[screenshot: annotated Cython output highlighting the Python C-API calls]

There are a couple of places where the Python API is used (the if and the first elif) that could be avoided fairly easily.

bashtage added a commit to bashtage/numpy that referenced this issue Dec 22, 2021
Remove python api calls from check_constraint

xref numpy#20636
@bashtage
Contributor

Fixing the performance of the array checker in a tidy manner is a lot harder, since the type of the array being checked is not always the same, so it would need separate paths for the different possible types.

@wguocbp

wguocbp commented Mar 14, 2022

Just installed the latest numpy and it works like a charm, thank you all for the fix and the unexpected fast turnaround, really impressed!

@mattip
Member

mattip commented Mar 14, 2022

Closing since we improved what we can. @zephyr111 please open a new issue or reopen this one if you have ideas for further performance improvements.

@mattip mattip closed this as completed Mar 14, 2022