Why is `nanmin` so slow relative to numpy? #256
Comments
ping numba/numba#2196?
Very possibly, nice find. Though the fact that bottleneck performs similarly to us means it's a bit less likely to be an LLVM issue...
My guess is that NumPy is better at using vectorized CPU instructions for some reason. No idea why Numba can't do this, though...
Well if you want to go down a rabbit hole: https://tbetcke.github.io/hpc_lecture_notes/simd.html ;)
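To make the SIMD hypothesis concrete, here is a sketch of the kind of branchy scalar reduction a naive `nanmin` kernel boils down to. This is illustrative only, not numbagg's actual kernel: the data-dependent NaN branch in the loop body is exactly the pattern that tends to block a compiler from emitting packed SIMD min instructions.

```python
import numpy as np

def branchy_nanmin(a):
    """Hypothetical scalar nanmin with a NaN check per element.

    The per-element branch makes it hard for a compiler to
    auto-vectorize, unlike NumPy's ufunc-based kernel.
    """
    out = np.inf
    for x in a:
        # Data-dependent branch: skip NaNs, otherwise take the min.
        if not np.isnan(x) and x < out:
            out = x
    return out

a = np.array([3.0, np.nan, 1.5, 2.0])
assert branchy_nanmin(a) == np.nanmin(a)  # both give 1.5
```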
Nice! V interesting. So hopefully this will improve in future numba versions...
I recently added numpy to the benchmarks, and numbagg overall does quite well. But it does very badly on `nanmin` & `nanmax` — about 85% slower. Bottleneck performs similarly to numbagg. Even when I strip out anything that's non-essential to calculating a minimum, it doesn't help performance:
`pytest -vv --benchmark-enable -k 'benchmark_main and [nanmin and shape1' --run-nightly`
Check out the `mean` column¹.

Without solving this, we can't really recommend numbagg as a replacement for aggregation functions, as well as grouping & moving window functions. I'm not even sure we should have these functions in numbagg — probably we should at least demote them out of the top namespace.
@shoyer if you happen to know off-hand given your experience here, let me know. No need to reply if not...
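As a point of comparison (my assumption about what NumPy is doing, not verified against its internals): `nanmin` can be expressed as a reduction with `np.fmin`, which returns the non-NaN operand when one input is NaN. That formulation skips NaNs without a per-element branch at the Python level, and NumPy's ufunc machinery can run it through its vectorized min kernels:

```python
import numpy as np

# Sketch: nanmin expressed as a fmin reduction. np.fmin prefers the
# non-NaN operand, so the reduction ignores NaNs and matches nanmin
# for any array that contains at least one non-NaN value.
a = np.array([3.0, np.nan, 1.5, 2.0])
assert np.fmin.reduce(a) == np.nanmin(a)  # both give 1.5
```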
Footnotes
1. Part of me is wondering whether it's even doing the same operation. But we also have tests of correctness relative to `numpy`, so it is returning the same result...