
Speed up the "slow" (not compiled with numba.jit) _fisher_jenks_means by 4 to 5 times. Saves 3 minutes classifying 8000 data points. #201

Open
wants to merge 18 commits into base: main
Conversation

JamesParrott

Hi everyone,

Thanks for a great project, and for all your work. I really like mapclassify. This PR is for information only: it highlights an interesting way to improve performance for users who are not using Numba. Using Numba is still preferable, since the jitted version is blazingly fast. Given how many other dependencies are already required, I don't know why Numba is optional, even if llvmlite is very large; I would include it by default.

Otherwise, if there's a willingness to relax and generalise the input types, and the instance variables created by MapClassifier (to lists of numbers, instead of only numpy arrays), which would technically be a breaking change, then this PR is a good UX improvement for casual mapclassify users. I don't know whether something else that takes even longer dominates their critical path, whether typical non-Numba mapclassify users have much more powerful machines than mine, or whether they all classify much larger data sets than I've tested, but I myself would always prefer a library that gets the answer 3 minutes faster.

From running mapclassify\tests\time_fisher_jenkss.py:

[screenshot: MapClassify_no_numpy_pure_functions_comparison]

[screenshot: MapClassify_no_numpy_end_user_comparison]

Background:
I've made a fork to target IronPython in Grasshopper, where Numpy and Numba are not directly available. That got me wondering how much performance its users would be missing out on. Compared to Numba users, they might have to wait 30 seconds longer for 3000 data points. But compared to non-Numba users, the answer, to my surprise, is: not a lot. In fact, my fork, which doesn't use Numpy at all, is significantly faster than the slow version of _fisher_jenks_means without Numba (the one that someone saw fit to raise a warning about in FisherJenks).

Basically, creating and iterating over numpy arrays is not free - there's a real overhead. That overhead is well worth it for large matrix multiplications and other vectorisable calculations in scientific computing on modern hardware. But on my laptop at least, running Python 3.11, for an implementation of an algorithm like the original Fisher-Jenks, which must iterate over every element in sequence anyway, numpy arrays are slower than plain Python iteration over nested lists.
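To illustrate the point above, here is a minimal, self-contained sketch (not code from this PR) that times the same element-by-element loop over a nested list and over an equivalent ndarray. The sizes and function names are illustrative only.

```python
import time
import numpy as np

n = 500
nested_list = [[float(i * n + j) for j in range(n)] for i in range(n)]
arr = np.array(nested_list)

def sum_elements(matrix):
    # Iterate element by element, as the Fisher-Jenks DP loops must.
    total = 0.0
    for row in matrix:
        for value in row:
            total += value
    return total

t0 = time.perf_counter()
list_total = sum_elements(nested_list)
t_list = time.perf_counter() - t0

t0 = time.perf_counter()
arr_total = sum_elements(arr)
t_arr = time.perf_counter() - t0

print(f"nested list: {t_list:.4f}s  ndarray: {t_arr:.4f}s")
```

On a typical CPython build the ndarray loop tends to be slower, because each element access boxes the value into a numpy scalar object, whereas the nested list yields existing Python floats directly.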

I don't know exactly how Numpy implements indexing in C, but when I tried to implement __getitem__ on tuples in pure Python, there was an even larger performance hit. So the only changes I've made in my alternative function are: switching [i, j] to the admittedly uglier [i][j]; generalising a numpy-specific type reference; and passing in a class for numpy (np) that defines int, float, etc. Hopefully the original code in _fisher_jenks_means is otherwise unaltered.
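The backend-swapping idea can be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual implementation: the class name, methods, and the toy function are all made up to show why mat[i][j] (unlike mat[i, j]) works for both nested lists and ndarrays.

```python
import numpy as np

# Hypothetical shim: a minimal namespace standing in for numpy, so the
# same function body can run with or without numpy installed.
class PurePythonBackend:
    float64 = float
    int64 = int

    @staticmethod
    def zeros(shape):
        rows, cols = shape
        return [[0.0] * cols for _ in range(rows)]

def fill_diagonal_like(backend, n):
    # mat[i][j] indexing works for both nested lists and numpy
    # ndarrays, whereas mat[i, j] is numpy-only.
    mat = backend.zeros((n, n))
    for i in range(n):
        mat[i][i] = backend.float64(1)
    return mat

pure = fill_diagonal_like(PurePythonBackend, 3)   # nested lists
with_np = fill_diagonal_like(np, 3)               # ndarray
```

Passing the backend in as a parameter keeps the algorithm body identical in both code paths, which is the property the PR description aims for.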

python time_fisher_jenkss.py
Run times of the proposed function vs the original (excluding MapClassifier overhead)

Time: 0.009 seconds, data points: 100  _fjm_without_numpy, number_tests=1
Time: 0.081 seconds, data points: 300  _fjm_without_numpy, number_tests=1
Time: 0.911 seconds, data points: 1000  _fjm_without_numpy, number_tests=1
Time: 7.298 seconds, data points: 2800  _fjm_without_numpy, number_tests=1
Time: 63.026 seconds, data points: 8000  _fjm_without_numpy, number_tests=1
Time: 0.044 seconds, data points: 100  _fisher_jenks_means, number_tests=1
Time: 0.381 seconds, data points: 300  _fisher_jenks_means, number_tests=1
Time: 4.209 seconds, data points: 1000  _fisher_jenks_means, number_tests=1
Time: 33.791 seconds, data points: 2800  _fisher_jenks_means, number_tests=1
Time: 315.014 seconds, data points: 8000  _fisher_jenks_means, number_tests=1
Run times for end user, of the proposed code vs the original (inc MapClassifier overhead)

\mapclassif-Iron\mapclassify\tests\time_fisher_jenkss.py:56: UserWarning: Numba not installed. Using a less slow, pure python version.
  mapclassify.classify(y = data, scheme = 'fisherjenks', k=k)
Time: 0.042 seconds, data points: 100 without Numpy, much less slow, pure python code, number_tests=1
Time: 0.154 seconds, data points: 300 without Numpy, much less slow, pure python code, number_tests=1
Time: 1.559 seconds, data points: 1000 without Numpy, much less slow, pure python code, number_tests=1
Time: 12.467 seconds, data points: 2800 without Numpy, much less slow, pure python code, number_tests=1
Time: 105.765 seconds, data points: 8000 without Numpy, much less slow, pure python code, number_tests=1
Time: 0.046 seconds, data points: 100 with Numpy, existing "slow pure python" code, number_tests=1
Time: 0.385 seconds, data points: 300 with Numpy, existing "slow pure python" code, number_tests=1
Time: 4.263 seconds, data points: 1000 with Numpy, existing "slow pure python" code, number_tests=1
Time: 34.077 seconds, data points: 2800 with Numpy, existing "slow pure python" code, number_tests=1
Time: 296.244 seconds, data points: 8000 with Numpy, existing "slow pure python" code, number_tests=1
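For reference, a wall-clock timing harness like the one producing the numbers above can be sketched as below. The helper and the stand-in classifier are hypothetical; the real script is mapclassify\tests\time_fisher_jenkss.py.

```python
import random
import time

def time_call(fn, *args, repeats=1):
    """Time fn(*args) over `repeats` runs; return the best wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

# Hypothetical stand-in for the classifier under test.
def classify_stub(values, k=5):
    step = max(1, len(values) // k)
    return sorted(values)[::step][:k]

data = [random.random() for _ in range(1000)]
elapsed = time_call(classify_stub, data, repeats=3)
print(f"Time: {elapsed:.3f} seconds, data points: {len(data)}")
```

Taking the best of several repeats (rather than a single run) reduces noise from other processes, though the figures quoted above use number_tests=1.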

Fixes #118 if Numba is unavailable

@knaaptime
Member

thanks for this :) we'll take a look. (funny timing, actually, because i have some brand new materials that use jenks (for the first time in ages), albeit in a python 3.12 environment where numba isn't yet available, and the slowdown is very noticeable on a medium-sized dataset)

@JamesParrott
Author

thanks for this :) we'll take a look.

You're welcome. I didn't realise there were more tests than I spotted at first glance (I've delved into those and fixed them, by the way). Commit 0b48854 makes self.bins a numpy array again, so this PR no longer changes the return type or breaks anyone's code.
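The compatibility fix mentioned above amounts to something like the following sketch. The helper function here is invented for illustration; the point is only that an internal pure-Python routine can return plain lists while the classifier still exposes a numpy array, preserving the public API.

```python
import numpy as np

# Hypothetical placeholder for a pure-Python bin computation; the real
# Fisher-Jenks logic is much more involved.
def pure_python_bins(values, k):
    step = max(1, len(values) // k)
    return sorted(values)[step - 1::step][:k]

values = [4.0, 1.0, 3.0, 2.0, 5.0, 6.0]
# Convert back to an ndarray at the boundary, so callers who rely on
# numpy methods (e.g. bins.max()) are unaffected.
bins = np.array(pure_python_bins(values, k=3))
```

Converting once at the boundary costs a single array construction, which is negligible next to the O(n^2 * k) dynamic-programming loops.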

@jGaboardi
Member

@JamesParrott make sure the files you are pushing to this PR are formatted and linted properly. Currently, pre-commit is failing due to this.

@JamesParrott
Author

JamesParrott commented Dec 14, 2023

@JamesParrott make sure the files you are pushing to this PR are formatted and linted properly. Currently, pre-commit is failing due to this.

As I said above, this PR is for info only. But sure, no problem. I'll do it once the currently running actions finish, the tests pass, and I have time.

Is that why the actions runners are taking so long to complete, or is it just the Windows and MacOS ones?

@jGaboardi
Member

just the Windows and MacOS ones

exactly
