Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very big efficiency improvement #11

Open
zegres opened this issue Apr 15, 2021 · 3 comments
Open

Very big efficiency improvement #11

zegres opened this issue Apr 15, 2021 · 3 comments

Comments

@zegres
Copy link

zegres commented Apr 15, 2021

I have been using geovoronoi for a while.
Recently I had to make voronoi polygons for 30K or more points and geovoronoi got way too slow for my purposes.
I tried PyGEOS, which has a voronoi_polygons function that is actually faster than scipy's, but there is no way that I found to refer back to the original points without doing a spatial join.
So I decided to try my hand at using scipy's scipy.spatial.Voronoi myself (like geovoronoi).
I quickly ran into the same place where geovoronoi gets all its runtime and complexity - dealing with infinite ridges/polygons.
Now, I had a big breakthrough.
I just add four more points to the voronoi far away - this way the infinte polygons are all oustide the bounding geometry.
Then every polygon I care about is finite and I just ignore the last 4.
I get the polygons now in 300ms vs 5 seconds using geovoronoi.

I'll clean up my code and put it here, but the idea is simple.

example

@internaut
Copy link
Member

Well, you may try but I'm a bit skeptical about such "hacks". They often enough only work in certain circumstances and break with edge cases. Make sure to run all tests and examples in the project.

@zegres
Copy link
Author

zegres commented May 6, 2021

Completely agree with you in principal, hacks like this can break with edge cases.
And this method will introduce a small amount of error to the angles of infinite ridges.
Three things in defense of this strategy:

  • The internal ridges and polygons will 100% match the normal approach.
  • The error in the angles of the infinite ridges will scale with the radius of the containing circle and the number of points in that circle. This is mathematically provable - no edge cases.
  • This method is 1-2 orders of magnitude faster than the current approach - it seems many people would benefit form having this - at least as an alternative. (For me this was a case of usable/not usable since I had to run for thousands of points thousands of times. The code I am using now takes 30 minutes vs 6 days)
    Please let me know if you would like me to add this to the project.

@internaut
Copy link
Member

You're of course very welcome to contribute this approach and I think it would be a great benefit if it works!

As I said, the tests in the project must pass and examples must run. If that's the case, I'm willing to consider to include it into the package.

At the moment, I think "phase 2" of the current algorithm is often slow as it tries to "fill up" even very tiny portions of polygons. One could also include an amount of "acceptable error" here, which would also make it faster. But that would be another approach -- maybe you can test that too if you like, I currently don't have the time for this. Would that reduce your computation time for your project?

The main problem I'm seeing with your approach is that it can more easily lead to numerical issues, e.g. when you have one cluster of points, but then one more point that is very far from the cluster. Your approach needs to add the four points on the containing circle with a certain radius as additional parameter (which is also a bit unfortunate since I'd like to use as few parameters as possible). If there's this one point very far away, the circle may have a very large radius which may cause numerical overflows. If you test your approach, please consider this and add some appropriate tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants