Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up StableTopologicalSort #131

Merged
merged 1 commit into from Jun 6, 2023
Merged

Conversation

jonjohnsonjr
Copy link
Contributor

@jonjohnsonjr jonjohnsonjr commented Jun 5, 2023

Fixes #129

The biggest savings come from just calling sort less often. Instead of in the inner-most loop, we call it after processing the whole map of predecessors.

The other portion is just not appending duplicate items to the queue. I do this in a simple way by maintaining a set of queued things (similar to the visited set).

I think the "proper" way to fix this would be to use a sorted list for queue instead of sorting after every pass, but I'm too lazy to implement that, and I don't think it's part of the go stdlib.

For our use case, on my machine, StableTopologicalSort took:

Before: 230.00s
After:    0.03s

So about 8000x faster.

We were doing ~50 billion string comparisons before, and we're doing 242642 now. I think that's still too high, but I'm willing to call that fine.

The biggest savings come from just calling sort less often. Instead of
in the inner-most loop, we call it after processing the whole map of
predecessors.

The other portion is just not appending duplicate items to the queue. I
do this in a simple way by maintaining a set of queued things (similar
to the visited set).

I think the "proper" way to fix this would be to use a sorted list for
queue instead of sorting after every pass, but I'm too lazy to implement
that, and I don't think it's part of the go stdlib.

For our use case, on my machine, StableTopologicalSort took:

Before: 230.00s
After:    0.03s

Signed-off-by: Jon Johnson <jon.johnson@chainguard.dev>
@jonjohnsonjr jonjohnsonjr changed the title StableTopologicalSort: don't eagerly sort queue Speed up StableTopologicalSort Jun 5, 2023
@dominikbraun dominikbraun self-requested a review June 6, 2023 05:40
Copy link
Owner

@dominikbraun dominikbraun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I didn't have the time to investigate the performance issue yet, so any help is really welcome. I've also thought about calling sort.Slice less often, and maintaining a map of already-added items makes sense as well.

I think the "proper" way to fix this would be to use a sorted list for queue instead of sorting after every pass, but I'm too lazy to implement that, and I don't think it's part of the go stdlib.

There even is a priority queue implementation in this library. It maintains a binary heap for sorting the items and is currently being used for Dijkstra's algorithm – maybe something like that would be a good fit.

Anyways, I guess an 8000x speed-up is worthy of a new patch release :-)

@deitch
Copy link

deitch commented Jun 6, 2023

I am really glad @jonjohnsonjr hopped onto this, and I appreciate it as well. He is one superstar, especially in performance related areas (I think he enjoys it too 😄 ).

@dominikbraun dominikbraun merged commit daffba1 into dominikbraun:main Jun 6, 2023
1 check passed
@deitch
Copy link

deitch commented Jun 6, 2023

Hooray! I will run it as well. Will this be v0.22.2 or in v0.23.0?

@dominikbraun
Copy link
Owner

@deitch v0.22.2, I'm going to release it in a few minutes.

@deitch
Copy link

deitch commented Jun 6, 2023

My use case collapsed from 4-6 mins down to 37 seconds. @jonjohnsonjr is 🧙‍♂️

Thanks to both of you.

@dominikbraun
Copy link
Owner

This change has been released in v0.22.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stable sort has major performance degradation
4 participants