Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorise matrix similarity functions #638

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

tornikeo
Copy link
Contributor

@tornikeo tornikeo commented May 3, 2024

Addressing #637. I rewrote the njitted functions using plain matmul operations, using numpy. This is around x50 times faster (2s for 1k pairwise 1k peaks, vs 40ms).

@niekdejonge
Copy link
Collaborator

@florian-huber Seems like a great change to me, but I am less familiar with this part of the code base. Could you check and merge this PR?

@niekdejonge
Copy link
Collaborator

@tornikeo please feel free to add yourself to the citation.cff file, btw!

@tornikeo
Copy link
Contributor Author

Oh, wait a sec... Once I remove pytorch matmul from the picture, numpy seems only around 2x faster, not 50x times faster. This is all on CPU. It's weird to have such a difference between cpu-pytorch and numpy.

Still investigating this. Just a 2x acceleration might not be worth more obscure code.

@niekdejonge
Copy link
Collaborator

Hmm yes, surprising that the difference is that large. Still 2x acceleration can be worth it, I will leave it up to @florian-huber. Let us know what comes out of the investigation :)

@tornikeo
Copy link
Contributor Author

Found the culprit. It's because numpy only uses BLAS for floaty types. There's no integer support for BLAS matmul. Still numpy is slower than torch. With this, we arrive to around 10x speedup.

I've also expanded the comparison notebook to compare all three approaches (numpy, torch and njit).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants