Hierarchical runs too slow #140

Open
cheesebear opened this issue Dec 7, 2021 · 1 comment

Comments

@cheesebear

cheesebear commented Dec 7, 2021

Hierarchical seems to call np.min too many times. I suggest applying stack() to dists and sorting the resulting dataframe once up front, to avoid running min repeatedly.
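
A minimal sketch of one way to read this suggestion, using NumPy's argsort on the upper triangle instead of a pandas stack(). It assumes dists is a square matrix whose upper triangle holds the pairwise distances, and that merging the closest remaining pair first (single-linkage style) is acceptable; the merge rule actually used by Hierarchical may differ. The names sorted_pairs and single_linkage_merges are hypothetical and not part of the library.

```python
import numpy as np


def sorted_pairs(dists):
    """Return (i, j, distance) for all i < j, ordered by ascending distance."""
    dists = np.asarray(dists, dtype=float)
    # Only the upper triangle is read, so the lower triangle may hold anything.
    i_idx, j_idx = np.triu_indices(dists.shape[0], k=1)
    values = dists[i_idx, j_idx]
    # A single sort replaces the repeated np.min scans over the matrix.
    order = np.argsort(values, kind="stable")
    return [(int(i_idx[k]), int(j_idx[k]), float(values[k])) for k in order]


def single_linkage_merges(dists):
    """Walk the sorted pairs once and merge clusters with a union-find."""
    n = np.asarray(dists).shape[0]
    parent = list(range(n))

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merges = []
    for i, j, d in sorted_pairs(dists):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # skip pairs already in the same cluster
            parent[root_j] = root_i
            merges.append((i, j, d))
            if len(merges) == n - 1:  # everything is in one cluster
                break
    return merges


# Hypothetical 4x4 distance matrix, for illustration only.
dists = np.array([
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
])
merges = single_linkage_merges(dists)
```

The sort costs O(n² log n) once, after which each merge is close to constant time. This only works as written for single linkage; other linkage rules change the distances after each merge, so the sorted order would need to be updated.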

@wannesm
Owner

wannesm commented Dec 20, 2021

You are correct that this method can be further optimized and that the use of min is not ideal. Do you have a worked-out solution? I assume you mean argsort rather than sort?

For fast clustering, I recommend the SciPy and PyClustering implementations. They are already wrapped in clustering.hierarchical.LinkageTree and clustering.medoids.Medoids, respectively. By looking at the code in those classes, one can see how to use almost any other clustering technique from those toolboxes.
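
For reference, a rough sketch of calling SciPy's hierarchical clustering directly on a precomputed distance matrix. It does not go through the LinkageTree wrapper, whose exact arguments are not shown here. It assumes a full symmetric matrix with a zero diagonal; the example matrix and the choice of single linkage are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical symmetric distance matrix with a zero diagonal.
dists = np.array([
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
])

# linkage() expects the condensed (upper-triangle) form of the matrix.
condensed = squareform(dists)

# Each row of Z is one merge: [cluster_a, cluster_b, distance, new_cluster_size].
Z = linkage(condensed, method="single")

# Cut the tree so that at most two flat clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
```

If the distance matrix only has its upper triangle filled in, it would need to be made symmetric before squareform accepts it.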
