Memory efficient HDBSCAN for sparse data possible? #28276
KukumavMozolo started this conversation in General
Hi there,
I realized that HDBSCAN doesn't support any method other than brute force for finding clusters in sparse data.
It therefore computes the full distance matrix, which takes O(n²) memory and quickly results in an out-of-memory error.
So my question: why can't more memory-efficient techniques such as a kd-tree be used for sparse data, and are there ways around this?
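To put rough numbers on the O(n²) memory cost, here is a small back-of-the-envelope sketch. It assumes the distance matrix is stored densely as 8-byte floats (float64), which is my assumption rather than something I have verified in the implementation. The helper name `dense_distance_matrix_bytes` is made up for illustration.

```python
def dense_distance_matrix_bytes(n_samples: int, itemsize: int = 8) -> int:
    """Bytes needed for a dense n x n distance matrix.

    itemsize=8 assumes float64 entries (an assumption, not a verified detail).
    """
    return n_samples * n_samples * itemsize


if __name__ == "__main__":
    # Print the estimated footprint for a few dataset sizes.
    for n in (10_000, 100_000, 1_000_000):
        gib = dense_distance_matrix_bytes(n) / 2**30
        print(f"{n:>9,} samples -> {gib:,.1f} GiB")
```

The footprint depends only on the number of samples, not on how sparse the feature matrix is, so sparsity of the input gives no relief once the full matrix is built.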