ENH: Add Clusterer class + cluster module #103

NickleDave · 2024-01-15T23:25:11Z

Clusterers gonna cluster.

People will want this: clustering is a feature of e.g. pykanto, songexplorer, koe, and voice, and it frequently appears in papers, see e.g. https://royalsocietypublishing.org/doi/full/10.1098/rsos.231713.

So we should add:

- a cluster module where specific algorithms will live
- a Clusterer class that makes it easy to parallelize and makes code readable
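
To make the proposal concrete, here is a rough sketch of what such a class could look like. Everything in it is an assumption for discussion, not an existing VocalPy API: the `callback`/`params` constructor arguments, the `cluster` and `cluster_many` method names, and the use of a thread pool for parallelizing. The `threshold_labels` function is only a toy stand-in for a real algorithm that would live in the cluster module.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Optional, Sequence


def threshold_labels(values: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Toy stand-in for a clustering function: label each value 0 or 1."""
    return [int(v > threshold) for v in values]


class Clusterer:
    """Hypothetical sketch: pairs a clustering function with its parameters."""

    def __init__(self, callback: Callable[..., Any],
                 params: Optional[Dict[str, Any]] = None) -> None:
        if not callable(callback):
            raise TypeError(f"callback must be callable, got {type(callback)}")
        self.callback = callback
        # Copy so that later changes to the caller's dict do not affect us
        self.params = dict(params or {})

    def cluster(self, features: Any) -> Any:
        """Cluster one set of features with the stored parameters."""
        return self.callback(features, **self.params)

    def cluster_many(self, many_features: Sequence[Any],
                     parallelize: bool = False) -> List[Any]:
        """Cluster several sets of features, optionally in parallel.

        Results are returned in the same order as the inputs.
        """
        if not parallelize:
            return [self.cluster(features) for features in many_features]
        with ThreadPoolExecutor() as executor:
            # executor.map preserves input order
            return list(executor.map(self.cluster, many_features))
```

Usage would then read as `Clusterer(threshold_labels, params={"threshold": 0.5}).cluster_many(batches, parallelize=True)`, with no hidden state beyond the function and its parameters.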

The goal should be minimal magic and replication of existing algorithms where possible.

My impression is that a common way to cluster is to use UMAP and HDBSCAN in some combination, as in pykanto and as in the rook paper above.
So an initial implementation would add dependencies on UMAP + HDBSCAN, and allow direct access to their parameters as in nilomr/pykanto#30.
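
A minimal sketch of that combination, assuming the `umap-learn` and `hdbscan` packages: reduce the features with UMAP, then run HDBSCAN on the embedding. The function name `embed_and_cluster` and its keyword-dict arguments are hypothetical; the point is only that parameters pass straight through to `umap.UMAP` and `hdbscan.HDBSCAN` with no defaults overridden.

```python
from typing import Any, Dict, Optional


def embed_and_cluster(features,
                      umap_kwargs: Optional[Dict[str, Any]] = None,
                      hdbscan_kwargs: Optional[Dict[str, Any]] = None):
    """Hypothetical helper: UMAP embedding followed by HDBSCAN clustering.

    ``features`` is an array-like of shape (n_samples, n_features).
    Returns the embedding and one integer label per sample.
    """
    # Imported lazily so the optional dependencies are only required
    # when this function is actually called
    import hdbscan
    import umap

    # Pass parameters through untouched: direct access, minimal magic
    reducer = umap.UMAP(**(umap_kwargs or {}))
    embedding = reducer.fit_transform(features)

    clusterer = hdbscan.HDBSCAN(**(hdbscan_kwargs or {}))
    # HDBSCAN labels points it considers noise as -1
    labels = clusterer.fit_predict(embedding)
    return embedding, labels
```

A call might look like `embed_and_cluster(features, umap_kwargs={"n_neighbors": 15, "min_dist": 0.0}, hdbscan_kwargs={"min_cluster_size": 10})`; those values are placeholders, not recommendations.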

But I'm worried about the number of dependencies we have already, and would rather limit dependencies to core scientific Python if possible. Long term we might want to vendor these, e.g. in a vocalpy.cluster._vendor sub-sub-package.

Also we (I) need to actually understand all the parameters involved (see nilomr/pykanto#32 (comment)). We shouldn't add this if we can't provide tutorials with suggestions on what parameters to use for different datasets.
We want something like https://github.com/marathomas/tutorial_repo.

And we will want to clearly document assumptions, and point to any work that discusses caveats.


NickleDave · 2024-01-17T15:37:34Z

Note: anything semi-automated, i.e. that requires a GUI, is out of scope for VocalPy.

NickleDave added the ENH: enhancement New feature or request label Jan 15, 2024
