ENH: Add Clusterer class + cluster module #103

Open
2 tasks
NickleDave opened this issue Jan 15, 2024 · 1 comment
Labels
ENH: enhancement New feature or request

Comments

@NickleDave
Contributor

NickleDave commented Jan 15, 2024

Clusterers gonna cluster.

People will want this; it's a feature of e.g. pykanto, songexplorer, koe, voice and frequently appears in papers, see e.g. https://royalsocietypublishing.org/doi/full/10.1098/rsos.231713.

So we should add

  • a cluster module where specific algorithms will live
  • a Clusterer class that makes it easy to parallelize and makes code readable

The goal should be minimal magic and replication of existing algorithms where possible.
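
To make the second bullet concrete, here is a rough sketch of what such a class could look like. Everything in it is an assumption for illustration, not an existing VocalPy API: the names `Clusterer`, `callback`, `params`, `cluster`, and `cluster_batch` are hypothetical, and the thread pool is just one stdlib way to parallelize. The "minimal magic" part is that the class only stores a callable and its keyword arguments, then applies them.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Optional, Sequence


class Clusterer:
    """Hypothetical sketch: pairs a clustering callable with its parameters."""

    def __init__(self, callback: Callable[..., Any],
                 params: Optional[Dict[str, Any]] = None):
        if not callable(callback):
            raise TypeError(f"callback must be callable, got {type(callback)}")
        self.callback = callback
        # Copy so later changes to the caller's dict don't leak in.
        self.params = dict(params or {})

    def cluster(self, features: Any) -> Any:
        """Cluster a single set of features."""
        return self.callback(features, **self.params)

    def cluster_batch(self, batch: Sequence[Any], parallelize: bool = True,
                      max_workers: Optional[int] = None) -> list:
        """Cluster each item in ``batch``; results keep the input order."""
        if not parallelize:
            return [self.cluster(features) for features in batch]
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # Executor.map yields results in the order of its inputs.
            return list(pool.map(self.cluster, batch))


def threshold_labels(values, threshold=0.0):
    """Toy stand-in for a real algorithm: label 1 if above threshold, else 0."""
    return [int(v > threshold) for v in values]


clusterer = Clusterer(threshold_labels, params={"threshold": 2})
labels_per_file = clusterer.cluster_batch([[1, 3], [5, 0]])
```

Threads only help when the callback releases the GIL (as much compiled numerical code does); a process pool or another scheduler could be swapped in behind the same `parallelize` flag without changing how the class is called.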

My impression is that a common way to cluster is to use UMAP and HDBSCAN in some combination, as in pykanto and as in the rook paper above.
So an initial implementation would add dependencies on UMAP + HDBSCAN, and allow direct access to their parameters as in nilomr/pykanto#30.
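
A minimal sketch of that combination, assuming the `umap-learn` and `hdbscan` packages: reduce dimensionality with UMAP, then run HDBSCAN on the embedding. The function names `embed_then_cluster` and `make_umap_hdbscan` are hypothetical. The parameter dicts are passed straight through to `umap.UMAP` and `hdbscan.HDBSCAN`, so nothing is hidden or renamed, and the imports are lazy so the two packages would only be needed when this path is actually used. This is not a copy of how pykanto or the rook paper implement it.

```python
def embed_then_cluster(features, reducer, clusterer):
    """Reduce ``features`` with ``reducer``, then cluster the embedding.

    ``reducer`` needs a ``fit_transform`` method and ``clusterer`` a
    ``fit_predict`` method (the scikit-learn style that both UMAP and
    HDBSCAN follow).
    """
    embedding = reducer.fit_transform(features)
    labels = clusterer.fit_predict(embedding)
    return embedding, labels


def make_umap_hdbscan(umap_params=None, hdbscan_params=None):
    """Build a (reducer, clusterer) pair, exposing each library's own parameters."""
    # Imported here, not at module level, so the dependencies stay optional.
    import umap  # provided by the umap-learn package
    import hdbscan

    reducer = umap.UMAP(**(umap_params or {}))
    clusterer = hdbscan.HDBSCAN(**(hdbscan_params or {}))
    return reducer, clusterer
```

Usage would look something like `reducer, clusterer = make_umap_hdbscan({"n_neighbors": 15, "min_dist": 0.0}, {"min_cluster_size": 10})` followed by `embedding, labels = embed_then_cluster(features, reducer, clusterer)`. Those parameter values are placeholders, not recommendations. HDBSCAN labels points it considers noise as -1.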

But I'm worried about the number of dependencies we have already, and would rather limit dependencies to core scientific Python if possible. Long term we might want to vendor these packages, e.g. in a vocalpy.cluster._vendor sub-sub-package.

Also we (I) need to actually understand all the parameters involved (see nilomr/pykanto#32 (comment)). We shouldn't add this if we can't provide tutorials with suggestions on what parameters to use for different datasets.
We want something like https://github.com/marathomas/tutorial_repo.

And we will want to clearly document assumptions and any related caveats.

@NickleDave NickleDave added the ENH: enhancement New feature or request label Jan 15, 2024
@NickleDave
Contributor Author

NickleDave commented Jan 17, 2024

Note: anything semi-automated, i.e. that requires a GUI, is out of scope for VocalPy.
