Adapt "step" automagically #206

hbredin · 2023-11-10T14:08:25Z

step controls the minimum algorithmic latency of the speaker diarization pipeline.

For real-time processing, one needs to make sure that the processing latency (i.e. the time it takes to process one step) is smaller than this algorithmic latency.

Said differently: the lower bound on the algorithmic latency is the processing latency, which in turn depends on the computing power of the machine the pipeline runs on (e.g. a GPU is usually faster than a CPU).

It would be nice to provide an API that automatically estimates this lower bound by running a few steps when the pipeline is instantiated and measuring how long they take, so that step can be set automatically to the processing latency plus a small safety margin.
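A minimal sketch of how such an estimate could work, not an existing diart API. `estimate_step`, `process_one_step`, `n_warmup`, `n_runs` and `safety_margin` are all hypothetical names; `process_one_step` is assumed to be a callable that runs the pipeline on one chunk of audio. The sketch times a few runs after a warm-up and returns the worst observed latency plus a relative margin.

```python
import time
from typing import Callable


def estimate_step(
    process_one_step: Callable[[], None],
    n_warmup: int = 2,
    n_runs: int = 5,
    safety_margin: float = 0.2,
) -> float:
    """Suggest a step (in seconds) from measured processing latency.

    All names here are illustrative assumptions, not part of any library.
    """
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")

    # Warm-up runs are discarded: the first calls often pay one-off costs
    # (model loading, cache filling, GPU kernel compilation).
    for _ in range(n_warmup):
        process_one_step()

    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        process_one_step()
        timings.append(time.perf_counter() - start)

    # Use the worst observed latency to stay on the conservative side,
    # then add a relative safety margin on top of it.
    return max(timings) * (1.0 + safety_margin)
```

Taking the maximum rather than the mean is a design choice: a step that is only fast enough on average would still fall behind on slow iterations. Measurements taken at start-up may not hold if the machine load changes later.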


juanmc2005 · 2023-11-10T15:24:56Z

I like it. That way we can automatically set the lowest possible latency. This could be implemented as --step auto, and also exposed somewhere in the Python API.
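As a rough illustration of the command-line side, here is one way a `--step` option could accept either a number or the literal `auto` using `argparse`. This is a sketch under assumptions: `step_value`, `build_parser`, the program name and the default value are hypothetical and do not describe how diart actually parses its arguments.

```python
import argparse
from typing import Union


def step_value(text: str) -> Union[float, str]:
    """Parse a --step value: the literal 'auto' or a positive number of seconds."""
    if text.lower() == "auto":
        return "auto"
    try:
        value = float(text)
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"expected a number of seconds or 'auto', got {text!r}"
        )
    if value <= 0:
        raise argparse.ArgumentTypeError("step must be positive")
    return value


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the program name and default are placeholders.
    parser = argparse.ArgumentParser(prog="stream-demo")
    parser.add_argument(
        "--step",
        type=step_value,
        default=0.5,
        help="step in seconds, or 'auto' to estimate it at start-up",
    )
    return parser
```

The caller would then check whether `args.step == "auto"` and, if so, run the latency estimate before starting the stream.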

juanmc2005 · 2023-11-13T14:25:34Z

Another idea: implement this as a diart.profile recording.wav command that also runs a quick grid search on that file to suggest hyper-parameter values without running a costly tuning job.

This would be useful for people who don't have much data but do have a "typical" conversation of the kind the system will encounter. diart could then quickly suggest a config to get started.
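The grid search part could look roughly like the sketch below. It is an assumption about the shape of such a tool, not an existing diart feature: `quick_grid_search`, `grid` and `evaluate` are hypothetical names, and `evaluate` is assumed to run the pipeline on the recording with a given config and return an error score where lower is better.

```python
import itertools
from typing import Callable, Dict, Mapping, Sequence


def quick_grid_search(
    grid: Mapping[str, Sequence[float]],
    evaluate: Callable[[Dict[str, float]], float],
) -> Dict[str, float]:
    """Try every combination in a small grid and return the best config.

    `evaluate` is supplied by the caller and returns an error (lower is better).
    """
    names = list(grid)
    best_config = None
    best_error = float("inf")

    # Exhaustively enumerate the Cartesian product of candidate values.
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        error = evaluate(config)
        if error < best_error:
            best_config, best_error = config, error

    if best_config is None:
        raise ValueError("the grid is empty or no config scored below infinity")
    return best_config
```

Because the cost grows with the product of the grid sizes, this only stays "quick" for a handful of values per hyper-parameter.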

hbredin added the "feature" (New feature or request) label on Nov 10, 2023
