Hello @juanmc2005,
I use the hbredin/wespeaker-voxceleb-resnet34-LM (ONNX) model to extract speaker embeddings in the diarization pipeline, but the latency is too high: about 1300 ms per chunk with the default parameters (chunk=5s, step=0.5s, latency=0.5). This cannot meet the real-time requirement.
I saw that you reported a latency of 48 ms on CPU and 15 ms on GPU. Is there anything I need to pay attention to in order to reproduce those numbers?
Thank you very much for any suggestions.
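To make the numbers above comparable across machines, here is a minimal sketch of how the per-chunk latency could be timed and checked against the 0.5 s step. It is not how diart measures latency: `dummy_embed` is a hypothetical stand-in for the real embedding call, and the 16 kHz sample rate is an assumption.

```python
import statistics
import time
from typing import Callable, Sequence


def measure_latency_ms(process: Callable[[Sequence[float]], object],
                       chunk: Sequence[float],
                       runs: int = 20,
                       warmup: int = 3) -> float:
    """Return the median wall-clock time of process(chunk) in milliseconds."""
    # Warm-up calls are not timed, so one-off initialization cost is excluded.
    for _ in range(warmup):
        process(chunk)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        process(chunk)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def real_time_factor(latency_ms: float, step_s: float) -> float:
    """Processing time divided by the step; values above 1 fall behind the stream."""
    return latency_ms / (step_s * 1000.0)


def dummy_embed(chunk: Sequence[float]) -> float:
    # Hypothetical stand-in: replace with the actual embedding model call.
    return sum(x * x for x in chunk)


# 5 s chunk at an assumed 16 kHz sample rate.
chunk = [0.0] * (5 * 16000)
latency = measure_latency_ms(dummy_embed, chunk)
print(f"median latency: {latency:.1f} ms, "
      f"real-time factor: {real_time_factor(latency, 0.5):.2f}")
```

With the figures from this issue, `real_time_factor(1300.0, 0.5)` is about 2.6, meaning each chunk takes roughly 2.6 times longer to process than the 0.5 s step allows.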
SheenChi changed the title from "The delacy of wespeaker model is to large" to "The delatency of wespeaker model is to large" on Dec 21, 2023
SheenChi changed the title from "The delatency of wespeaker model is to large" to "The latency of wespeaker model is to large" on Dec 21, 2023
Hi @SheenChi, the values I reported were obtained from the output of diart.stream with my hardware: CPU AMD Ryzen 9 and GPU Nvidia RTX 4060 Max-Q.
If you find the model too slow on your hardware you can try using pyannote/embedding, which is the fastest one. If that's still not enough you could try quantizing a model you like or distilling it into a smaller model. Depending on your hardware, I think distillation would be my preferred choice as a first step, but it requires training.
For training, I recommend pyannote.audio: it is very reliable for this use case and would give you instant compatibility with diart.
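For the quantization option mentioned above, a rough sketch using ONNX Runtime's dynamic quantization could look like the following. This is an assumption about one possible approach, not something tested on this model: the helper names and the model path are hypothetical, dynamic quantization may give limited speedup on convolution-heavy networks, and both latency and embedding quality should be re-measured afterwards.

```python
from pathlib import Path


def quantized_path(model_path: str) -> Path:
    """Derive an output path such as model.int8.onnx next to the input file."""
    path = Path(model_path)
    return path.with_name(f"{path.stem}.int8{path.suffix}")


def quantize_onnx_model(model_path: str) -> Path:
    """Write an int8 weight-quantized copy of an ONNX model and return its path."""
    # Imported lazily so the path helper above works without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    output_path = quantized_path(model_path)
    quantize_dynamic(
        model_input=model_path,
        model_output=str(output_path),
        weight_type=QuantType.QInt8,
    )
    return output_path


# Example call (hypothetical local path to the exported ONNX file):
# quantize_onnx_model("models/speaker_embedding.onnx")
```

The quantized file can then be loaded in place of the original model to compare speed and accuracy.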