Version 3.1.0

hbredin released this 16 Nov 12:37

TL;DR

pyannote/speaker-diarization-3.1 no longer requires the unpopular ONNX runtime

Full changelog

New features

feat(model): add WeSpeaker embedding wrapper based on PyTorch
feat(model): add support for multi-speaker statistics pooling
feat(pipeline): add TimingHook for profiling processing time
feat(pipeline): add ArtifactHook for saving internal steps
feat(pipeline): add support for list of hooks with Hooks
feat(utils): add "soft" option to Powerset.to_multilabel
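To illustrate what a "soft" powerset-to-multilabel conversion can mean, here is a minimal NumPy sketch. It is not the pyannote.audio implementation: `build_mapping` and `to_multilabel` are hypothetical stand-alone helpers, and the assumption is that the hard variant one-hot encodes the most likely powerset class while the soft variant weights every class by its probability, giving a per-speaker marginal.

```python
from itertools import combinations

import numpy as np


def build_mapping(num_classes: int, max_set_size: int) -> np.ndarray:
    """Hypothetical helper: one row per powerset class, one column per speaker.

    Classes are enumerated by increasing subset size: {}, {0}, {1}, ..., {0, 1}, ...
    """
    rows = []
    for size in range(max_set_size + 1):
        for subset in combinations(range(num_classes), size):
            row = np.zeros(num_classes)
            for speaker in subset:
                row[speaker] = 1.0
            rows.append(row)
    return np.stack(rows)


def to_multilabel(log_probs: np.ndarray, mapping: np.ndarray, soft: bool = False) -> np.ndarray:
    """Convert (frames, powerset classes) log-probabilities to (frames, speakers)."""
    if soft:
        # Assumption: soft mode keeps the full distribution over powerset classes.
        weights = np.exp(log_probs)
    else:
        # Hard mode: keep only the most likely powerset class per frame.
        weights = np.eye(mapping.shape[0])[np.argmax(log_probs, axis=-1)]
    return weights @ mapping


# 3 speakers, at most 2 active at once: 1 + 3 + 3 = 7 powerset classes.
mapping = build_mapping(num_classes=3, max_set_size=2)

# One frame of made-up probabilities over {}, {0}, {1}, {2}, {0,1}, {0,2}, {1,2}.
probs = np.array([[0.05, 0.50, 0.05, 0.05, 0.30, 0.03, 0.02]])
log_probs = np.log(probs)

hard = to_multilabel(log_probs, mapping, soft=False)
soft_scores = to_multilabel(log_probs, mapping, soft=True)
```

In this toy frame, the hard output marks only speaker 0 as active, while the soft output keeps a graded score for each speaker by summing the probabilities of all classes that contain that speaker.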

Fixes

fix(pipeline): add missing "embedding" hook call in SpeakerDiarization
fix(pipeline): fix AgglomerativeClustering to honor num_clusters when provided
fix(pipeline): fix frame-wise speaker count exceeding max_speakers or detected num_speakers in SpeakerDiarization pipeline

Improvements

improve(pipeline): compute fbank on GPU when requested

Breaking changes

BREAKING(pipeline): rename WeSpeakerPretrainedSpeakerEmbedding to ONNXWeSpeakerPretrainedSpeakerEmbedding
BREAKING(setup): remove onnxruntime dependency.
You can still use the ONNX model hbredin/wespeaker-voxceleb-resnet34-LM, but you will have to install onnxruntime yourself.
BREAKING(pipeline): remove logging_hook (use ArtifactHook instead)
BREAKING(pipeline): remove onset and offset parameters in SpeakerDiarizationMixin.speaker_count
You should now binarize segmentations before passing them to speaker_count
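The "binarize first, then count" order from the last breaking change can be sketched in plain NumPy. This is an illustration only, not the pyannote.audio code path: `binarize` and `count_speakers` are hypothetical helpers operating on a single (frames, speakers) score matrix, and the hysteresis rule (activate above `onset`, deactivate below `offset`) is an assumption about what the removed parameters controlled.

```python
import numpy as np


def binarize(scores: np.ndarray, onset: float = 0.5, offset: float = 0.5) -> np.ndarray:
    """Hypothetical hysteresis thresholding over a (frames, speakers) matrix."""
    num_frames, num_speakers = scores.shape
    active = np.zeros((num_frames, num_speakers), dtype=bool)
    for s in range(num_speakers):
        is_on = False
        for t in range(num_frames):
            # An inactive speaker needs to exceed `onset` to switch on;
            # an active speaker stays on while the score exceeds `offset`.
            threshold = offset if is_on else onset
            is_on = bool(scores[t, s] > threshold)
            active[t, s] = is_on
    return active


def count_speakers(binarized: np.ndarray) -> np.ndarray:
    """Number of simultaneously active speakers in each frame."""
    return binarized.sum(axis=-1)


# Made-up scores for 4 frames and 2 speakers.
scores = np.array(
    [
        [0.1, 0.8],
        [0.6, 0.7],
        [0.4, 0.2],
        [0.2, 0.1],
    ]
)

binarized = binarize(scores, onset=0.5, offset=0.3)
count = count_speakers(binarized)
```

With these toy scores, speaker 0 stays active in the third frame because 0.4 is still above the `offset` of 0.3, so the counting step itself needs no thresholds once it receives binarized input.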
