Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Updated May 30, 2024 - Jupyter Notebook
Voice activity detection (VAD) toolkit including DNN-, bDNN-, LSTM-, and ACAM-based VAD. We also provide our directly recorded dataset.
CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Repository for our Interspeech 2020 general-purpose voice activity detection (GPVAD) paper.
The codebase for Data-driven general-purpose voice activity detection.
Speaker change detection using SincNet and an LSTM/Transformer
Detecting depressed patients based on speech activity and pauses in speech, using a deep learning approach.
The Voxseg implementation in PyTorch. Voxseg is a Python library for voice activity detection (VAD), i.e. speech/non-speech segmentation.
Fork of the official Kaldi.
PyAnnote Voice Activity Detection (ONNX version)
Scoring Toolkit for the Fearless Steps Challenge Phase-02 Tasks
A curated list of awesome voice activity detection resources
Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering
Voice activity detection and speaker gender segmentation audiovisual corpus
Lightweight speech-to-speech web-based chat app combining speech recognition, LLM completion and text-to-speech. Implemented with Python (Flask) and vanilla JavaScript only.