A Comprehensive Survey of Mamba in Deep Learning
Updated Jun 3, 2024
Speaker identification on audio files using the pyannote/embedding model.
Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on OpenAI's Whisper technology, via whisper.cpp.
Speech, Language, Audio, Music Processing with Large Language Model
Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
A PyTorch-based Speech Toolkit
General Speech Restoration
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Python library for converting numbers to words for all Indian Languages.
Implementation of a [Librosa](https://github.com/librosa/librosa)-like [STFT](https://en.wikipedia.org/wiki/Short-time_Fourier_transform) using [FFTW](https://www.fftw.org/)
🔉 spafe: Simplified Python Audio Features Extraction
Python implementation of a few speech intelligibility prediction algorithms
AI powered speech denoising and enhancement
A suite of speech signal processing tools
Automated Reproducible Acoustical Analysis
Source code of the paper "MARTA: a model for the automatic phonemic grouping of the parkinsonian speech"
😎 Awesome lists about Speech Emotion Recognition
✅ A list of speech recognition learning resources including courses, books, tutorials, papers and toolkits.
Multimodal Emotion eXpression Capture Amsterdam. Pipeline for capturing emotion expressions from multiple modalities (video, audio, text) in the wild.