A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Updated May 14, 2024 - Python
A PyTorch-based Speech Toolkit
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
🔈 Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
In defence of metric learning for speaker recognition
SincNet is a neural architecture for efficiently processing raw audio samples.
An open-source implementation of a sequence-to-sequence-based speech processing engine
An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
Speaker diarization with UIS-RNN and speaker embeddings from vgg-speaker-recognition
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++; more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Angular penalty loss functions in PyTorch (ArcFace, SphereFace, Additive Margin, CosFace)
Speech recognition based on MFCC and Gaussian mixture models (GMM)
Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library
Identifying people from small audio fragments
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
Voiceprint recognition implemented with TensorFlow