🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Updated Jun 1, 2024 - Python
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
SharpSpeech is a free, local, and open-source tool for speech and wake word recognition.
An editing tool that uses AI to transcribe, understand content and search for anything in your footage, integrated with ChatGPT and other AI models
Production First and Production Ready End-to-End Speech Recognition Toolkit
Polyfill Web Speech API with Cognitive Services Bing Speech for both speech-to-text and text-to-speech service.
Fully Functional Voice Based Natural Language UI
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
This project implements a Speech Emotion Recognition (SER) model using TensorFlow Lite, specifically designed for deployment on microcontrollers like the Arduino Nano 33 BLE. The model is trained on the RAVDESS dataset and can recognize seven emotions: Angry, Disgust, Fear, Happy, Neutral, Sad, and Surprise.
Official Python SDK for Deepgram's automated speech recognition APIs.
Go SDK for Deepgram's automated speech recognition APIs.
🎤 Lobe TTS - A high-quality & reliable TTS/STT library for Server and Browser
A voice-operated emailing mobile application that allows you to compose and send email messages through voice commands.
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Monorepo for Transcribe.js
Tools for handling speech data in machine learning projects.
Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video streaming then TRANSLATE it for FREE (using unofficial online Google Translate API) then display it as LIVE CAPTION / LIVE SUBTITLE!
🚀 Framework for seamless fine-tuning of the Whisper model on a multilingual dataset and deployment to production.
HTML Web template that can RECOGNIZE any live audio/video streaming (using Chrome webkitSpeechRecognition API) then TRANSLATE it for FREE (using unofficial online Google Translate API) then display it as LIVE CAPTION / LIVE SUBTITLE