Transcription Web App

The project consists of a FastAPI server (server.app) and a frontend made of Python scripts (front-end) for audio-related operations. It stores audio files in AWS S3 and relies on two speech services: VAD (Voice Activity Detection) and ASR (Automatic Speech Recognition).

Getting Started

Speech Models

Voice Activity Detection

Download a release from https://github.com/snakers4/silero-vad/releases
Unzip it into backend/app/ml_models/vad
Copy the files from files/ to vad/
Update utils.py from the VAD repo, if required.
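
The unzip and copy steps above could be scripted roughly as follows. This is only a sketch under assumptions: the function name install_vad is hypothetical, and it assumes the release archive contains a files/ folder somewhere below its root whose contents belong directly in vad/. Check the layout of the release you downloaded before relying on it.

```python
import shutil
import zipfile
from pathlib import Path


def install_vad(archive: Path, vad_dir: Path) -> list[str]:
    """Unzip a release archive into vad_dir, then copy files/ contents into vad_dir."""
    vad_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(vad_dir)

    copied = []
    # Assumption: the archive holds a files/ folder somewhere below its root.
    for files_dir in sorted(vad_dir.rglob("files")):
        if not files_dir.is_dir():
            continue
        for item in sorted(files_dir.iterdir()):
            if item.is_file():
                shutil.copy2(item, vad_dir / item.name)
                copied.append(item.name)
    return copied
```

It could be called as install_vad(Path("silero-vad.zip"), Path("backend/app/ml_models/vad")), where the archive name is a placeholder for whatever release file was downloaded.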

ASR speech-to-text

Any ASR model can be plugged in by implementing the ASR class from backend/app/src/asr.py
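
As an illustration of what implementing that class might look like, here is a minimal sketch. The real interface in backend/app/src/asr.py is not shown in this README, so the ASR base class below is a hypothetical stand-in, and the transcribe method, its signature, and the EchoASR subclass are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class ASR(ABC):
    """Hypothetical stand-in for the ASR class in backend/app/src/asr.py."""

    @abstractmethod
    def transcribe(self, audio: bytes) -> str:
        """Return the transcript for a chunk of raw audio (assumed signature)."""


class EchoASR(ASR):
    """Toy model for illustration only: it reports the size of the audio."""

    def transcribe(self, audio: bytes) -> str:
        return f"<{len(audio)} bytes of audio>"


model = EchoASR()
print(model.transcribe(b"\x00\x01\x02"))
```

A real implementation would load a speech-to-text model in its constructor and run inference inside the method; match the method names to whatever asr.py actually declares.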

Running locally

The project can be run locally, using LocalStack to simulate the AWS resources:

docker-compose up

Once the containers are up, open http://localhost:8501/ in a browser to see the website.
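
The containers may take a moment to start, so a small script can poll the port before opening the page. The helper below is a sketch and not part of the project; the name wait_for_port and its default timeout are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and retry.
            time.sleep(0.5)
    return False
```

Calling wait_for_port("localhost", 8501) returns True as soon as something accepts connections on the frontend port, and False if nothing does within the timeout.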

License

This project is open-source and available under the MIT License. You are free to use, modify, and distribute it as per the terms of the license.

gabrielziegler3/transcription-webapp
