
syntithenai/voice2llm


Voice2LLM

Try Me! [https://chat.syntithenai.com]

This repository provides:

  • a FastAPI-based webserver with endpoints for chat submissions to the LLM. The webserver uses Outlines to support streaming and to generate structured JSON during the decoding phase of inference.
  • a single-page React chat client that works with OpenAI and GroqCloud as well as the local LLM webserver.
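Responses are streamed token by token during decoding. As a rough sketch of how such streaming is commonly framed (the wire format and endpoint layout of this webserver are assumptions, not confirmed by the repository), a minimal server-sent-events generator might look like:

```python
from typing import Iterable, Iterator

def sse_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Frame generated tokens as server-sent events (SSE).

    Hypothetical sketch: the real voice2llm wire format may differ.
    """
    for token in tokens:
        # Each SSE message is a "data:" line terminated by a blank line.
        yield f"data: {token}\n\n"
    # A sentinel tells the client that generation has finished.
    yield "data: [DONE]\n\n"

print("".join(sse_stream(["Hel", "lo"])))
```

In a FastAPI app a generator like this would typically be wrapped in a streaming response so the client starts rendering before decoding finishes.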

The web client is designed to work with speech and can be configured to use OpenAI or local servers. Open implementations of the speech-to-text and text-to-speech services used in the web client are linked below.

Some key features:

  • 'Personas' as a way of capturing chat prompts
  • [COMING SOON] 'Teams' of personas that work in various patterns to generate text
  • [COMING SOON] 'Workflows' controlling the flow of generation between teams

plus:

  • voice integration
  • support for rendering Mermaid charts and ABC music notation
  • everything is saved locally without a login, and synced between devices if you log in

Speech Recognition

https://github.com/syntithenai/whisper-websocket-streaming

Text To Speech

https://github.com/syntithenai/coqui-tts-ssl

Streaming is used throughout to minimise latency.
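On the client side, low latency comes from consuming the stream incrementally rather than waiting for the complete response. A minimal stdlib sketch (assuming the common `data: <payload>` SSE convention; the exact protocol spoken by this project's services may differ):

```python
from typing import Iterable, Iterator

def iter_sse_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield token payloads from an iterable of SSE lines.

    Hypothetical sketch: assumes 'data: <payload>' messages and a
    '[DONE]' sentinel, which this project may or may not use.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield payload

raw = ["data: Hel", "", "data: lo", "", "data: [DONE]"]
print("".join(iter_sse_tokens(raw)))  # -> Hello
```

Each yielded token can be appended to the UI (or fed to the TTS service) immediately, which is what keeps perceived latency low.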

The default docker-compose.yml file is set up to use GPU resources and to fall back automatically to CPU inference if no GPU is available.

Superquick Start

The UI is hosted on https://chat.syntithenai.com/.

It can be configured with an OpenAI api key or URLs to locally hosted services.

The UI is also served locally by the service suite, which can be started with Docker.

Quickstart (localhost)

  • (Windows users) for GPU support in Docker Desktop, see https://docs.docker.com/desktop/gpu/

  • install Docker

  • copy the .env.sample file to .env and edit it to configure your preferred language model (OpenAI key, ...)

  • docker-compose up
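The services read their settings from the .env file as plain KEY=VALUE lines. As an illustration of that format (the variable name below is a hypothetical example; consult .env.sample for the real settings), a minimal parser looks like:

```python
import os
import tempfile

def load_env(path: str) -> dict:
    """Parse simple KEY=VALUE lines from a .env-style file.

    Skips blank lines and '#' comments; does not handle quoting
    or 'export' prefixes.
    """
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Write a throwaway .env with a hypothetical variable name,
# purely so this sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), ".env")
with open(path, "w") as fh:
    fh.write("# hypothetical example\nOPENAI_API_KEY=sk-example\n")

config = load_env(path)
os.environ.update(config)
print(config["OPENAI_API_KEY"])  # -> sk-example
```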

Custom SSL Domain for external access

  • map a domain name to your IP address and configure port forwarding from your router to ports 443 (web and STT), 444 (LLM) and 5002 (TTS)
  • copy the .env.sample file to .env and edit it to set the domain name and email address
  • docker-compose up
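After configuring port forwarding, you can confirm the ports are reachable with a small stdlib check (a generic sketch; replace "localhost" with your domain to test from outside the network):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports from the forwarding step above:
# 443 (web and STT), 444 (LLM), 5002 (TTS).
for port in (443, 444, 5002):
    print(port, "open" if port_open("localhost", port) else "closed")
```

This only verifies TCP reachability; SSL certificate issuance still needs the domain and email configured in .env.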

Building the Frontend UI

cd voice2llm-ui
npm run build

To develop the UI with live updates

cd voice2llm-ui
npm run start

and visit http://localhost:3000

The build script moves the resulting files to the docs directory in the root of the project, ready for hosting on GitHub Pages.
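To preview the built site locally, any static file server will do; a self-contained sketch using Python's stdlib HTTP server (a throwaway directory stands in for the real docs folder here):

```python
import tempfile
import threading
import urllib.request
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

# Stand-in for the project's docs directory, created here so the
# sketch is self-contained.
root = Path(tempfile.mkdtemp())
(root / "index.html").write_text("<h1>Voice2LLM</h1>")

handler = partial(SimpleHTTPRequestHandler, directory=str(root))
server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_address[1]}/index.html"
body = urllib.request.urlopen(url).read().decode()
print(body)  # -> <h1>Voice2LLM</h1>
server.shutdown()
```

Against the real build output, running `python -m http.server --directory docs` from the project root achieves the same thing.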