
syntithenai/voice2llm


Voice2LLM

Try Me! [https://chat.syntithenai.com]

This repository provides:

  • a FastAPI-based webserver with endpoints for chat submissions to the LLM. The webserver uses Outlines to support streaming and to generate structured JSON during the decoding phase of inference.
  • a single-page React chat client that works with OpenAI and GroqCloud as well as the local LLM webserver.
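Responses are streamed token by token during decoding. As a rough sketch of how such streaming is commonly framed (the wire format and endpoint layout of this webserver are assumptions, not confirmed by the repository), a minimal server-sent-events generator might look like:

```python
from typing import Iterable, Iterator

def sse_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Frame generated tokens as server-sent events (SSE).

    Hypothetical sketch: the real voice2llm wire format may differ.
    """
    for token in tokens:
        # Each SSE message is a "data:" line terminated by a blank line.
        yield f"data: {token}\n\n"
    # A sentinel tells the client that generation has finished.
    yield "data: [DONE]\n\n"

print("".join(sse_stream(["Hel", "lo"])))
```

In a FastAPI app a generator like this would typically be wrapped in a streaming response so the client starts rendering before decoding finishes.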

The web client is designed to work with speech and can be configured to use OpenAI or local servers. Open implementations of the speech-to-text and text-to-speech services used in the web client are linked below.

Some key features:

  • 'Personas' as a way of capturing chat prompts
  • [COMING SOON] 'Teams' of personas that work in various patterns to generate text
  • [COMING SOON] 'Workflows' controlling the flow of generation between teams

plus:

  • voice integration
  • support for rendering Mermaid charts and ABC music notation
  • everything is saved locally without a login, and synced between devices if you log in

Speech Recognition

https://github.com/syntithenai/whisper-websocket-streaming

Text To Speech

https://github.com/syntithenai/coqui-tts-ssl

Streaming is used throughout to minimise latency.
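On the client side, low latency comes from consuming the stream incrementally rather than waiting for the complete response. A minimal stdlib sketch (assuming the common `data: <payload>` SSE convention; the exact protocol spoken by this project's services may differ):

```python
from typing import Iterable, Iterator

def iter_sse_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield token payloads from an iterable of SSE lines.

    Hypothetical sketch: assumes 'data: <payload>' messages and a
    '[DONE]' sentinel, which this project may or may not use.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield payload

raw = ["data: Hel", "", "data: lo", "", "data: [DONE]"]
print("".join(iter_sse_tokens(raw)))  # -> Hello
```

Each yielded token can be appended to the UI (or fed to the TTS service) immediately, which is what keeps perceived latency low.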

The default docker-compose.yml file is set up to use GPU resources and to fall back automatically to CPU inference if no GPU is available.

Superquick Start

The UI is hosted on https://chat.syntithenai.com/.

It can be configured with an OpenAI api key or URLs to locally hosted services.

The UI is also served locally by the service suite, which can be started with Docker.

Quickstart (localhost)

  • (Windows users) for GPU support in Docker Desktop, see https://docs.docker.com/desktop/gpu/

  • install Docker

  • copy the .env.sample file to .env and edit it to configure your preferred language model (OpenAI key, ...)

  • docker-compose up
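The services read their settings from the .env file as plain KEY=VALUE lines. As an illustration of that format (the variable name below is a hypothetical example; consult .env.sample for the real settings), a minimal parser looks like:

```python
import os
import tempfile

def load_env(path: str) -> dict:
    """Parse simple KEY=VALUE lines from a .env-style file.

    Skips blank lines and '#' comments; does not handle quoting
    or 'export' prefixes.
    """
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Write a throwaway .env with a hypothetical variable name,
# purely so this sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), ".env")
with open(path, "w") as fh:
    fh.write("# hypothetical example\nOPENAI_API_KEY=sk-example\n")

config = load_env(path)
os.environ.update(config)
print(config["OPENAI_API_KEY"])  # -> sk-example
```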

Custom SSL Domain for external access

  • map a domain name to your IP address and configure port forwarding from your router to ports 443 (web and STT), 444 (LLM) and 5002 (TTS)
  • copy the .env.sample file to .env and edit it to set the domain name and email address
  • docker-compose up
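After configuring port forwarding, you can confirm the ports are reachable with a small stdlib check (a generic sketch; replace "localhost" with your domain to test from outside the network):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports from the forwarding step above:
# 443 (web and STT), 444 (LLM), 5002 (TTS).
for port in (443, 444, 5002):
    print(port, "open" if port_open("localhost", port) else "closed")
```

This only verifies TCP reachability; SSL certificate issuance still needs the domain and email configured in .env.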

Building the Frontend UI

cd voice2llm-ui
npm run build

To develop the UI with live updates

cd voice2llm-ui
npm run start

and visit http://localhost:3000

The build script moves the resulting files to the docs directory in the root of the project, ready for hosting on GitHub Pages.
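To preview the built site locally, any static file server will do; a self-contained sketch using Python's stdlib HTTP server (a throwaway directory stands in for the real docs folder here):

```python
import tempfile
import threading
import urllib.request
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

# Stand-in for the project's docs directory, created here so the
# sketch is self-contained.
root = Path(tempfile.mkdtemp())
(root / "index.html").write_text("<h1>Voice2LLM</h1>")

handler = partial(SimpleHTTPRequestHandler, directory=str(root))
server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_address[1]}/index.html"
body = urllib.request.urlopen(url).read().decode()
print(body)  # -> <h1>Voice2LLM</h1>
server.shutdown()
```

Against the real build output, running `python -m http.server --directory docs` from the project root achieves the same thing.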