QuiLLMan: Voice Chat with LLMs

A complete chat app that transcribes audio in real time, streams back a response from a language model, and synthesizes this response as natural-sounding speech.

This repo is meant to serve as a starting point for your own language model-based apps, as well as a playground for experimentation. Contributions are welcome and encouraged!

The language model used is Zephyr. OpenAI Whisper is used for transcription, and Metavoice Tortoise TTS is used for text-to-speech. The entire app, including the frontend, is made to be deployed serverlessly on Modal.
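As a rough illustration of how the three stages fit together, here is a minimal sketch of the transcribe, generate, synthesize loop. The functions `transcribe`, `generate`, `synthesize`, and `voice_chat` are hypothetical stand-ins, not the repo's actual code: they fake each stage with plain string operations so the flow can be run without any models. Chunking the streamed reply by sentence before synthesis is also an assumption, shown as one plausible way to start producing speech before the language model has finished.

```python
from typing import Iterator

# Hypothetical stand-ins for the three stages. The real app calls Whisper,
# Zephyr, and Tortoise TTS; these stubs only mimic the shape of the data flow.

def transcribe(audio: bytes) -> str:
    """Pretend speech-to-text: decode the bytes as UTF-8 text."""
    return audio.decode("utf-8")

def generate(prompt: str) -> Iterator[str]:
    """Pretend language model: stream a canned reply word by word."""
    for word in f"You said: {prompt}".split():
        yield word + " "

def synthesize(text: str) -> bytes:
    """Pretend text-to-speech: encode the text back to bytes."""
    return text.encode("utf-8")

def voice_chat(audio: bytes) -> Iterator[bytes]:
    """Transcribe the input, then emit one audio chunk per completed sentence
    while the reply is still streaming (sentence chunking is an assumption)."""
    transcript = transcribe(audio)
    buffer = ""
    for token in generate(transcript):
        buffer += token
        # Flush the buffer whenever it ends in sentence-final punctuation.
        if buffer.rstrip().endswith((".", "!", "?")):
            yield synthesize(buffer.strip())
            buffer = ""
    # Flush any trailing text that did not end with punctuation.
    if buffer.strip():
        yield synthesize(buffer.strip())

if __name__ == "__main__":
    for chunk in voice_chat(b"Hello there. How are you?"):
        print(chunk)
```

Using generators keeps each stage lazy, so a consumer can begin playback of the first chunk while later sentences are still being generated.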

You can find the demo live here.

[Note: this code is provided for illustration only; please remember to check the license before using any model for commercial purposes.]

File structure

React frontend (src/frontend/)
FastAPI server (src/app.py)
Whisper transcription module (src/transcriber.py)
Tortoise text-to-speech module (src/tts.py)
Zephyr language model module (src/llm_zephyr.py)

Read the accompanying docs for a detailed look at each of these components.

Developing locally

Requirements

modal installed in your current Python virtual environment (pip install modal)
A Modal account
A Modal token set up in your environment (modal token new)
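The first and third requirements can be sanity-checked from Python before running anything. The sketch below is a heuristic, not an official Modal API: the helper names `modal_installed` and `modal_token_present` are made up for illustration, and the check assumes credentials live either in the `MODAL_TOKEN_ID` / `MODAL_TOKEN_SECRET` environment variables or in a `.modal.toml` file in your home directory, which is where `modal token new` is expected to write them.

```python
import importlib.util
import os
from pathlib import Path
from typing import Optional

def modal_installed() -> bool:
    """Return True if the `modal` package is importable in this environment."""
    return importlib.util.find_spec("modal") is not None

def modal_token_present(home: Optional[Path] = None) -> bool:
    """Heuristic check for Modal credentials (assumed locations, see above)."""
    # Assumption: both environment variables must be set to count as a token.
    if os.environ.get("MODAL_TOKEN_ID") and os.environ.get("MODAL_TOKEN_SECRET"):
        return True
    # Assumption: the CLI stores credentials in ~/.modal.toml.
    if home is None:
        home = Path.home()
    return (home / ".modal.toml").is_file()

if __name__ == "__main__":
    print("modal installed:", modal_installed())
    print("token found:", modal_token_present())
```

If either check prints `False`, rerun `pip install modal` or `modal token new` as listed above.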

Develop on Modal

To serve the app on Modal, run this command from the root directory of this repo:

modal serve src.app

In the terminal output, you'll find a URL that you can visit to use your app. While the modal serve process is running, changes to any of the project files will be automatically applied. Ctrl+C will stop the app.

Deploy to Modal

Once you're happy with your changes, deploy your app:

modal deploy src.app

[Note that leaving the app deployed on Modal doesn't cost you anything! Modal apps are serverless and scale to 0 when not in use.]

modal-labs/quillman