Demo: https://youtu.be/RjTScgSH_Fg
Clone this repo, then set up and activate a virtualenv:
python3 -m pip install virtualenv
python3 -m virtualenv venv
source venv/bin/activate
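To confirm the activation worked before installing anything, a small check like the one below can help. It is only a sketch: the function name `in_virtualenv` is made up for illustration and is not part of this repo.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer releases (and the
    # stdlib venv module) make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtualenv active" if in_virtualenv() else "no virtualenv detected")
    print("interpreter prefix:", sys.prefix)
```

If it reports that no virtualenv was detected, run the `source venv/bin/activate` step again in the same terminal.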
Install the dependencies:
pip install -r requirements.txt
Create an empty .env file.
Create an OpenAI account and a secret API key, then copy the key into your .env file. Remember to buy credits for your account.
OPENAI_API_KEY=<token>
Create an ElevenLabs account and a secret API key, then copy the key into your .env file.
ELEVENLABS_API_KEY=<eleven-token>
Create a new voice in ElevenLabs and get its voice ID, either through their get voices API or by clicking the flask icon next to the voice in the VoiceLab tab.
If you want to clone a voice, you can use their Instant Voice Cloning feature. You need at least 3 minutes of clear audio of the voice you want to clone.
Play around with the voice settings to get the voice to sound how you want it to.
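If you prefer the API route for finding the voice ID, the sketch below lists the voices on your account. The endpoint URL, the `xi-api-key` header, and the `voices` / `voice_id` / `name` response fields reflect the public ElevenLabs API as I understand it and should be checked against their current documentation. The helper names `fetch_voices` and `extract_voices` are made up for illustration.

```python
import json
import os
import urllib.request

# Assumed endpoint for the ElevenLabs "get voices" API; verify in their docs.
VOICES_URL = "https://api.elevenlabs.io/v1/voices"


def extract_voices(payload):
    """Return (name, voice_id) pairs from a decoded "get voices" response."""
    return [(v.get("name"), v.get("voice_id")) for v in payload.get("voices", [])]


def fetch_voices(api_key):
    """Call the voices endpoint and return the decoded JSON body."""
    request = urllib.request.Request(VOICES_URL, headers={"xi-api-key": api_key})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Reads the key from the environment, so export it (or load your .env) first.
    api_key = os.environ.get("ELEVENLABS_API_KEY")
    if api_key:
        for name, voice_id in extract_voices(fetch_voices(api_key)):
            print(f"{voice_id}  {name}")
    else:
        print("Set ELEVENLABS_API_KEY first.")
```

Copy the ID printed next to the voice you created.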
Add the voice ID to your .env file:
ELEVENLABS_VOICE_ID=<voice-id>
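At this point the .env file should hold all three values. The following sketch checks for them using only the standard library. It assumes plain `KEY=value` lines, and the helper names `parse_env` and `missing_keys` are hypothetical rather than part of this repo.

```python
from pathlib import Path

# The three variables this README asks you to add to .env.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(text)
    if missing:
        print("Missing in .env: " + ", ".join(missing))
    else:
        print("All required keys are set.")
```

This only checks that each key has a non-empty value. It does not verify that the keys are valid with OpenAI or ElevenLabs.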
If you want to take screenshots of your browser, you need to:
- Keep your terminal and browser window on the same screen. On a Mac, drag one window into the other (one of them needs to be full screen for this to work).
- Get the title of the browser window you want to capture. Hover your mouse over the browser tab to see the window title.
In one terminal, run the browser capture:
python browser_capture.py "Your window title"
In another terminal, run the narrator:
python narrator.py
If you want screenshots from your webcam instead:
In one terminal, run the webcam capture:
python webcam_capture.py
In another terminal, run the narrator:
python narrator.py