graham-walker/WhisperPix


Add comments to your photos with your voice

INSTALL
CONFIGURATION
BUILDING

WhisperPix uses OpenAI's Whisper speech recognition to accurately transcribe spoken words and save them to a photo's EXIF metadata.

Install

Download the latest release for Windows or Linux here.

The Windows release can also be obtained from the Microsoft Store:
Download WhisperPix

Configuration

Whisper

WhisperPix can transcribe audio with either whisper.cpp (CPU) or OpenAI Whisper (CPU/GPU).

WhisperPix works out of the box: by default it uses an embedded copy of whisper.cpp v1.4.0 with the tiny model (ggml-tiny.bin).

To use OpenAI Whisper, first install it on your machine by running:

pip install -U openai-whisper

More detailed installation instructions can be found here.
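
After installing, a quick check can confirm that the pieces OpenAI Whisper relies on are visible on your machine. The sketch below is not part of WhisperPix, and how WhisperPix itself locates OpenAI Whisper is not documented here. It only assumes that the package installs a Python module and a command-line tool both named whisper, and that ffmpeg must be available on the PATH. The function name whisper_status is made up for this example.

```python
import importlib.util
import shutil


def whisper_status() -> dict:
    """Report whether the openai-whisper module, its CLI, and ffmpeg are visible.

    Hypothetical helper for illustration; WhisperPix may detect
    OpenAI Whisper differently.
    """
    return {
        # The openai-whisper package is imported as "whisper".
        "module": importlib.util.find_spec("whisper") is not None,
        # pip also installs a "whisper" command-line entry point.
        "cli": shutil.which("whisper") is not None,
        # openai-whisper shells out to ffmpeg to decode audio.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, found in whisper_status().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If any entry is reported as missing, the detailed installation instructions linked above are the place to look.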

Models

Larger ggml models for whisper.cpp can be downloaded here and added in WhisperPix's settings menu. 4-bit models are not currently supported.

OpenAI Whisper will automatically download models on first use.

EXIF Tags

The EXIF tags WhisperPix saves to can be changed in the settings menu.

Not every image type supports saving every EXIF tag. See the table below for support of common image types:

EXIF Tag JPEG PNG TIFF WebP GIF MP4
Comment X X
XPComment X X
UserComment
Description
ImageDescription
Caption
Keywords X
XPKeywords X X
Subject X
XPSubject X
Artist
Copyright
Creator
Location
Title
XPAuthor X X
XPTitle X X

SVG & BMP images are not supported.

For comments and keywords to appear in Windows file properties, XPComment should be used instead of Description and XPKeywords instead of Keywords.

Additional Configuration

Additional settings can be modified by directly editing config.json:

# Windows
%USERPROFILE%\AppData\Roaming\whisperpix\config.json

# Linux
~/.config/whisperpix/config.json

Additional choices for EXIF tags and models can be added to WhisperPix's settings menu by editing the following keys in config.json: modelsAvailable, commentTagsAvailable, keywordsTagsAvailable, tagsAvailable, languagesAvailable.

A comprehensive list of EXIF tags can be found here.
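
Editing config.json by hand works fine, but the change can also be scripted. The following is a minimal sketch, assuming each of the keys listed above holds a JSON array of strings. That structure is a guess and is not confirmed by this README, so check your own config.json before relying on it. The helper names config_path and add_choice are hypothetical.

```python
import json
import sys
from pathlib import Path


def config_path() -> Path:
    """Return the config.json location given above for the current OS."""
    if sys.platform.startswith("win"):
        base = Path.home() / "AppData" / "Roaming"
    else:
        base = Path.home() / ".config"
    return base / "whisperpix" / "config.json"


def add_choice(path: Path, key: str, value: str) -> bool:
    """Append value to the list stored under key; return True if the file changed.

    Assumption: key maps to a JSON array of strings.
    """
    config = json.loads(path.read_text(encoding="utf-8"))
    choices = config.setdefault(key, [])
    if not isinstance(choices, list):
        raise TypeError(f"{key} is not a list; the assumed layout does not apply")
    if value in choices:
        return False  # already offered in the settings menu
    choices.append(value)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return True


# Example (edits your real config, so it is left commented out):
# add_choice(config_path(), "commentTagsAvailable", "UserComment")
```

The function returns False without rewriting the file when the value is already present, so running it twice is harmless.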

Building

To build the app, install Node.js 16 or later and run:

git clone https://github.com/graham-walker/WhisperPix.git

cd ./WhisperPix

npm i

# Windows
npm run pack-windows

# Linux
npm run pack-linux

cd ./out

Roadmap

  • Embed whisper.cpp and models
  • Dark-mode-aware theme

Licenses

WhisperPix is MIT licensed.

Third-party licenses can be found under /public/LICENSES.html.