WhisperNote

A simple Python script that transcribes audio and performs speaker diarization using OpenAI's Whisper and pyannote.audio.

Based on Majdoddin's work, which is discussed on GitHub and available as a Google Colab notebook.
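Combining a transcript with diarization output usually comes down to matching time intervals. The sketch below is illustrative only and is not taken from whispernote.py: `Segment`, `Turn`, `overlap`, and `label_segments` are hypothetical names, and it assumes the transcript segments and speaker turns have already been reduced to plain start/end times in seconds. Each segment is given the speaker whose turns overlap it the longest.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed chunk of audio (hypothetical container)."""
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class Turn:
    """One diarized speaker turn (hypothetical container)."""
    start: float  # seconds
    end: float    # seconds
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns, unknown="UNKNOWN"):
    """Pair each segment's text with the speaker whose turns overlap it most."""
    labelled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                # Accumulate, since one speaker may have several turns in a segment.
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        speaker = max(totals, key=totals.get) if totals else unknown
        labelled.append((speaker, seg.text))
    return labelled


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    segments = [
        Segment(0.0, 4.0, "Welcome to the show."),
        Segment(4.0, 9.0, "Thanks for having me."),
    ]
    turns = [Turn(0.0, 4.5, "SPEAKER_00"), Turn(4.5, 9.0, "SPEAKER_01")]
    for speaker, text in label_segments(segments, turns):
        print(f"{speaker}: {text}")
```

Segments that overlap no speaker turn are labelled `UNKNOWN` rather than dropped, so no transcribed text is lost.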

Running the script

This project was tested only on Linux, in both CPU-only and GPU configurations. It is expected to work on other platforms, but this is not guaranteed.

Benchmarks

Input file: a 10-minute .mp3 recording of an interview between two people.

Transcription

CPU: 2.36 minutes
GPU: 2.05 minutes
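To put these timings in context, they can be expressed as a real-time factor (processing time divided by audio duration) and as a GPU-over-CPU speedup. This is a minimal sketch that assumes the figures above are decimal minutes rather than minutes:seconds; `real_time_factor` is a hypothetical helper, not part of the script.

```python
def real_time_factor(processing_minutes, audio_minutes):
    """Processing time per minute of audio; below 1.0 means faster than real time."""
    return processing_minutes / audio_minutes


AUDIO_MINUTES = 10.0  # length of the benchmark input file
CPU_MINUTES = 2.36    # assumed to be decimal minutes
GPU_MINUTES = 2.05    # assumed to be decimal minutes

cpu_rtf = real_time_factor(CPU_MINUTES, AUDIO_MINUTES)
gpu_rtf = real_time_factor(GPU_MINUTES, AUDIO_MINUTES)
speedup = CPU_MINUTES / GPU_MINUTES

print(f"CPU real-time factor: {cpu_rtf:.3f}")
print(f"GPU real-time factor: {gpu_rtf:.3f}")
print(f"GPU speedup over CPU: {speedup:.2f}x")
```

Under that assumption, both configurations run well under real time, and the GPU run is roughly 15% faster than the CPU run on this input.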

Citations

pyannote/speaker-diarization
pyannote/segmentation

@inproceedings{Bredin2021,
  Title = {{End-to-end speaker segmentation for overlap-aware resegmentation}},
  Author = {{Bredin}, Herv{\'e} and {Laurent}, Antoine},
  Booktitle = {Proc. Interspeech 2021},
  Address = {Brno, Czech Republic},
  Month = {August},
  Year = {2021},
}
