A tool that transcribes audio and video files to text using the Whisper API, for use in thematic analysis.
Whisper uses attention weights to capture dependencies and interactions between words in the audio.
The following is split into two sections:
(1) Transcription and speaker diarization, meaning separating the speakers in the audio
(2) Thematic analysis of the transcription (in progress)
- Set up Whisper by following the instructions in its GitHub repository. If the installation fails with a permission error, run the command prompt as administrator when installing.
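- Install the dependencies from your command prompt (the OpenAI Whisper package is published on PyPI as openai-whisper; the package named whisper is an unrelated project):
    pip install openai-whisper
    pip install python-docx
    pip install fpdf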
- Add the audio file to the same directory as the script, or set the correct path in the code.
- Modify the code as necessary, for example to change the name of the audio file, the language, etc.
- Run the script, replacing your_script.py with the name of your script:
    python your_script.py
- Find the saved transcription as a Word (.docx) file and a PDF (.pdf) file in the same directory as the audio file.
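The steps above could look roughly like the following sketch. It assumes the open-source Whisper Python package (imported as `whisper`, with `whisper.load_model` and `model.transcribe`), `python-docx`, and `fpdf`. The function names, the `base` model size, and the file name `interview.m4a` are placeholders for illustration, not this project's actual script.

```python
from pathlib import Path


def output_paths(audio_path):
    """Return the (.docx, .pdf) paths that sit next to the audio file."""
    audio = Path(audio_path)
    return audio.with_suffix(".docx"), audio.with_suffix(".pdf")


def transcribe(audio_path, language="en", model_name="base"):
    """Transcribe one file with the open-source Whisper package."""
    import whisper  # imported lazily so output_paths works without it

    model = whisper.load_model(model_name)
    result = model.transcribe(str(audio_path), language=language)
    return result["text"].strip()


def save_docx(text, path):
    """Write the transcription into a Word document."""
    from docx import Document  # provided by python-docx

    document = Document()
    document.add_paragraph(text)
    document.save(str(path))


def save_pdf(text, path):
    """Write the transcription into a PDF file."""
    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=12)
    # The built-in PDF fonts only cover Latin-1, so other characters
    # are replaced instead of raising an encoding error.
    safe_text = text.encode("latin-1", "replace").decode("latin-1")
    pdf.multi_cell(0, 10, safe_text)
    pdf.output(str(path))


def main(audio_file, language="en"):
    """Transcribe audio_file and save .docx and .pdf copies beside it."""
    docx_path, pdf_path = output_paths(audio_file)
    text = transcribe(audio_file, language=language)
    save_docx(text, docx_path)
    save_pdf(text, pdf_path)
```

Calling `main("interview.m4a")` would then write `interview.docx` and `interview.pdf` next to the audio file, provided the three packages and ffmpeg are installed.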
The acceptable file types [1] are:
- m4a: MPEG-4 Audio File
- mp3: MPEG-1 Audio Layer 3 File
- webm: WebM Audio/Video File
- mp4: MPEG-4 Video File
- mpga: MPEG Audio File
- wav: Waveform Audio File
- mpeg: MPEG Movie File
Links to audio files are not supported at the moment.
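A script can check its input against this list before sending anything for transcription. The sketch below is one possible check using only the standard library; the helper name `is_supported_audio` is hypothetical, and treating any string containing `://` as a link is a simplifying assumption.

```python
from pathlib import Path

# Extensions taken from the list of acceptable file types above.
ACCEPTED_EXTENSIONS = {"m4a", "mp3", "webm", "mp4", "mpga", "wav", "mpeg"}


def is_supported_audio(path_or_url):
    """Return True for a local file name with an accepted extension."""
    text = str(path_or_url)
    # Links are not supported, so reject anything that looks like a URL.
    if "://" in text:
        return False
    extension = Path(text).suffix.lower().lstrip(".")
    return extension in ACCEPTED_EXTENSIONS
```

The check only looks at the file name; it does not inspect the file contents.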
Rate limit: 50 requests per minute [1]
File size limit: up to 25 MB [1]
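When transcribing many files in a loop, both limits can be respected with a small amount of bookkeeping. The following is a minimal sketch under stated assumptions: "25MB" is read as 25 × 1024 × 1024 bytes, requests are simply spaced evenly across the minute, and the names `within_size_limit` and `RequestPacer` are hypothetical helpers rather than part of any Whisper library.

```python
import os
import time

MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25MB" means 25 MiB
MAX_REQUESTS_PER_MINUTE = 50


def within_size_limit(path, limit=MAX_FILE_BYTES):
    """Return True if the file at path is no larger than limit bytes."""
    return os.path.getsize(path) <= limit


class RequestPacer:
    """Space calls evenly so about per_minute requests fit in a minute."""

    def __init__(self, per_minute=MAX_REQUESTS_PER_MINUTE,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been made yet

    def wait(self):
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

Calling `pacer.wait()` before each request keeps consecutive requests at least 1.2 seconds apart with the default settings.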
cd into the directory where the audio file is located, then run:
    whisper transcribe --filename <filename> --language <language> --output <output_filename>
- filename is the name of the file you want to transcribe
- language is the language of the audio file
- output is the name of the file you want to save the transcription as
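If the command needs to be launched from a script, the invocation above can be assembled with the standard library. This is only a sketch: it reuses the `whisper transcribe` flags exactly as they are listed in this README and does not verify that the installed `whisper` executable accepts them. The helper names `build_command` and `run_transcription` are hypothetical.

```python
import subprocess


def build_command(filename, language, output, extra_flags=()):
    """Assemble the command line shown above as an argument list."""
    command = [
        "whisper", "transcribe",
        "--filename", filename,
        "--language", language,
        "--output", output,
    ]
    # Optional switches are appended unchanged, in the order given.
    command.extend(extra_flags)
    return command


def run_transcription(filename, language, output, cwd=None, extra_flags=()):
    """Run the command in cwd, the directory that holds the audio file."""
    command = build_command(filename, language, output, extra_flags)
    # check=True raises CalledProcessError if the command exits non-zero.
    return subprocess.run(command, cwd=cwd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces, and `cwd` takes the place of the manual `cd` step.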
cd into the directory where the audio file is located, then run:
    whisper transcribe --filename <filename> --language <language> --output <output_filename> --speaker-separation
- filename is the name of the file you want to transcribe
- language is the language of the audio file
- output is the name of the file you want to save the transcription as
cd into the directory where the audio file is located, then run:
    whisper transcribe --filename <filename> --language <language> --output <output_filename> --speaker-separation --speaker-labels
- filename is the name of the file you want to transcribe
- language is the language of the audio file
- output is the name of the file you want to save the transcription as
cd into the directory where the audio file is located, then run:
    whisper transcribe --filename <filename> --language <language> --output <output_filename> --speaker-separation --speaker-labels --punctuation
- filename is the name of the file you want to transcribe
- language is the language of the audio file
- output is the name of the file you want to save the transcription as
cd into the directory where the audio file is located, then run:
    whisper transcribe --filename <filename> --language <language> --output <output_filename> --speaker-separation --speaker-labels --punctuation --profanity-filter
- filename is the name of the file you want to transcribe