JosePedroDias/leg24

TL;DR

I wanted transcriptions of the debates and tried to produce them myself. It has been a fun ride, and loads of work. 😅 I've been improving the process over this period, so newer debates are likely better transcribed than the initial ones. I haven't had the time to re-review the older ones, with all this content landing each day.

Disclaimer: I try my best to review each debate's SRT, which never takes me less than 45 minutes, even on easy ones. It's sometimes very challenging to understand, let alone correct, passages where multiple people are talking at once. Whisper does a good job overall. There were a couple of stretches I had to write completely from scratch.

Calendar

5/2

6/2

7/2

8/2

9/2

10/2

11/2

12/2

13/2

14/2

15/2

16/2

17/2

18/2

19/2

20/2

23/2

Process

simpler audio-only grab from podcast (currently used)

PODCAST PROCESS

wget "url" -O 1.mp3
ffmpeg -i 1.mp3 -map 0:a -c:a copy -map_metadata -1 2.mp3
ffmpeg -i 2.mp3 -ss 35 -vcodec copy -acodec copy 3.mp3

wget "url" -O 1.mp4
ffmpeg -i 1.mp4 -map 0:a -c:a copy -map_metadata -1 2.aac
ffmpeg -i 2.aac -ss 20 -codec:a libmp3lame -b:a 128k 3.mp3

video stream grab w/ VLC + FFMPEG to extract aac stream and convert to mp3 (initially used)

  • save m3u8 stream to file on VLC:

  • vlc open network

  • first m3u8...

  • stream output

  • settings

  • file ... asd.ts

  • MPEG TS

  • video to audio without transcoding: ffmpeg -i vlc-output.ts -vn -acodec copy audio.aac

  • aac to mp3: ffmpeg -i audio.aac -acodec mp3 audio.mp3

transcribe mp3 to srt

  • pinokio + whisper webui
  • large v3
  • portuguese
  • toggle off suffix checkbox
  • supply mp3 file and wait...
  • get output from app's output folder

WIP audio analysis

#set INFILE 2024-02-05_pan-chega.mp3
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 $INFILE
ffmpeg -i $INFILE -lavfi showspectrumpic=s=3622x512 spectrum.png
ffmpeg -i $INFILE -filter_complex "showwavespic=s=14488x512" -frames:v 1 waveform.png
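
The two image sizes above look related: 14488 is exactly 4 x 3622, which would fit a width of one pixel per second of audio for the spectrogram and four pixels per second for the waveform, with the duration coming from the ffprobe call. That reading is an assumption, not something these notes state. Under that assumption, a minimal sketch of deriving the ffmpeg arguments from the duration could look like this (the function names and output file names are hypothetical):

```javascript
// Assumption: image width = audio duration (s) * pixels per second.
function picSize(durationSeconds, pxPerSecond = 1, height = 512) {
  const width = Math.max(1, Math.round(durationSeconds * pxPerSecond));
  return `${width}x${height}`;
}

// Argument list for the spectrogram command (1 px per second, assumed).
function spectrumArgs(infile, durationSeconds, outfile) {
  const size = picSize(durationSeconds, 1);
  return ['-i', infile, '-lavfi', `showspectrumpic=s=${size}`, outfile];
}

// Argument list for the waveform command (4 px per second, assumed).
function waveformArgs(infile, durationSeconds, outfile) {
  const size = picSize(durationSeconds, 4);
  return [
    '-i', infile,
    '-filter_complex', `showwavespic=s=${size}`,
    '-frames:v', '1',
    outfile,
  ];
}
```

The returned arrays are meant to be passed to something like `child_process.spawn('ffmpeg', args)`, which avoids shell quoting issues with file names.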

navigation key bindings

  • space - toggle playback
  • up/down - move to previous/next subtitle
  • left/right - rewind/fast-forward by 15 seconds
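
The navigation bindings above boil down to two small calculations: a clamped 15-second seek, and a jump to the start of the neighbouring subtitle. The sketch below is illustrative only and not the site's actual code. It assumes cues are objects with a `start` time in seconds, sorted by time; all function names are hypothetical.

```javascript
const SEEK_STEP_SECONDS = 15;

// direction is -1 (left) or +1 (right); the result stays within [0, duration].
function seek(currentTime, direction, duration) {
  const target = currentTime + direction * SEEK_STEP_SECONDS;
  return Math.min(Math.max(target, 0), duration);
}

// Position of the last cue that has already started at `time`, or -1 if none.
function currentCuePosition(cues, time) {
  let found = -1;
  for (let i = 0; i < cues.length; i++) {
    if (cues[i].start <= time) found = i;
    else break;
  }
  return found;
}

// direction is -1 (up) or +1 (down); returns the start time to jump to.
function neighbourCueStart(cues, time, direction) {
  if (cues.length === 0) return time;
  const pos = currentCuePosition(cues, time) + direction;
  const clamped = Math.min(Math.max(pos, 0), cues.length - 1);
  return cues[clamped].start;
}
```

In a browser, these would typically be called from a `keydown` handler that then assigns the result to the audio element's `currentTime`.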

onboarding new debates and editing text and speaker tags

For each new debate (an mp3 file), we expect 2 additional files to be created:

  • a subtitles file (srt), which initially comes from running whisper over the mp3
  • a json file listing the speakers and which subtitle indices belong to each speaker

The index.json also needs to be updated to list the name of this new debate (used in the search features of the main page).
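
To make the two per-debate files concrete, here is a minimal sketch of reading an SRT file into cues and looking up who speaks a given subtitle. The SRT parsing follows the usual SubRip layout (index line, `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, text lines, blank line). The shape of the speakers object is a guess made for illustration; the real JSON layout in this repo may differ.

```javascript
// "HH:MM:SS,mmm" -> seconds
function parseTimestamp(ts) {
  const m = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(ts.trim());
  if (!m) throw new Error(`Bad SRT timestamp: ${ts}`);
  const [, h, min, s, ms] = m;
  return Number(h) * 3600 + Number(min) * 60 + Number(s) + Number(ms) / 1000;
}

// SRT text -> [{ index, start, end, text }]
function parseSrt(srt) {
  const normalized = srt.replace(/\r\n/g, '\n').trim();
  if (normalized === '') return [];
  return normalized.split(/\n{2,}/).map((block) => {
    const [indexLine, timeLine, ...textLines] = block.split('\n');
    const [start, end] = timeLine.split(' --> ').map(parseTimestamp);
    return { index: Number(indexLine), start, end, text: textLines.join('\n') };
  });
}

// Hypothetical layout: role name -> list of subtitle indices.
function speakerOf(speakersByRole, subtitleIndex) {
  for (const [role, indices] of Object.entries(speakersByRole)) {
    if (indices.includes(subtitleIndex)) return role;
  }
  return null; // no speaker assigned
}
```

For example, `speakerOf({ moderator: [1], debater1: [2] }, 2)` returns `'debater1'`.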

When the site is running locally for editing purposes, node server.mjs should also be running. It modifies the debate files on disk according to the operations performed in the front end.
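
Whatever the server does internally, writing an edited debate back to disk needs the cues turned into SRT text again. The following is a sketch of that serialization step only, not the contents of server.mjs. It assumes cues carry `start` and `end` in seconds plus a `text` field, and it renumbers subtitles from 1; the function names are hypothetical.

```javascript
// seconds -> "HH:MM:SS,mmm", using integer milliseconds to avoid rounding drift
function formatTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// [{ start, end, text }] -> SRT text, with indices renumbered from 1
function serializeSrt(cues) {
  return cues
    .map((cue, i) => {
      const times = `${formatTimestamp(cue.start)} --> ${formatTimestamp(cue.end)}`;
      return `${i + 1}\n${times}\n${cue.text}\n`;
    })
    .join('\n');
}
```

In Node, the resulting string could then be written with `fs.writeFile`.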

There's a set of key bindings for manipulating SRT and JSON files in tandem:

  • joins the current subtitle with either its previous or next one

  • splits the current subtitle by a ratio into 2 new ones

  • edits the current subtitle's text content

  • time tweaks the start and end placements for the current subtitle and its neighbors

  • x deletes the current subtitle

  • f fills the space between the previous subtitle and the current one with a new subtitle

  • 1 assigns the moderator role to the current subtitle (typically gray)

  • 2 assigns the 1st debater role to the current subtitle (typically cyan)

  • 3 assigns the 2nd debater role to the current subtitle (typically magenta)

  • § (before 1, on mac) clears any speaker role from the current subtitle
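
Joining and splitting change how many subtitles there are, so the speaker indices have to be shifted at the same time; this is why the SRT and JSON files are edited in tandem. The sketch below illustrates that bookkeeping. It is not the project's implementation, and it assumes 0-based positions into the cue array and a hypothetical role-to-indices layout for the speakers.

```javascript
// Split cues[pos] in two at `ratio` (0..1) of both its duration and its words.
function splitCue(cues, pos, ratio) {
  const cue = cues[pos];
  const cutTime = cue.start + (cue.end - cue.start) * ratio;
  const words = cue.text.split(/\s+/);
  const cutWord = Math.round(words.length * ratio);
  const first = { start: cue.start, end: cutTime, text: words.slice(0, cutWord).join(' ') };
  const second = { start: cutTime, end: cue.end, text: words.slice(cutWord).join(' ') };
  return [...cues.slice(0, pos), first, second, ...cues.slice(pos + 1)];
}

// Merge cues[pos] with the cue that follows it.
function joinWithNext(cues, pos) {
  const a = cues[pos];
  const b = cues[pos + 1];
  if (!b) return cues.slice(); // nothing to join with
  const merged = { start: a.start, end: b.end, text: `${a.text} ${b.text}` };
  return [...cues.slice(0, pos), merged, ...cues.slice(pos + 2)];
}

// After a split at pos: both halves keep the speaker, later indices move up by 1.
function speakersAfterSplit(speakersByRole, pos) {
  const out = {};
  for (const [role, indices] of Object.entries(speakersByRole)) {
    out[role] = indices.flatMap((i) => (i < pos ? [i] : i === pos ? [i, i + 1] : [i + 1]));
  }
  return out;
}

// After joining pos with pos + 1: the merged cue keeps the speaker of pos,
// and later indices move down by 1.
function speakersAfterJoin(speakersByRole, pos) {
  const out = {};
  for (const [role, indices] of Object.entries(speakersByRole)) {
    out[role] = indices.filter((i) => i !== pos + 1).map((i) => (i > pos + 1 ? i - 1 : i));
  }
  return out;
}
```

Returning new arrays and objects instead of mutating in place keeps each operation easy to undo, at the cost of some copying.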