Generating subtitles using whisper on youtube and immediately using them #397
Comments
I'm doing something similar. I made this Python script that monitors my clipboard for YouTube links, then downloads the video's audio, converts it, and transcribes it with the ReazonSpeech model.
Script
from pathlib import Path
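def sexagesimal(secs):
    # Format seconds as an HH:MM:SS.mmm WebVTT timestamp.
    # Working in whole milliseconds avoids output like "00:00:60.000" when rounding up.
    ms = int(round(float(secs) * 1000))
    ss, ms = divmod(ms, 1000)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f'{hh:02d}:{mm:02d}:{ss:02d}.{ms:03d}'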
import io
import subprocess
import numpy as np
def convert_webm_to_mp3(webm_buffer):
    # Despite the name, this decodes the input to raw 16 kHz mono PCM, not MP3
    # Run ffmpeg as a subprocess
    process = subprocess.Popen([
        'ffmpeg',
        '-i', '-',  # Input from stdin
        '-vn',  # Disable video
        '-ac', '1', '-ar', '16k',  # Mono, 16 kHz sample rate
        '-acodec', 'pcm_s16le',  # Signed 16-bit little-endian PCM
        '-f', 's16le',  # Raw (headerless) PCM output format
        '-'],  # Output to stdout
        stdin=subprocess.PIPE,  # Feed the input buffer through stdin
        stdout=subprocess.PIPE,  # Capture stdout
        stderr=subprocess.PIPE  # Capture stderr
    )
    # Write the webm buffer to ffmpeg's stdin
    stdout, stderr = process.communicate(input=webm_buffer.read())
    # Check for errors
    if process.returncode != 0:
        raise RuntimeError(f'ffmpeg error: {stderr.decode()}')
    # Scale the int16 samples to float32 in [-1, 1) and wrap them for ReazonSpeech
    result = np.frombuffer(stdout, np.int16).astype(np.float32) / 32768.0
    return audio_from_numpy(result, 16000)
from reazonspeech.nemo.asr import load_model, transcribe, audio_from_path
import torch
print(f'Has Cuda? {torch.cuda.is_available()}')
print("Loading Model...")
model = load_model()
print("Finished")
import gc
gc.collect()
import time
from datetime import datetime
from pytube import YouTube
from reazonspeech.nemo.asr import audio_from_numpy
import pyperclip
import re
YOUTUBE_REGEX = r'(https?://)?(www\.)?(youtube\.com|youtu\.?be)/.+$'
def is_youtube_link(text):
    return re.match(YOUTUBE_REGEX, text) is not None
#Ignore whatever is currently in the clipboard when the program starts
video_url = pyperclip.paste()
while True:
    # video_url = input("Youtube link: ")
    print("Waiting for youtube url in clipboard...")
    while True:
        clipboard = pyperclip.paste()
        if clipboard != video_url:
            # Remember the new clipboard content, then check whether it is a YouTube link
            video_url = clipboard
            if is_youtube_link(video_url):
                subprocess.run(['notify-send', '-u', 'normal', '-a', "AI Youtube Subtitles", "-t", "10000", f"Starting processing on {video_url}"])
                break
        # Poll every 100ms
        time.sleep(0.1)
    startTime = time.monotonic()
    buffer = io.BytesIO()
    print("Downloading audio..")
    yt = YouTube(video_url)
    # Pick the audio-only stream with the highest bitrate
    yt.streams.filter(only_audio=True).order_by('abr')[-1].stream_to_buffer(buffer)
    print("Done")
    buffer.seek(0)
    print("Converting..")
    audio = convert_webm_to_mp3(buffer)
    print("Done")
    s = time.monotonic()
    transcription = transcribe(model, audio)
    e = time.monotonic()
    print('Finished transcription took {:0.2f}s'.format(e - s))
    # Build the WebVTT file: header, then one cue per transcribed segment
    r = 'WEBVTT\n\n' + '\n\n'.join([f"{sexagesimal(seg.start_seconds)} --> {sexagesimal(seg.end_seconds)}\n{seg.text}" for seg in transcription.segments])
    # print(r)
    with Path('out.vtt').open("w", encoding="utf8") as o:
        o.write(r)
    endTime = time.monotonic()
    print("Done: out.vtt")
    timeStr = '{:0.2f}s'.format(endTime - startTime)
    # Send a desktop notification (notify-send only works on Linux)
    subprocess.run(['notify-send', '-u', 'normal', '-a', "AI Youtube Subtitles", "-t", "20000", f"Finished {video_url} in {timeStr}"])
    gc.collect()
It outputs the file to out.vtt. I wonder, is there a way to easily send the subtitle file directly to asbplayer, so I don't have to drag it in manually?
Loading subtitles could be an addition to asbplayer's WebSocket interface.
That would be awesome if you added that.
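To make the WebSocket idea above a bit more concrete, here is a rough sketch of how a script could package a subtitle file into a single JSON message. The command name `load-subtitles`, the `messageId`/`body`/`files` fields, and the helper `build_load_subtitles_message` are all made up for illustration; this is not asbplayer's actual protocol, only a guess at what such a message might carry.

```python
import base64
import json
import uuid


def build_load_subtitles_message(name: str, content: bytes) -> str:
    """Build a JSON message carrying one subtitle file (hypothetical schema)."""
    return json.dumps({
        "command": "load-subtitles",     # hypothetical command name
        "messageId": str(uuid.uuid4()),  # lets a reply be matched to this request
        "body": {
            "files": [{
                "name": name,
                # Base64 keeps arbitrary file bytes safe inside a JSON string
                "base64": base64.b64encode(content).decode("ascii"),
            }],
        },
    })
```

The script above could then call something like `build_load_subtitles_message("out.vtt", Path("out.vtt").read_bytes())` and send the result over whatever socket the player listens on, instead of dragging the file in by hand.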
Is your feature request related to a problem? Please describe.
YouTube's auto-generated subs are really bad. The Whisper subs are more accurate. Right now I'm downloading the video, converting it to audio, running Whisper to produce an SRT file, and loading it into asbplayer.
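For reference, the "produce an SRT file" step in the workflow above could be sketched roughly like this. It assumes the transcription comes back as a list of segments, each a dict with `start`, `end` (in seconds) and `text`; the helper names `srt_timestamp` and `segments_to_srt` are illustrative and not part of any library.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    # Work in whole milliseconds so rounding can never yield 60 seconds
    total_ms = int(round(seconds * 1000))
    secs, ms = divmod(total_ms, 1000)
    mins, secs = divmod(secs, 60)
    hours, mins = divmod(mins, 60)
    return f"{hours:02d}:{mins:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT text."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        # Each SRT cue is: running number, time range, then the text
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}"
        )
    return "\n\n".join(cues) + "\n"
```

With the `openai-whisper` package, the list returned under `whisper.load_model("base").transcribe("audio.mp3")["segments"]` should have this shape, so its output could be passed straight to `segments_to_srt` and written to a `.srt` file.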
Describe the solution you'd like
The whole process of downloading, conversion, transcription, and loading the SRT file could be managed by asbplayer. The Whisper subs would then be available in one click.