
Vram Limit #135

Open
johncadengo opened this issue Mar 29, 2023 · 1 comment
Labels
type/feature Issue or PR related to a new feature

Comments

@johncadengo

Describe the feature you'd like to request

Hi, I'm running into an issue where I cannot load the large model. I have two GPUs, each with 8 GB of VRAM. Is it possible to split the work between the two GPUs?

If not, is it possible to disable the large model from the UX side and default to another size, like medium?

Describe the solution you'd like

No response
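
The fallback part of the request could also be handled with a VRAM check at request time rather than a UX toggle. Below is a minimal sketch (not part of WAAS; pick_model is a hypothetical helper) that picks the largest Whisper model fitting on the first GPU. It assumes PyTorch, which Whisper already depends on, and uses the approximate per-model VRAM figures listed in the Whisper README.

import torch

# Approximate VRAM needed per model size, in GiB (figures from the Whisper README).
VRAM_REQUIRED_GIB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}

def pick_model(requested: str = "large", device_index: int = 0) -> str:
    # Hypothetical helper: return the requested size if it fits, else fall back.
    if not torch.cuda.is_available():
        return requested  # CPU-only: leave the decision to Whisper
    total_gib = torch.cuda.get_device_properties(device_index).total_memory / 2**30
    if VRAM_REQUIRED_GIB.get(requested, 0) <= total_gib:
        return requested
    # Fall back to the largest size that fits on this GPU.
    for name in ("medium", "small", "base", "tiny"):
        if VRAM_REQUIRED_GIB[name] <= total_gib:
            return name
    return "tiny"

With an 8 GB card, pick_model("large") would fall back to "medium", since the large model needs roughly 10 GiB.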

@johncadengo added the type/feature label on Mar 29, 2023
@johncadengo (Author)

OK, I found something of a solution in the Whisper repo: openai/whisper#360 (comment)

I was able to alter transcriber.py to accommodate this (original file: https://github.com/schibsted/WAAS/blob/main/src/transcriber.py):

from typing import Any

import whisper
from sentry_sdk import set_user

def transcribe(filename: str, requestedModel: str, task: str, language: str, email: str, webhook_id: str) -> dict[str, Any]:
    # The email is not used here, but it makes it easier for the worker to log who requested the job
    print("Executing transcription of " + filename + " for " + (email or webhook_id) + " using the " + requestedModel + " model")
    set_user({"email": email})

    # Hack: https://github.com/openai/whisper/discussions/360#discussioncomment-4156445
    # Load the model onto the CPU first, then split it across the two GPUs:
    # the encoder goes to the first card and the decoder to the second.
    model = whisper.load_model(requestedModel, device="cpu")

    model.encoder.to("cuda:0")
    model.decoder.to("cuda:1")

    # Move the decoder's inputs to cuda:1 before each forward pass, and move its
    # output back to cuda:0 so the rest of the pipeline stays on the first GPU.
    model.decoder.register_forward_pre_hook(lambda _, inputs: tuple([inputs[0].to("cuda:1"), inputs[1].to("cuda:1")] + list(inputs[2:])))
    model.decoder.register_forward_hook(lambda _, inputs, outputs: outputs.to("cuda:0"))

    return model.transcribe(filename, language=language, task=task)

And I just had to make sure that Docker exposed both GPUs to the container.
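
For anyone reproducing this, a quick sanity check run inside the container confirms that Docker actually passed both GPUs through (for plain docker run that usually means something like --gpus all). This is just a sketch; it assumes PyTorch is importable in the container, which it is as a Whisper dependency.

import torch

# The encoder/decoder split above needs two visible CUDA devices.
assert torch.cuda.is_available(), "CUDA is not visible inside the container"
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"cuda:{i}: {props.name}, {props.total_memory / 2**30:.1f} GiB")
assert torch.cuda.device_count() >= 2, "expected two GPUs for the cuda:0 / cuda:1 split"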

It might be interesting to have this officially supported, but I'm offering my solution here for anyone who's interested.
