Whisper support #180

gottlike · 2023-06-21T07:06:07Z

Is support for Whisper on the roadmap? Something like https://github.com/ggerganov/whisper.cpp would be great.

zhuohan123 · 2023-06-21T14:36:36Z

Supporting encoder-decoder models is on our roadmap, as mentioned in #187. Feel free to join the discussion and potentially contribute!

libratiger · 2023-09-14T09:22:29Z

+1 for this feature

silvacarl2 · 2023-09-23T14:52:44Z

+2 for this feature

xtqxk · 2023-10-24T03:19:46Z

+3 for this feature

arun2728 · 2023-12-01T04:25:17Z

+4 for this feature

SinanAkkoyun · 2023-12-15T08:39:52Z

+555

Swiffers · 2024-01-02T18:41:45Z

+1

hahazei · 2024-02-26T09:43:01Z

+1

binarycrayon · 2024-02-26T21:37:07Z

monitoring

afeldman-nm · 2024-02-28T20:16:05Z

@zhuohan123 I am working on Whisper support.

silvacarl2 · 2024-02-28T20:20:11Z

NO WAY!!!!!!!!!!!!!!!!!!! THAT WILL BE AWESOME!!!!!!!!!!!!!!!!!!!!!

libratiger · 2024-03-04T02:34:14Z

I am working on this PR, and will soon submit the draft.

silvacarl2 · 2024-03-04T16:33:01Z

THIS IS GOING TO BE HUGE, THX!

dbogunowicz · 2024-03-12T15:44:29Z

Hey @libratiger, together with @afeldman-nm I am now working full-time on the same target. Would you like to sync? It would be more efficient to share knowledge, rather than develop the same thing in two silos.

libratiger · 2024-03-13T02:32:21Z

You're right. I've just discovered a discussion about T5 in #187 (comment), where there are differing opinions on the encoder-decoder model. Perhaps it will improve after that PR is merged?

dbogunowicz · 2024-03-13T12:27:38Z

@libratiger the current status is as follows: Neural Magic has finalized the original T5 PR, and we are now benchmarking the solution. In parallel, we are also developing support for Whisper.

JackZeng · 2024-03-28T08:47:32Z

@dbogunowicz any update on this issue? looking forward

dbogunowicz · 2024-03-28T13:02:04Z

Hi! I am working on Whisper in our team fork: neuralmagic#147
The status is: inference runs (both prompt prefill and autoregressive inference), but I get correctness issues, most likely caused by an erroneous attention mask implementation.

junior-zsy · 2024-04-02T10:58:03Z

@dbogunowicz I ran the Whisper model on the feature/demian/Whisper branch and hit an error: vllm/worker/model_runner.py, line 477, in prepare_decode
multi_modal_input)
NameError: name 'multi_modal_input' is not defined. Code execution cannot start.

dbogunowicz · 2024-04-02T12:22:41Z

@junior-zsy fixed for now. Please remember that we are still working on that PR, so it is very much in a WIP state. Let me explicitly set the appropriate PR flag.

junior-zsy · 2024-04-03T02:14:14Z

@dbogunowicz Ok, thank you. Hope it can be used soon

silvacarl2 · 2024-04-03T13:57:19Z

same here, this is going to be really cool!

afeldman-nm · 2024-04-03T14:11:53Z

@dbogunowicz thanks for your work on Whisper! Since there is clearly interest in this feature and its completion timeline, I want to add the context that Whisper support takes a dependency on encoder/decoder support -

Issue: #187
PR: #3117

which is also WIP (it currently works partially but is not quite complete). I expect to complete encoder/decoder support soon. JFYI for anyone interested in timelines.

dwoodworth90 · 2024-04-26T08:04:34Z

+1

afeldman-nm · 2024-04-30T13:55:55Z

See the encoder/decoder support issue (#187) and new PR (#4289) for a status update on encoder/decoder support, which is a prereq for Whisper support.

twicer-is-coder · 2024-05-21T09:16:12Z

Hi, any update on serving faster-whisper via VLLM?

afeldman-nm · 2024-05-23T17:26:52Z

> Hi, any update on serving faster-whisper via VLLM?

Hi @twicer-is-coder ,

Whisper (or any variant thereof) is high on the list of models to add once infrastructure support is in; you can see the roadmap for infrastructure support in this PR:

#4942

WoosukKwon added the new model Requests to new models label Jun 21, 2023

zhuohan123 mentioned this issue Jun 25, 2023

[Deprecated] vLLM Development Roadmap #244

Closed

76 tasks

viktor-ferenczi mentioned this issue Sep 23, 2023

support whisper? #1152

Closed

afeldman-nm mentioned this issue Feb 28, 2024

Adding support for encoder-decoder models, like T5 or BART #187

Open

This was referenced Apr 2, 2024

[WIP] Upstream encoder/decoder support based on multiple blocktables neuralmagic/nm-vllm#161

Draft

[WIP] Upstream encoder/decoder support based on multiple blocktables afeldman-nm/vllm#3

Open

afeldman-nm mentioned this issue May 21, 2024

[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) #4837

Merged
