POC: DO NOT MERGE (speech-to-text) - Whisper #71
This pull request contains a quick proof of concept showing how easy it is to add a new pipeline. In this example I added the speech-to-text pipeline using openai/whisper-large-v3. The beautiful thing about this pipeline is that it can be served cold, since model load times are under 3 s. It also requires only 6.5 GB of VRAM, so it can run on lower-VRAM cards. You can test it with audio samples from https://audio-samples.github.io/ by starting up a pipeline as described in the Runner documentation. You can then execute the pipeline from the docs page at https://localhost:8000/docs.
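For testing outside the docs page, a minimal client sketch might look like the following. It assumes the runner accepts a multipart file upload and replies with JSON; the route `/speech-to-text` and the form field name `audio` are hypothetical placeholders, not taken from this PR, so check the generated docs page for the real values.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, payload: bytes,
                    content_type: str = "audio/wav") -> tuple[bytes, str]:
    """Encode one file as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(base_url: str, audio_path: str,
               route: str = "/speech-to-text", field: str = "audio") -> dict:
    """POST an audio file to the runner and return the decoded JSON reply.

    `route` and `field` are assumptions for illustration only, as is the
    expectation that the response body is JSON.
    """
    with open(audio_path, "rb") as fh:
        body, header = build_multipart(
            field, os.path.basename(audio_path), fh.read()
        )
    request = urllib.request.Request(
        base_url.rstrip("/") + route,
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a runner started locally, calling `transcribe(base_url, "sample.wav")` with the host and port from the description above would upload the file. The sketch uses only the standard library so it carries no extra dependencies.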
Warning
NOT PRODUCTION READY. DO NOT MERGE.