Hardware requirement for inference server #55
-
I've tried out the local speech recognition by enabling Multinet and controlling my Home Assistant entities. It's neat, but I doubt I'd use the on-device Multinet speech recognition; it seems unlikely to improve faster than Whisper et al., which is already pretty great. What sort of hardware is needed to run the inference server with Whisper locally once you release it? I have a GTX 1080 8GB in a Windows machine, state of the art ~8 years ago. Any good?
-
Hello again! First, I want to thank you for your interest in Willow, you've been very helpful! We will be releasing our highly optimized Willow Inference Server next week. Here are some basic benchmarks:
We use "medium 1" by default (very high quality, as you've experienced), and as you can see here a GTX 1060 can transcribe 3.8s of speech in 588ms. The realtime multiple climbs significantly with longer segments. You will be more than fine with your GTX 1080. Another thing to know: we've optimized VRAM usage heavily (thanks to CTranslate2), so all three models we support (base, medium, and large) loaded simultaneously occupy less than 3 GB of VRAM, meaning your GTX 1080 will still be usable for other tasks when not handling Willow requests. We load all three models because we support Willow selecting the model and other parameters via URI parameters in the request. One caveat for you, though: you will need to use WSL, as the Willow Inference Server is Linux-only.
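For a rough sense of what those benchmark numbers mean, the realtime multiple quoted above works out like this (the 3.8s / 588ms figures come from the reply; everything else is just arithmetic):

```python
# Realtime multiple from the GTX 1060 benchmark quoted above:
# 3.8 s of speech transcribed in 588 ms with the default "medium" model.
speech_seconds = 3.8
inference_seconds = 0.588

realtime_multiple = speech_seconds / inference_seconds
print(f"~{realtime_multiple:.1f}x realtime")  # ~6.5x
```

A newer card like the GTX 1080 should land comfortably above that, and as noted, the multiple grows further with longer speech segments.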
-
Fab, thanks. I will stop browsing expensive graphics cards, then. The Windows machine is for occasional games, but now I have a reason to try out WSL. I'm leaving this open, or it hides from the default Discussions view; feel free to close it if you want.
-
This is all great stuff, and unless things get ugly we'll be leaving discussions open. Thanks again!
-
@kristiankielhofner Would it be possible to run WIS on a Jetson Nano? I know it would be relatively slow compared to the listed GPUs, but would it be faster than CPU-only? I have one sitting idle, and it would be nice to have a small, self-contained device to run WIS on if possible.
-
My understanding is that a Google Coral can be used for voice. Any chance it could be supported to give a low-cost, low-power option that takes up less space?
-
Will something like the Nvidia Tesla K80 work with the Willow Inference Server?
-
Would the GTX 1050 Ti work? Pascal architecture with 4GB GDDR5, but its biggest pro is that it doesn't need external power (if I'm not mistaken). Perfect for a small home server that I might not have enough spare power for.