
Hardware requirement for inference server #55

Answered by kristiankielhofner
RJ asked this question in Q&A

Hello again! First, I want to thank you for your interest in Willow; you've been very helpful!

We will be releasing our highly optimized Willow Inference Server next week. Here are some basic benchmarks:

| Device   | Model    | Beam Size | Speech Duration (ms) | Inference Time (ms) | Realtime Multiple |
|----------|----------|-----------|----------------------|---------------------|-------------------|
| RTX 4090 | large-v2 | 5         | 3840                 | 140                 | 27x               |
| H100     | large-v2 | 5         | 3840                 | 294                 | 12x               |
| H100     | large-v2 | 5         | 10688                | 519                 | 20x               |
| H100     | large-v2 | 5         | 29248                | 1223                | 23x               |
| GTX 1060 | large-v2 | 5         | 3840                 | 1114                | 3x                |
| Tesla P4 | large-v2 | 5         | 3840                 | 1099                | 3x                |
| RTX 4090 | medium   | 1         | 3840                 | 84                  | 45x               |
| GTX 1060 | medium   | 1         | 3840                 | 588                 | 6x                |
| Tesla P4 | medium   | 1         | 3840                 | 586                 | 6x                |
| RTX 4090 | medium   | 1         | 29248                | 377                 | 77x               |
| GTX 1060 | medium   | 1         | 29248                | 1612                | 18x               |
| Tesla P4 | medium   | 1         | 29248                | 1730                | 16x               |
RTX…
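As a reading aid (my own illustration, not part of the original answer): the "Realtime Multiple" column in the table above is simply the speech duration divided by the inference time, truncated to a whole number. A minimal sketch:

```python
# Sketch: derive the "Realtime Multiple" column from the two duration
# columns in the benchmark table. Assumption: values are truncated
# (floored) to whole multiples, as the table suggests.

def realtime_multiple(speech_ms: float, inference_ms: float) -> int:
    """How many times faster than realtime the transcription ran."""
    return int(speech_ms / inference_ms)

# Checking against the RTX 4090 / large-v2 row:
# 3840 ms of speech transcribed in 140 ms.
print(realtime_multiple(3840, 140))  # 27
```

A multiple above 1x means the server transcribes audio faster than it is spoken; higher is better.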

Replies: 7 comments 8 replies

Answer selected by RJ
Category
Q&A
7 participants