
Whisper encode roughly 4x slower than openai/pytorch #1699

Open
whispy-woods opened this issue May 15, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@whispy-woods

Admittedly, the encoding time is almost a non-issue; only when working on very small audio chunks could it account for a meaningful percentage of total runtime.

I just wanted to mention it in case it is flying under the radar and there is a quick fix. For example, on an RTX 4080 (both Linux and Windows), the encode takes around 0.08 s in CTranslate2 and 0.02 s with the OpenAI reference implementation; the same 4x difference shows up on two other systems with an RTX 4090 and an RTX 4060. Thanks for all the work!
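No benchmark script was posted with these numbers, so here is a minimal, hedged sketch of how such encode timings could be collected fairly: warmup iterations to amortize one-time costs (CUDA context creation, kernel compilation) and averaging over many runs. The helper is self-contained; the `ct2_model` / `features` names in the usage comment are placeholders, not part of any confirmed setup:

```python
import time

def avg_time(fn, *args, warmup=3, iters=20):
    """Average wall-clock time of fn(*args) over `iters` runs, after `warmup` runs."""
    for _ in range(warmup):
        fn(*args)  # warmup: amortize CUDA context init / kernel compilation
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)  # for GPU work, a device synchronize before reading the clock
                   # is needed for accurate numbers (e.g. torch.cuda.synchronize())
    return (time.perf_counter() - start) / iters

# Hypothetical usage -- `ct2_model` would be a loaded ctranslate2.models.Whisper
# and `features` its mel-spectrogram input; neither is defined here:
#   t_ct2 = avg_time(ct2_model.encode, features)
```

Without synchronization, asynchronous GPU launches can make the measured time look much shorter than the actual encode, so any CTranslate2-vs-PyTorch comparison should synchronize both implementations the same way.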

@minhthuc2502 minhthuc2502 added the enhancement New feature or request label May 20, 2024
@BBC-Esq

BBC-Esq commented May 23, 2024

Do you have some of the code that you could share so we can see what might be going on?
