OpenAI tts with word boundary event result #2303
Comments
The models behind the OpenAI TTS voices do not provide word-level timing information, so it is currently not possible to get word boundary events when using those voices. The OpenAI voices also support only a limited set of SSML tags. You can find more information about those limitations here: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/openai-voices#ssml-elements-supported-by-openai-text-to-speech-voices-in-azure-ai-speech I will work with our docs team to document the other limitations as well, such as the missing word-level timing data.
Word boundary events are not yet supported for AOAI voices. We should update the docs. @Kerry-LinZhang, could you help track the doc refresh?
@ling-k, for future planning: in which data center region do you want word-level timing support for the OpenAI voices, and which other regions are most important to you? https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions
Thanks for replying. I mainly work in the West US and West US 2 regions.
@yulin-li @Kerry-LinZhang So I guess this needs a documentation update at least, and possibly a future work item as well? Please update the status and close when done.
Hi @ling-k, we will update the related docs this week, and I will let you know once the update has been released.
Hi @ling-k, we have updated our docs. Please find the update at: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=tts#multilingual-voices
This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.
Closed as resolved.
Describe the solution you'd like
The OpenAI TTS voices, such as en-US-AlloyMultilingualNeuralHD and en-US-EchoMultilingualNeuralHD, do not return word boundary event results. I would like these voices to return them.
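For anyone who wants to check this behavior, here is a minimal sketch in Python, assuming the `azure-cognitiveservices-speech` package and a valid key and region. `WordBoundaryCollector` and `synthesize_and_collect` are hypothetical helper names for illustration, not part of the SDK. Going by the maintainers' replies in this thread, the returned list should stay empty for the OpenAI voices, while other neural voices normally fill it.

```python
TICKS_PER_MS = 10_000  # the SDK reports audio_offset in 100-nanosecond ticks


class WordBoundaryCollector:
    """Hypothetical helper: records word boundary events as they arrive."""

    def __init__(self):
        self.events = []

    def on_word_boundary(self, evt):
        # evt is expected to expose .text and .audio_offset, as the SDK's
        # SpeechSynthesisWordBoundaryEventArgs does.
        self.events.append({
            "text": evt.text,
            "offset_ms": evt.audio_offset / TICKS_PER_MS,
        })


def synthesize_and_collect(text, voice_name, key, region):
    """Synthesize `text` with `voice_name` and return the collected events."""
    # Imported lazily so the collector can be exercised without the SDK.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_synthesis_voice_name = voice_name
    # audio_config=None keeps the audio in memory instead of playing it.
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=None
    )

    collector = WordBoundaryCollector()
    synthesizer.synthesis_word_boundary.connect(collector.on_word_boundary)
    synthesizer.speak_text_async(text).get()  # block until synthesis finishes
    return collector.events
```

Calling `synthesize_and_collect("Hello world", "en-US-AlloyMultilingualNeuralHD", key, region)` and comparing the result against a non-OpenAI voice is one way to confirm the difference.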
Additional context