OpenAI tts with word boundary event result #2303

ling-k · 2024-03-15T21:46:34Z

Describe the solution you'd like
The OpenAI TTS voices, such as en-US-AlloyMultilingualNeuralHD and en-US-EchoMultilingualNeuralHD, do not raise word boundary events.
Additional context

word_boundaries = {}  # maps audio offset in ms (as a string) to the spoken word

def word_boundary_handler(evt):
    offset_ms = evt.audio_offset / 10000  # audio_offset is in 100-nanosecond ticks
    print(f"My Word boundary event received: {evt.text}, audio offset in ms: {offset_ms}ms")
    word_boundaries[str(offset_ms)] = evt.text

For Azure voices such as "en-US-EmmaNeural" and "zh-CN-XiaoxiaoNeural", this works fine. For OpenAI voices, the handler is never called.
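For readers who want to try the handler without a Speech resource, here is a minimal sketch of the same idea. It assumes only what the snippet above shows: each event carries `text` and `audio_offset`, with the offset in 100-nanosecond ticks. The `SimpleNamespace` objects are stand-ins for real SDK events, and `ticks_to_ms` is a hypothetical helper name. With the Speech SDK, the handler would typically be attached with `synthesizer.synthesis_word_boundary.connect(word_boundary_handler)` before starting synthesis.

```python
from types import SimpleNamespace

TICKS_PER_MS = 10_000  # audio_offset is reported in 100-nanosecond ticks


def ticks_to_ms(ticks):
    """Convert an audio offset in 100 ns ticks to milliseconds (hypothetical helper)."""
    return ticks / TICKS_PER_MS


# A list keeps events in arrival order and tolerates repeated offsets.
word_boundaries = []


def word_boundary_handler(evt):
    """Record (offset in ms, word) for each word boundary event."""
    word_boundaries.append((ticks_to_ms(evt.audio_offset), evt.text))


# Stand-in events for illustration only; real events come from the SDK.
for offset, text in [(500_000, "Hello"), (4_250_000, "world")]:
    word_boundary_handler(SimpleNamespace(audio_offset=offset, text=text))

for offset_ms, text in word_boundaries:
    print(f"{offset_ms:8.1f} ms  {text}")
```

The list replaces the original dictionary keyed by a stringified offset, so that two events with the same offset do not overwrite each other.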

BrianMouncer · 2024-03-20T19:42:03Z

The models used for the OpenAI TTS voices do not provide word-level timing information, so it is currently not possible to get those events when using those voices. There are also a few limitations on which SSML elements the OpenAI voices support. You can find more information about those limitations here: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/openai-voices#ssml-elements-supported-by-openai-text-to-speech-voices-in-azure-ai-speech

I will work with our docs team to also document the other limitations, such as the missing word-level timing data.
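Until the limitation is documented, calling code may want to avoid waiting for events that will never arrive. The following is a rough sketch only: it assumes that OpenAI voices can be recognized by the `MultilingualNeuralHD` suffix, which is inferred solely from the two voice names mentioned in this thread and is not an official rule. The function name `may_emit_word_boundaries` is hypothetical.

```python
# Assumption: suffix inferred from the voice names quoted in this issue
# (en-US-AlloyMultilingualNeuralHD, en-US-EchoMultilingualNeuralHD).
OPENAI_VOICE_SUFFIX = "MultilingualNeuralHD"


def may_emit_word_boundaries(voice_name):
    """Heuristic: return False for voices that look like OpenAI TTS voices."""
    return not voice_name.endswith(OPENAI_VOICE_SUFFIX)


for voice in (
    "en-US-EmmaNeural",
    "zh-CN-XiaoxiaoNeural",
    "en-US-AlloyMultilingualNeuralHD",
):
    status = "expected" if may_emit_word_boundaries(voice) else "not expected"
    print(f"{voice}: word boundary events {status}")
```

A name-based check is fragile, so treating an empty set of collected events after synthesis as "timings unavailable" is a safer fallback than relying on the suffix alone.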

yulin-li · 2024-03-22T09:19:00Z

Word boundary events are not yet supported for AOAI (Azure OpenAI) voices.

We should update the doc. @Kerry-LinZhang, could you help track the doc refresh?

BrianMouncer · 2024-03-29T20:18:52Z

@ling-k, for future planning: which data center region are you using where you want word-level timing support from OpenAI voices, and which other regions are most important to you?

https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions

ling-k · 2024-03-29T20:32:56Z

> @ling-k. For future planning, what data center region are you using, that you want word level timing support from OpenAI voices, and or what other regions are most important to you.
>
> https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions

Thanks for replying. I mainly work in the West US or West US 2 regions.

pankopon · 2024-04-12T00:18:13Z

@yulin-li @Kerry-LinZhang So I guess this needs a documentation update at least, possibly also creation of a future work item? Please update status and close when done.

Kerry-LinZhang · 2024-04-23T08:52:08Z

Hi @ling-k, we will update the related documentation this week, and I will let you know once it has been released.

Kerry-LinZhang · 2024-04-30T05:37:55Z

Hi @ling-k, we have updated our documentation. You can find it at: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=tts#multilingual-voices

github-actions · 2024-05-20T02:12:36Z

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.

pankopon · 2024-05-21T21:12:14Z

Closed as resolved.

ling-k changed the title ~~OpenAI tts with with word boundary event result~~ OpenAI tts with word boundary event result Mar 15, 2024

BrianMouncer added the "pending close" (Closed soon without new activity) label Mar 20, 2024

pankopon assigned yulin-li Apr 12, 2024

pankopon added the "enhancement" (New feature or request), "in-review" (In review), and "text-to-speech" (Text-to-Speech) labels and removed the "pending close" (Closed soon without new activity) label Apr 12, 2024

github-actions bot added the "update needed" (For items that are in progress but have not been updated) label May 20, 2024

pankopon closed this as completed May 21, 2024
