Windows: Calling `SpeechSynthesizer.StopSpeakingAsync()` does not stop synthesis #2350

bpasero · 2024-04-24T07:13:38Z

Describe the bug

A call to SpeechSynthesizer.StopSpeakingAsync() does not stop synthesis for a very long time, up to 30 seconds. The log file is here: speech.log

This issue was previously reported without action at #1836 and #2264

To Reproduce

We are building a Node.js binding for the Speech SDK, and the C++ sources mimic the samples. The synthesis is implemented here: https://github.com/microsoft/node-speech/blob/967976ce0f4887a2b5b27f486e5209a51588516f/src/main.cc#L477

The call to StopSpeakingAsync is here: https://github.com/microsoft/node-speech/blob/967976ce0f4887a2b5b27f486e5209a51588516f/src/main.cc#L539

To reproduce from that module:

1. Use Node.js 18.x on the system.
2. Run git clone https://github.com/microsoft/node-speech.git
3. Open index.ts and append the snippet [1] at the end.
4. From a terminal, cd into the workspace and run npm i.
5. Run node index.js.

[1]

const t = createSynthesizer({
  modelPath: '<path to TTS model>',
  modelName: 'Microsoft Server Speech Text to Speech Voice (en-US, AriaNeural)',
  modelKey: '<model key>',
}, (error, result) => {
  if (error) {
    console.error(error);
  } else {
    console.log(result);
  }
});
t.synthesize(`
Now more than ever, developers are expected to build voice-enabled applications that can reach a global audience. With the same voice persona across languages, organizations can keep their brand image more consistent. To support the growing need for a single voice to speak multiple languages, particularly in scenarios such as localization and translation, a multi-lingual neural TTS voice is brought out in public preview.



This new Jenny Multilingual voice (preview), with US English as the primary/default language, can speak 13 secondary languages, each at the fluent level: German (Germany), English (Australia), English (Canada), English (Canada), Spanish (Spain), Spanish (Mexico), French (Canada), French (France), Italian (Italy), Japanese (Japan), Korean (Korea), Portuguese (Brazil), Chinese (Mandarin, Simplified).
`);
setTimeout(() => t.stop(), 5000);

Expected behavior

Calling SpeechSynthesizer.StopSpeakingAsync immediately stops synthesis.

Version of the Cognitive Services Speech SDK

1.37.0

Platform, Operating System, and Programming Language

OS: Windows 11 (24H2)
Hardware: ARM
Programming language: C++

Additional context

This issue does not reproduce on macOS or Linux!

ralph-msft · 2024-04-26T17:39:19Z

Thanks for using the Speech SDK and filing this issue. We have been able to reproduce the issue you are seeing and have added a fix to our backlog. We will post here once we have an update.

As a temporary workaround, consider passing a null value as the AudioConfig to the SpeechSynthesizer constructor. You can then subscribe to the Synthesizing event, which is raised whenever the SDK receives new audio from the service, and pass that audio to your player of choice. This should give you more control over when audio playback stops. Please note, however, that calling StopSpeakingAsync may still stall for ~10-15 seconds due to the underlying issue.
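
For readers trying this workaround, here is a minimal sketch of the player side only, under stated assumptions. StoppablePlayer, AudioSink, enqueue, pump and stop are hypothetical names made up for illustration, not Speech SDK APIs. The idea is that the handler subscribed to the Synthesizing event would call enqueue with each audio chunk, and stop would silence playback at once even if StopSpeakingAsync itself is still stalling.

```typescript
// Hypothetical helper: none of these names come from the Speech SDK.
// An AudioSink is whatever actually plays audio in your application.
type AudioSink = (chunk: Uint8Array) => void;

class StoppablePlayer {
  private queue: Uint8Array[] = [];
  private stopped = false;
  private sink: AudioSink;

  constructor(sink: AudioSink) {
    this.sink = sink;
  }

  // Intended to be called from the Synthesizing event handler with each new chunk.
  enqueue(chunk: Uint8Array): void {
    if (this.stopped) return; // chunks arriving after stop() are dropped
    this.queue.push(chunk);
  }

  // Intended to be called on a timer or from the audio device's "need data" callback.
  // Returns true if a chunk was handed to the sink.
  pump(): boolean {
    if (this.stopped) return false;
    const next = this.queue.shift();
    if (next === undefined) return false;
    this.sink(next);
    return true;
  }

  // Stops playback immediately: discards buffered audio and ignores later chunks.
  stop(): void {
    this.stopped = true;
    this.queue.length = 0;
  }

  get pending(): number {
    return this.queue.length;
  }
}

// Usage with a fake sink that only records chunk sizes.
const played: number[] = [];
const player = new StoppablePlayer((chunk) => played.push(chunk.length));
player.enqueue(new Uint8Array(4));
player.enqueue(new Uint8Array(8));
player.pump();                      // hands the first chunk to the sink
player.stop();                      // drops the second chunk
player.enqueue(new Uint8Array(16)); // ignored because the player is stopped
player.pump();                      // no-op
console.log(played);
```

Because the stop decision lives entirely in the player, it does not depend on how quickly the SDK stops delivering audio. Any audio the service keeps sending after stop is simply discarded.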

(B-7172399)

bpasero · 2024-04-26T18:12:24Z

Thanks, good to see it can be reproduced and I am looking forward to the fix 👍

github-actions · 2024-05-16T02:11:37Z

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.

wtto00 · 2024-05-16T04:12:32Z

Hello, I am using version 1.37.0, and I have encountered a similar issue.

stopSpeaking does not immediately terminate the playback process; it only stops the speaker from playing.

For example, if I generate 14 seconds of audio and call stopSpeaking at 10 seconds, then let speakResult = synthesizer?.speakSsml(ssml) returns immediately with speakResult?.reason = 9 (SPXResultReason_SynthesizingAudioCompleted) instead of 1 (SPXResultReason_Canceled). Moreover, the callback registered with synthesizer?.addSynthesisCompletedEventHandler is triggered after a 4-second wait, rather than the callback registered with synthesizer?.addSynthesisCanceledEventHandler.

let ssml =
          "<speak version='1.0' xml:lang='en-US' xmlns='http://www.w3.org/2001/10/synthesis' xmlns:mstts='http://www.w3.org/2001/mstts'><voice name='\(identifier)'>\(mstts)</voice></speak>"
let speakResult = try self.synthesizer?.speakSsml(ssml)
print(speakResult?.reason ?? "")

try synthesizer?.stopSpeaking()

Here is a demo repository: https://github.com/wtto00/flutter_azure_speech/tree/main/example

The Swift code is at https://github.com/wtto00/flutter_azure_speech/blob/eb419b89fcc16903cabaa8f9820559d93ed80861/ios/Classes/AzureSpeechPlugin.swift#L294

ralph-msft added the "bug" and "accepted" labels on Apr 26, 2024

bpasero mentioned this issue on May 7, 2024: macOS: calling SpeechSynthesizer.StopSpeakingAsync() and then StartSpeakingTextAsync() does not work immediately #2367 (Closed)

wtto00 mentioned this issue on May 9, 2024: SPXSpeechSynthesizer stopSpeaking() method cannot return immediately on iOS 17 #2081 (Closed)

github-actions bot added the "update needed" label on May 16, 2024

github-actions bot removed the "update needed" label on May 17, 2024
