Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pause and resume capability to TTS playback thread #2940

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

krisgesling
Copy link
Contributor

Description

This adds pause and resume functionality to the TTS playback thread for use in barge in.

From Ken:

Barge-In is not an issue for Precise, however, core has some issues. When a wave file is speaking the recognizer confuses the wav file for the mic. Ducking helps a bit with this but its more useful as an audible confirmation as the recognizer still favors this over the mic input. The proper solution is to pause the audio output until the recognizer has consumed the utterance.

The root problem arises from the tts module which does not use the audio service but it wants to act like one. It has start and stop method and a queue but no pause or resume. This PR corrects that. It also contains the code which uses this new functionality to help with the barge-in process.

Replication of #2939 against the dev branch so it doesn't get lost.

How to test

Unit tests need to be added.

Enable your barge-in config flag and use a long playing dialog skill like the new duck duck go or wiki skill (these return an abstract rather than a single sentence) and then say 'stop'. Without this PR this will not work 90% of the time.

Contributor license agreement signed?

  • CLA

@devops-mycroft devops-mycroft added the CLA: Yes Contributor License Agreement exists (see https://github.com/MycroftAI/contributors) label Jul 2, 2021
@krisgesling krisgesling mentioned this pull request Jul 2, 2021
@devops-mycroft
Copy link

Voight Kampff Integration Test Succeeded (Results)

@codecov-commenter
Copy link

Codecov Report

Merging #2940 (a50ad82) into dev (80f9eb0) will decrease coverage by 0.02%.
The diff coverage is 50.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev    #2940      +/-   ##
==========================================
- Coverage   52.64%   52.62%   -0.03%     
==========================================
  Files         123      123              
  Lines       11021    11039      +18     
==========================================
+ Hits         5802     5809       +7     
- Misses       5219     5230      +11     
Impacted Files Coverage Δ
mycroft/client/speech/__main__.py 0.00% <0.00%> (ø)
mycroft/tts/tts.py 77.88% <46.15%> (-1.59%) ⬇️
mycroft/audio/speech.py 90.65% <71.42%> (-1.43%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80f9eb0...a50ad82. Read the comment docs.

Copy link
Collaborator

@forslund forslund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there is a possible lock scenario if the thread is stopped while in a pause waiting state (and also the speech client is shut down), unlikely but should is simple to put in safeguards against. See forslund@33a430e for a simple fix.

The other comments are mainly suggestions on improvements and can be ignored. If they are deemed useful you can cherry-pick them from forslund/mycroft-core/tree/feature/tts-pause-resume

@@ -124,7 +124,7 @@ def mute_and_speak(utterance, ident, listen=False):
tts.init(bus)
tts_hash = hash(str(config.get('tts', '')))

LOG.info("Speak: " + utterance)
LOG.debug("Listen=%s, Speak:%s" % (listen, utterance))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's valid to keep this as an info level LOG, it's something that is generally good to be able to track even for a user. Also generally we use some sort of format string instead of the old % system. If % is used in logging it should use the logging lazy string evaluation i.e.

LOG.debug("Listen=%s, Speak:%s", listen, utterance)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this should be kept as an info level log. Knowing what the device thought it heard and what it said in response is quite useful.

@@ -174,6 +174,14 @@ def handle_stop(event):
bus.emit(Message("mycroft.stop.handled", {"by": "TTS"}))


def handle_pause(event):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing docstrings for this and the handle_resume

@@ -93,6 +94,8 @@ def run(self):
listening.
"""
while not self._terminated:
while self._paused:
sleep(0.2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This while-sleep pattern can usually be replaced with an Event(). See forslund@8dc0fd6 for a suggestion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would go one step further. Why use threading.Event() when we have an event-driven architecture with a message bus? What is a good reason to use threading.Event() instead?

Copy link
Collaborator

@forslund forslund Jul 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean terminate the playback thread and restart it on the message?

The threading.Event() is generally a good way for cyclic code to sync with an external event like the case is here. Basically block until event.

if self._paused:
self.p.terminate()
break
sleep(0.1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of busy waiting should be avoided if possible IMO. See forslund@2c050fb for a suggestion

@krisgesling krisgesling added this to Sprint 24: Music and Common Play in Upcoming Sprints Aug 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA: Yes Contributor License Agreement exists (see https://github.com/MycroftAI/contributors)
Projects
No open projects
Status: High priority
Upcoming Sprints
Sprint 24: Music and Common Play
Development

Successfully merging this pull request may close these issues.

None yet

6 participants