[Proposal] Built-in interruption support through Wake Word #2638

Open
onpon4 opened this issue Jul 19, 2020 · 3 comments

@onpon4

onpon4 commented Jul 19, 2020

So I tried MyCroft recently and was relatively impressed with what it's able to do! It's not as powerful as Alexa et al, but it's quite capable and I love that it's written in Python and seems easy to extend.

That said, one major problem, which was already raised a couple of years ago, is that MyCroft can't always be interrupted while it's in the middle of saying something. This can be a real problem if a skill ends up saying something really long without having a "stop" command implemented.

Of course, #1478 is a major part of that. MyCroft can't really do anything with the wake word if it can't hear it to begin with.

But in addition to that, I'd like to propose one other thing: a built-in feature in the MyCroft core that always listens for the Wake Word, no matter what it's doing, and, if it hears the Wake Word, interrupts what it's doing and handles the new command instead. This would mean you can always rely on being able to say things like "Hey MyCroft, stop" or "Hey MyCroft, nevermind" when you don't want to hear what it's saying anymore, and it would prevent the possibility of accidentally getting stuck with it doing something you don't want it to do. Commands like "continue" and "go on" could exist in those situations to resume what it was doing before, in case of false positives.

The main reasons for this proposal are as follows:

  • Consistency. A MyCroft user would always know that they can interrupt MyCroft, and that they can resume after interrupting if they do so by mistake.
  • Simplification. Skills would not have to implement their own "stop" commands; it would just be built in.
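
To make the proposed behaviour concrete, here is a minimal sketch of the interrupt-and-resume idea as a small state machine. Everything in it (the `InterruptibleSpeech` class, its method names, and the exact command strings it matches) is hypothetical and made up for illustration; it is not taken from mycroft-core and says nothing about how the real audio pipeline works.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InterruptibleSpeech:
    """Toy model of speech output that can be interrupted and resumed.

    Hypothetical class for illustration only; not part of mycroft-core.
    """
    sentences: List[str]
    position: int = 0            # index of the next sentence to speak
    interrupted: bool = False    # True while waiting for a follow-up command
    spoken: List[str] = field(default_factory=list)

    def speak_next(self) -> Optional[str]:
        """Speak one sentence, or return None if interrupted or finished."""
        if self.interrupted or self.position >= len(self.sentences):
            return None
        sentence = self.sentences[self.position]
        self.position += 1
        self.spoken.append(sentence)
        return sentence

    def on_wake_word(self) -> None:
        """Wake word heard: go quiet, but remember where we were."""
        self.interrupted = True

    def on_command(self, utterance: str) -> None:
        """Handle the command that follows the wake word."""
        if utterance in ("continue", "go on"):
            # False positive: pick up at the remembered position.
            self.interrupted = False
        elif utterance in ("stop", "nevermind"):
            # Discard whatever was left to say.
            self.position = len(self.sentences)
            self.interrupted = False
```

In this sketch the wake word only pauses output; the follow-up command decides whether the remaining sentences are resumed or thrown away, which is what makes recovering from a false positive possible.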

Of course I have little understanding of the internal workings of MyCroft, so do let me know if this proposal is way off-base. 😅

@krisgesling
Contributor

Hi Layla, welcome to Mycroft, glad you are liking it :)

This is often referred to here as "barge-in". The biggest impediment to this, as you mentioned, is Mycroft being able to hear the wake word. Doing this solely in software is hard. We put the bounty out for quite a while and a number of people had a go, but entire companies exist to solve just this problem, so it was always a long shot.

The best solution to this is to use better hardware. Microphone arrays like we use in the Mark II provide us with some really useful new tools like Acoustic Echo Cancellation (AEC). Check out this video from Seeed to see just how much of a difference it makes.

So barge-in support exists in Mycroft already; the challenge is the limitation of being able to hear the user, particularly on Picroft and other devices where users might be using a wide range of input devices.

We have been discussing some other methods to improve wake word detection in a range of environments and with other background noise, but our first focus there is on improving detection of more diverse voices. Currently the experience is quite bad for most women and children.

In terms of the "stop" command, there is a default stop function that will do things like end media playback, and then Skills can add their own custom stop method if they are doing something else that might require "stopping".

In terms of speech dialog, I believe that only the Skill that initiated some dialog can stop the output of that speech. So this is currently up to Skill developers on whether that is appropriate for their Skill. However I can certainly see the argument for having this as universal behaviour.
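
As a rough illustration of the arrangement described above (a default stop action plus optional per-Skill stop methods), here is a small self-contained sketch. The `Skill` base class, `CountSkill`, and `handle_stop` are hypothetical stand-ins written for this example; they are not the actual Mycroft classes or dispatch code.

```python
class Skill:
    """Hypothetical stand-in for a skill base class (not the Mycroft API)."""

    def __init__(self, name):
        self.name = name

    def stop(self):
        """Return True if this skill had something to stop."""
        return False


class CountSkill(Skill):
    """A skill with its own custom stop method."""

    def __init__(self):
        super().__init__("count")
        self.counting = False

    def stop(self):
        if self.counting:
            self.counting = False
            return True
        return False


def handle_stop(skills, default_stop):
    """Run the default stop action, then give every skill a chance to stop.

    Returns the names of the skills that reported stopping something.
    """
    default_stop()  # e.g. end media playback
    return [skill.name for skill in skills if skill.stop()]
```

A Skill that does not override `stop` simply reports that it had nothing to stop, so whether long-running speech gets cut off depends on each Skill author.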

@onpon4
Author

onpon4 commented Jul 20, 2020

One idea that comes to mind is that you could have an entirely separate API for just stopping whatever MyCroft is doing, probably implemented as a button or keyboard combination (though it's not clear how one would do this on a PC unless you've got a nice GUI interface going). That would be a lot easier to implement.

It's the dialog that's the particular problem really. You can see it real easily with the counting skill (tell MyCroft to count to 500 or something). Currently the only way we're able to get it to stop after doing that is by killing MyCroft, which isn't ideal.
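
For what it's worth, here is a minimal sketch of how a long count could be made stoppable from outside, for example from a button or key handler. It assumes a shared `threading.Event` as the stop signal; `count_aloud` and its `speak` callback are hypothetical names for this example and are not how the actual counting skill is written.

```python
import threading
from typing import Callable


def count_aloud(limit: int, stop_event: threading.Event,
                speak: Callable[[int], None], delay: float = 0.0) -> int:
    """Count from 1 to limit, checking stop_event before each number.

    Returns the last number spoken (0 if nothing was spoken).
    """
    last = 0
    for number in range(1, limit + 1):
        if stop_event.is_set():
            break
        speak(number)
        last = number
        if delay > 0:
            # wait() returns early once the event is set, so a stop
            # request does not have to sit out the full pause.
            stop_event.wait(delay)
    return last
```

Anything that holds a reference to the event (a GUI button, a hotkey listener, a wake word handler) can call `stop_event.set()` from another thread, and the loop ends before the next number instead of requiring the whole process to be killed.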

@krisgesling
Contributor

Just created a PR for the Count Skill specifically, but I do see your point more broadly.

Not something we can prioritise right now so I'll leave this open unless others have ideas or want to propose a solution.
