Sometimes it's necessary to repeat the wake-up word before Willow wakes up #199

Open
mhilbush opened this issue Jun 23, 2023 · 31 comments

@mhilbush

I have two devices running Willow built from a repo I cloned on June 11 (is there a better way to specify what version I'm running?). Each device is in a completely separate part of the house (1st floor kitchen and lower level family/rec room).

I only ever use the wake word "Alexa".

After not using the device for a while (sometimes as little as a couple hours), I find that I need to say "Alexa" two or more times before I see the screen turn on.

There is very little ambient noise in these rooms. In the kitchen you can barely hear the refrigerator running. In the family room, there's basically nothing generating any ambient noise.

I experience the problem even when I'm quite close (2-3 feet) to the devices. For example, in the kitchen picture below, I can be standing at the counter top directly in front of the device.

Here are pictures so you can see the environment in which the devices are located.

Kitchen
[photo]

Family/Rec Room
[photo]

@nikito
Contributor

nikito commented Jun 28, 2023

Just chiming in here, I noticed today both my willows will not respond to wake at all after sitting 24 hours idle. I had to power cycle them in order for them to start listening again.

@kristiankielhofner
Contributor

When @mhilbush first reported this on the openHAB forums I related that I haven't seen this issue personally. I have an ESP BOX in my bedroom only used to turn off the lights when I go to sleep and there are roughly 24 hours in between voice commands.

That said, I have some devices with storage of logs that are in an acoustically isolated environment for testing. I've flashed them with full debug builds to try to reproduce this with a more scientific approach to "24 hours in between commands/activity". My first attempt will be three hours in between commands and I'll step it up from there.

Stay tuned!

@nikito
Contributor

nikito commented Jun 28, 2023

Just for some more info, the devices still replied to WAS commands; I basically rebooted them remotely by doing a config push, and that seemed to bring them back to life. This is also the first time I have noticed this; I am fairly certain I have left them for 24 hours or more without a command before and they worked fine when I said the wake word the next day, so not sure what specific scenario caused this.

@kristiankielhofner
Contributor

I assume @mhilbush is running main - @mhilbush can you confirm?

feature/was represents a TON of development with significantly less testing. I'm not necessarily surprised to see this issue reported there, but it occurring with main is very surprising.

@nikito
Contributor

nikito commented Jun 28, 2023

So crazy thing, it just happened to me again just now. Sitting idle for probably 3-ish hours, won't respond to wake word. Is there some way I can store logs or pull log data from it to see if anything is there?

@mhilbush
Author

Correct, I'm running off the main branch.

@mhilbush
Author

> Sitting idle for probably 3-ish hours

Yes, I've seen it happen after being idle for less than an hour. In fact, I've been trying to narrow down the amount of idle time that causes it to miss the first Alexa. And, interestingly, I have seen it happen when it's been idle even for a few minutes. And I'm also now trying to change some attributes of how I say Alexa (e.g. speed, tone).

I would love to know what's happening after I say Alexa the first time when it doesn't wake up.

@nikito
Contributor

nikito commented Jun 28, 2023

Just to be clear, in my case it becomes completely unresponsive, meaning I can repeat the wake phrase dozens of times and it will never respond. Also will not wake when touching the touchscreen.

@mhilbush
Author

> in my case it becomes completely unresponsive, meaning I can repeat the wake phrase dozens of times and it will never respond. Also will not wake when touching the touchscreen.

Yes, that's very different from what I'm seeing. I've never had the device become completely unresponsive like that. Usually after 2 attempts, and sometimes after 3 or 4, it wakes up.

@kristiankielhofner
Contributor

@mhilbush - In terms of varying the wake pronunciation, etc. to track this down further: my sealed testing environment uses recordings played out via a speaker to disambiguate between potential wake/audio/environment issues and other potential software issues. I'm going to give it a full three hours at idle and attempt to reproduce this in roughly one hour. This environment also runs debug builds with full log capture enabled.

@nikito - Between last night and this morning I observed what you are describing. Interestingly, it only happened on my Box test device and not the Lite so that's potentially a hint in the right direction. Then again could have just been a fluke. More testing should help clear this up.

@nikito
Contributor

nikito commented Jun 28, 2023

It actually happened to me again an hour or so later. Pushing config from WAS fixed it again. So it seems that whatever code is waiting for the wake word locks up, or something to that effect? I haven't tried to debug anything (not really sure how to live debug the Boxes 😆). I suppose I could plug one into my server and monitor the serial port to see if it logs anything weird before it happens?

@kristiankielhofner
Contributor

Strange but I'm actually relieved to hear it's repeatable after an hour as opposed to 24!

Yes, if you can do a debug build with connection to serial and record logs that is very helpful (I'm taking the same approach).
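
For anyone capturing logs this way, wall-clock timestamps make it easier to line up a missed wake word with what the device printed around that time. A minimal sketch, assuming the serial monitor output is available to the script as a stream of lines; the function name `stamp_lines` is hypothetical and not part of Willow:

```python
import sys
from datetime import datetime


def stamp_lines(lines, now=datetime.now):
    """Yield each input line prefixed with an ISO 8601 wall-clock timestamp."""
    for line in lines:
        yield f"{now().isoformat(timespec='seconds')} {line.rstrip()}"


# Demo on placeholder text. For live capture, pass sys.stdin instead of the
# list (assumption: a serial monitor is piping the device log to this script).
for stamped in stamp_lines(["placeholder log line 1\n", "placeholder log line 2\n"]):
    print(stamped, file=sys.stdout, flush=True)
```

Redirecting the output to a file gives a log that can be searched later by the time a wake attempt failed.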

@nikito
Contributor

nikito commented Jun 28, 2023

Something else I noticed, when it first boots it seems VAD doesn't work on the first command; I say something like "What time is it?" and it takes up to the 5 second timeout to then acknowledge the command and reply. If I repeat the command the VAD is near instant from then on for all other commands. Not a huge deal, but figured I'd mention it. 😄

@mhilbush
Author

mhilbush commented Jun 28, 2023

So now I have an audio recording of me saying Alexa. I'm doing some tests with it now. Leaving several minutes between each attempt. It's bizarre. I'm not changing the location of the device or my phone. There's little to no background noise.

  • attempt 1: woke up the 3rd time I played it
  • attempt 2: woke up the 1st time
  • attempt 3: woke up the 2nd time
  • attempt 4: woke up the 1st time
  • attempt 5: woke up the 1st time
  • attempt 6: woke up the 5th time
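
Tallying the six attempts above gives a rough baseline to compare later runs against. A minimal sketch; the list simply transcribes the number of playbacks needed per attempt, and the variable names are illustrative:

```python
from statistics import mean

# Playbacks needed before the device woke, one entry per attempt listed above.
plays_needed = [3, 1, 2, 1, 1, 5]

first_try = sum(1 for n in plays_needed if n == 1)
first_try_rate = first_try / len(plays_needed)
missed_plays = sum(n - 1 for n in plays_needed)

print(f"woke on first play: {first_try}/{len(plays_needed)} ({first_try_rate:.0%})")
print(f"mean plays needed:  {mean(plays_needed):.2f}")
print(f"missed plays total: {missed_plays} of {sum(plays_needed)}")
```

In other words, the device woke on the first play in 3 of 6 attempts and missed 7 of 13 plays overall.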

BTW, is there an Android app you would recommend for capturing audio?

Edit: Here's a pic of where I am in relation to the device. I'm about 5 to 6 feet from it. Basically, I hold the phone near my head and play the audio. If it doesn't wake up, I wait a few seconds, then play it again.
[photo]

@nikito
Contributor

nikito commented Jun 28, 2023

So quick update, since turning on Debug and deploying to both my ESP Boxes, neither has had the issue now. If I don't notice it happen again I may flip one of the boxes to a non-debug build for laughs, just to see if that somehow makes a difference. 😆

@hamishcunningham
Contributor

@nikito it is actually quite common on this type of device to see problems that don't manifest with debug on. It tends to point to something timing related?

@nikito
Contributor

nikito commented Jun 28, 2023

Seeing some interesting numbers in the debug:
[image]
Not sure if these are normal or if it is doing something odd 😄

@kristiankielhofner
Contributor

@nikito - VAD on first boot is a known issue. We think it has something to do with audio init and the audio front end framework calibrating/settling/something but we haven't explored it further as it's very low on our list of priorities.

You are seeing some FUNKY numbers from that debug output... I have no idea what's going on there.

I can also confirm from my testing today: in three separate attempts in my controlled environment, spaced three hours apart with debugging enabled, the issue does not occur. I'm going to do the latest regular builds and flash those to see if I can trigger it with the same time spacing (although I'm not sure how helpful the output will be). At least we're running WAS so OTA is painless :).

@kristiankielhofner
Contributor

@hamishcunningham - Yep, unfortunately. We will see when I run my non-debug builds but depending on debug options selected they run additional tasks that will keep the device warm(er). Although with wifi management, SNTP, etc it's not as though the device ever sleeps or similar but it will be a good datapoint regardless.

@kristiankielhofner
Contributor

@mhilbush That is bizarre and a terrible experience. What I can tell you from my experience recording and playing back prompts is that the audio needs to be as high fidelity as possible. In my case I record on my desktop with my Logitech C920 microphone as the source at 48kHz. I then play it back with no resampling, conversion, etc. We have HIGHLY accurate and repeatable results with this process - our routine "torture testing" is 1000 repetitions of a couple of different recordings. The last test run like this woke, captured speech, and successfully executed the HA command 996/1000 times on the ESP BOX.

I don't use Android so I can't recommend a good app but the other potential issue is that many devices, apps, etc do their own front end audio processing which could result in a lack of fidelity on playback from the perspective of the ESP BOX. Additionally, the playback hardware factors in as well (of course) and the frequency response of the speakers, etc from the device could be a factor as well. That said you're having terrible results with your own speech so it may not be a factor at all.
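
To put the 996/1000 figure in perspective, a confidence interval shows how tightly 1000 repetitions pin down the underlying success rate. A sketch using the Wilson score interval at 95% confidence; the choice of interval and the function name `wilson_interval` are assumptions of this example, not something the Willow test process is known to compute:

```python
from math import sqrt


def wilson_interval(successes, trials, z=1.96):
    """Return the (low, high) Wilson score interval for a binomial proportion."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half


low, high = wilson_interval(996, 1000)
print(f"observed: {996 / 1000:.1%}, 95% interval: {low:.2%} to {high:.2%}")
```

The same function can be applied to a much smaller sample, such as a handful of manual playback attempts, where the interval will be far wider and a single run says correspondingly less.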

@mhilbush
Author

> That said you're having terrible results with your own speech so it may not be a factor at all.

Correct. I only switched to a recording (which I discovered is 32kHz) to remove the variable that my own speech might be inconsistent from one try to the next. I see no difference between using my own voice and the recording of my voice.
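
A quick way to confirm what a recording actually contains before playing it at the device is to read its header. A minimal sketch using Python's standard `wave` module, assuming the recording is an uncompressed WAV file (FLAC or compressed phone formats would need a different reader); `describe_wav` is a hypothetical helper name:

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, bits_per_sample, seconds) for a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        seconds = wav.getnframes() / rate
        return rate, wav.getnchannels(), wav.getsampwidth() * 8, seconds
```

Calling `describe_wav` on a recording returns the sample rate first, which makes a 32kHz versus 48kHz difference easy to spot.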

At this point I'm really at a loss as to what to do next. If the wake word isn't reliable for me, everything else is pretty much pointless.

I'm starting to wonder if there's something wrong with the lot of devices from which I purchased my 2.

I'm not sure where you're located, but would you like me to ship you one of my devices?

@kristiankielhofner
Contributor

@mhilbush We've placed so much emphasis on wake word because of exactly your point - wake recognition is the equivalent of the power switch. If it doesn't activate, it is useless. So I completely understand your frustration there.

It's unlikely you have two bad units. From what I can remember I think we've only had one unit across all users which turned out to be defective. In terms of hardware the biggest issues seem to be power supplies and cables.

I should note that Willow from concept to now is just under three months old with the first developer "release" being roughly six weeks ago. It is extremely young and wake word, far field audio, speech recognition, etc for every speaker in every environment is very difficult. Alexa is eight years old and other open source projects in the space like Rhasspy are several years old. I am very confident that with time, testing, and additional development we will achieve our goals of being a great voice user interface for everyone. Time being weeks/months - not years. I understand that doesn't help you now but I wanted to add some context.

I think you can appreciate that if your experience was typical we wouldn't have gotten anywhere and our GH issues would be full of hundreds if not thousands of these kinds of reports by now.

I think it's still a bit too early in exploring this issue to take the drastic step of shipping a device but I appreciate the offer.

I just did another 2.5 hour idle test across box and box lite, woke right up immediately and successfully executed the command as expected. @nikito - this was with only debug logging turned on (no task/mem printing, etc).

@nikito
Contributor

nikito commented Jun 29, 2023

@kristiankielhofner yeah I noticed my devices with debug, even without monitoring, seemed fine all day. As a test I left one on debug build and set another to a non-debug build. I'll check both tomorrow and see how they fare. 😄

@nikito
Contributor

nikito commented Jun 29, 2023

Update this morning, left the devices idle all night, one on debug and one with no debug turned on, and they both responded and worked fine this morning. Really not sure what the issue would be, maybe reflashing somehow fixes it? 😆

@kristiankielhofner
Contributor

@nikito - same here. Like any transient issue good and bad - it "fixed" itself for unknown and speculative reasons. Of course depending on underlying cause it may very well come back on us... Regardless of the short-term outcome we'll definitely leave this issue open for a while in the event it comes back in the near future.

@mhilbush - Somewhat embarrassingly it took me way too long to come up with another debug step on your issue. Can you provide your existing recordings? If you're okay with that you should be able to upload them here or provide a link. Additionally, going the other direction, the recordings I use (of my own voice) for the testing I have described are in tree at misc/*.flac. You may not have the associated command endpoint to actually execute the command(s) but when you play them back it will likely provide additional insight on your issue of failure to wake by separating voice from environment and specific unit.

@mhilbush
Author

@kristiankielhofner No problem. I have one recording only - just the wake word Alexa. 😉 Since I couldn't wake it up reliably, I haven't created any other recordings. And, I wanted to find an app that would let me have more control over the audio quality.

Here's a link:
https://drive.google.com/file/d/1Xo1E20ONlCbyOPRQLiHK7vapyxIBZuUT/view?usp=sharing

> I should note that Willow from concept to now is just under three months old with the first developer "release" being roughly six weeks ago. It is extremely young and wake word, far field audio, speech recognition, etc for every speaker in every environment is very difficult.

I understand that Willow is very new. And please don't get me wrong, I'm super impressed with what you've been able to accomplish in such a short time. But perhaps there's something I don't quite understand. I thought the wake words "Alexa" and "Hi ESP" were part of the Espressif platform, not part of Willow. Possibly just my misunderstanding. And, if they are part of the platform, I certainly don't know when or how well that aspect of the platform was created.

@bert269

bert269 commented Jul 25, 2023

I came from the "main" branch, meaning flashing Willow from a connected server. No noticeable errors or blank screens.
Last week I installed and ran WAS on the same server where I have WIS running.
After I flashed three ESP-BOXes and one Lite, the units would all just freeze up with a black screen. There is no specific amount of time before it happens again - it seems to be random. A power cycle is the only thing that makes them respond again.

In the meantime, WAS does not report them as disconnected from the server... but the screens are blank and not responding.

@nikito
Contributor

nikito commented Jul 25, 2023

I believe this is a known issue, but isn't yet deployed to the builds WAS is using. You would need to follow the steps to pull willow from the feature/was branch and manually create a build for local OTA and push. It would look like this:
[image]

@bert269

bert269 commented Jul 25, 2023

Sorry for my stupidity - a total NOOB here with GitHub and "pulling a build".
This is what I see next to my only two connected ESP-BOXes:
[image]

When I click on the OTA button next to the ESP-BOX, the ESP-BOX gets flashed (or it power cycles) and does not appear under WAS anymore.
How do I "pull willow from a feature/was branch" and manually create a build?

As I said earlier, I flashed the "main branch" to one ESP-BOX and the Lite, and they have been up and running without "blanking out" for a few hours now. But they also do not show up under WAS, and the TTS responses are not working as they did in the WAS builds I had.

@nikito
Contributor

nikito commented Jul 25, 2023

This would involve building the willow code as outlined on the willow repo (would have to git pull feature/was branch) and then following the steps on the was repo to copy the build into the was container. This is all stuff that will go away for regular users with 1.0 release, most of these activities would only be done by devs 😄

@stintel
Collaborator

stintel commented Aug 17, 2023


6 participants