Support for external speaker or Bluetooth? #164

DanPatten · 2023-06-10T18:17:28Z

Does the ESP32-S3-BOX support either Bluetooth or the ability to plug in an aux cable, so that eventually TTS or Spotify output can go to an external speaker?

Thanks

kristiankielhofner · 2023-06-13T14:43:52Z

Not only is it planned, it's included in the README.

tensiondriven · 2023-07-01T01:45:42Z

I'd love to see some sort of software-only "willow" that would run as a daemon under macOS/Ubuntu/Windows and act as a bridge to willow-inference-server, so one could use an onboard microphone, an external microphone over aux, or potentially a Bluetooth mic, without the need for an ESP.

kristiankielhofner · 2023-07-03T13:28:16Z

We've actually been working on this!

At the moment it's a desktop app that uses the WIS WebRTC interface. It's currently hotkey activated (we bind to a user-defined global OS hotkey combination) and doesn't support VAD, but that's something we're definitely interested in.

A6blpka · 2023-08-10T11:56:03Z

I have some bad news. The ESP32-S3 does not support Bluetooth Classic, which means it does not support A2DP. Bluetooth Classic is only supported on the original ESP32.
So we won't be able to use external speakers via Bluetooth Classic :(

ashishpandey · 2023-10-02T17:33:06Z

Is there any possibility to wire up a speaker via the PCIe expansion port, GPIO, or the USB host port on the dock for the BOX-3?

While the mics are good at picking up voice, the built-in speaker is too faint for most practical feedback.

kristiankielhofner · 2023-10-03T13:09:48Z

The "PCIe" expansion port is not actually electrically PCIe. It was an easy existing connector for Espressif to use for the dock modules. Their materials are pretty confusing in this regard; I initially thought the same thing. Ideally they wouldn't mention "PCIe" at all and would simply call it an expansion port or similar.

We are in talks with Espressif to make an "audio dock" or similar. It would likely have an electrical pin map to either the DAC built into the device or a defined external DAC, and would expose a 3.5mm connector or similar on the dock. Otherwise we'd be looking at supporting a specific external DAC that users would need to purchase and wire up manually to SPI, I2C, etc. via the GPIO ports, or at supporting the USB-A host port, which would introduce similar variability given the range of USB audio devices on the market.

There is also some discussion of this audio dock (or another variant) including an additional Bluetooth chip that can support "full Bluetooth" for A2DP, etc., which you could use with an external Bluetooth speaker.

We completely understand that the built-in speaker is lacking for some users, but it is one of the cost-saving measures that makes it possible to offer otherwise very capable hardware at a ~$50 price point. That said, as the (lack of) activity on this issue demonstrates, many users are fine with the built-in speaker. That is why the priority for us and Espressif has been an external dock that connects to a user-provided speaker: users who need it can spend the extra money on the audio dock, as opposed to increasing the price, size, etc. for everyone. Even if the included speaker were more capable, there would always be users who find it inadequate compared to what they prefer for their use case. You can see the same thing in the Echo ecosystem.

Also, generally speaking, this opens up a bit of a can of worms: using random speakers in full-duplex use cases (playing music while attempting to issue commands, for example) involves acoustic issues that we will have to account for to provide a similarly high level of speech recognition quality.

I should also note that we have a to-do item to normalize amplitude across the response types. WIS TTS output, for example, is very faint, and its audio levels can be boosted to provide higher volume. We just need to make sure we don't overdrive the speaker and introduce clipping, distortion, etc.
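To illustrate the trade-off between boosting and clipping, here is a minimal sketch of level normalization with a peak guard. It is not Willow or WIS code: the function names, the -20 dBFS target, the -1 dBFS peak ceiling, and the assumption of 16-bit PCM samples are all hypothetical choices made for the example.

```python
import math
from typing import List, Sequence

INT16_MAX = 32767
INT16_MIN = -32768


def rms_dbfs(samples: Sequence[int]) -> float:
    """RMS level of 16-bit PCM samples in dBFS (-inf for silence)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 20.0 * math.log10(math.sqrt(mean_square) / INT16_MAX)


def normalize(
    samples: Sequence[int],
    target_dbfs: float = -20.0,       # hypothetical loudness target
    peak_ceiling_dbfs: float = -1.0,  # hypothetical headroom below full scale
) -> List[int]:
    """Scale samples toward target_dbfs without pushing peaks past the ceiling."""
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return list(samples)  # nothing to boost

    # Gain needed to move the RMS level to the target.
    gain = 10 ** ((target_dbfs - level) / 20.0)

    # Largest gain that keeps the loudest sample under the ceiling.
    peak = max(abs(s) for s in samples)
    max_gain = (INT16_MAX * 10 ** (peak_ceiling_dbfs / 20.0)) / peak
    gain = min(gain, max_gain)

    # Final clamp guards against rounding at the int16 boundaries.
    return [max(INT16_MIN, min(INT16_MAX, round(s * gain))) for s in samples]
```

Applying the same target to every response type (TTS, chimes, error tones) would give them a similar perceived level, while the peak guard limits how far a faint clip with sharp transients can be boosted.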

hornej · 2023-12-15T22:01:18Z

Would it be terribly difficult to send the audio to something like an AirPlay speaker? I have some speakers running HiFiBerryOS, which supports AirPlay among other things like Snapcast, DLNA, etc.

This is probably out of scope for Willow, but what I would really love is to be able to manage all my audio inputs (mics) and outputs (speakers) so that when the wake word is detected, the volume of the associated speaker(s) goes down, and the TTS or confirmation can be played through the relevant speakers as well. This could also enable functionality like an intercom. It sounds like a lot of work given the challenges around IP audio, but it would be pretty awesome. (I believe Apple HomePods have these capabilities.)
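As a rough sketch of the ducking idea, assuming each mic is mapped to a set of speakers whose volume can be set programmatically: the `Speaker` and `DuckingController` classes below are hypothetical and only model the bookkeeping. Real control of AirPlay, Snapcast, or DLNA devices is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Speaker:
    """Hypothetical stand-in for a network speaker with a settable volume."""
    name: str
    volume: float = 1.0  # 0.0 (mute) to 1.0 (full)


@dataclass
class DuckingController:
    """Lowers the speakers associated with a mic while a voice session is active."""
    routes: Dict[str, List[Speaker]]  # mic name -> associated speakers
    duck_factor: float = 0.2          # hypothetical: duck to 20% of current volume
    _saved: Dict[str, float] = field(default_factory=dict)

    def on_wake(self, mic: str) -> None:
        for spk in self.routes.get(mic, []):
            # Only duck once, so a repeated wake event does not compound.
            if spk.name not in self._saved:
                self._saved[spk.name] = spk.volume
                spk.volume = spk.volume * self.duck_factor

    def on_session_end(self, mic: str) -> None:
        for spk in self.routes.get(mic, []):
            # Restore the volume that was in effect before the wake word.
            if spk.name in self._saved:
                spk.volume = self._saved.pop(spk.name)


kitchen = Speaker("kitchen", volume=0.8)
office = Speaker("office", volume=0.5)
controller = DuckingController(routes={"kitchen-mic": [kitchen]})

controller.on_wake("kitchen-mic")         # kitchen is ducked, office untouched
controller.on_session_end("kitchen-mic")  # kitchen returns to 0.8
```

Keeping the mic-to-speaker map explicit is what would let the same structure route TTS or intercom audio to the relevant speakers.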

DanPatten closed this as not planned · Jun 10, 2023

kristiankielhofner reopened this Jun 13, 2023
