Support for external speaker or Bluetooth? #164

Open
DanPatten opened this issue Jun 10, 2023 · 7 comments

Comments

@DanPatten

DanPatten commented Jun 10, 2023

Does the ESP32-S3-BOX support either Bluetooth or the ability to plug in an aux cable, so that TTS or Spotify output can eventually go to an external speaker?

Thanks

DanPatten closed this as not planned Jun 10, 2023
@kristiankielhofner
Contributor

Not only is it planned, it's included in the README.

@tensiondriven

I'd love to see some sort of software-only "willow" that would run as a daemon under macOS/Ubuntu/Windows and act as a bridge to willow-inference-server, so one could use an onboard microphone, an external microphone over aux, or potentially a Bluetooth mic, without the need for an ESP.

@kristiankielhofner
Contributor

We've actually been working on this!

At the moment it's a desktop app that uses the WIS WebRTC interface. It's currently hotkey activated (we bind to a user-defined global OS hotkey combination) and doesn't support VAD (voice activity detection), but that's something we're definitely interested in.

@A6blpka

A6blpka commented Aug 10, 2023

I have some bad news. The ESP32-S3 does not support classic Bluetooth, which means it does not support A2DP. Classic Bluetooth is only supported on the original ESP32.
So we won't be able to use external speakers via classic Bluetooth :(

@ashishpandey

Is there any possibility of wiring up a speaker via the PCIe expansion port, GPIO, or the USB host port on the dock for the BOX-3?

While the mics are good at picking up voice, the built-in speaker is too faint for most practical feedback.

@kristiankielhofner
Contributor

kristiankielhofner commented Oct 3, 2023

The "PCIe" expansion port is not actually electrically PCIe. It was an easy existing connector for Espressif to use for the dock modules. Their materials are pretty confusing in this regard; I initially thought the same thing. Ideally they wouldn't mention "PCIe" at all and would simply call it an expansion port or similar.

We are in talks with Espressif to make an "audio dock" or similar. It would likely have an electrical pin map to either the DAC built in to the device or a defined external DAC, and would expose a 3.5mm connector or similar on the dock. Otherwise we'd be looking at supporting a specific external DAC that users would need to purchase and wire up manually to SPI, I2C, etc. via the GPIO ports, or at supporting the USB-A host port, which would introduce similar variability given the range of USB audio devices on the market.

There is also some discussion of this audio dock (or another variant) including an additional Bluetooth chip that can support "full Bluetooth" for A2DP, etc., which you could use with an external Bluetooth speaker.

We completely understand that the built-in speaker is lacking for some users, but it is one of the cost-saving measures that makes it possible to provide otherwise very capable hardware at a ~$50 price point. That said, as the (lack of) activity on this issue demonstrates, many users are fine with the built-in speaker. That is why the priority for us and Espressif has been an external dock that connects to a user-provided speaker: users who need it can spend the extra money on the audio dock, as opposed to increasing the price, size, etc. for everyone. Even if the included speaker were more capable, there would always be users who find it inadequate compared to the superior audio output they prefer for their use cases. You can see the same thing in the Echo ecosystem.

Also, generally speaking, this opens up a bit of a can of worms: using arbitrary speakers in full-duplex use cases (playing music while attempting to issue commands, for example) involves acoustic issues that we will have to account for to provide a similar level of high-quality speech recognition.

I should also note that we have a to-do item to normalize amplitude across the response types. WIS TTS, for example, is very faint, and its audio levels can be boosted to provide higher-volume output. We just need to make sure we don't overdrive the speaker and introduce clipping, distortion, etc.
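
For illustration only, here is a minimal sketch of the kind of amplitude normalization described above: scaling 16-bit PCM samples so the peak sits at a fixed level below full scale, which boosts faint audio while leaving headroom against clipping. The function names and the -3 dBFS target are assumptions made for this example, not Willow or WIS code.

```python
import math

INT16_MAX = 32767  # largest positive value of a signed 16-bit sample


def peak_dbfs(samples):
    """Return the peak level of 16-bit PCM samples in dBFS (-inf for silence)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")
    return 20.0 * math.log10(peak / INT16_MAX)


def normalize_peak(samples, target_dbfs=-3.0):
    """Scale samples so the loudest one lands at target_dbfs.

    target_dbfs is a hypothetical headroom setting; a value below 0 keeps
    the output under full scale so the boost itself cannot clip.
    """
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    target_peak = INT16_MAX * 10 ** (target_dbfs / 20.0)
    gain = target_peak / peak
    # Clamp as a safety net in case a caller passes a target above 0 dBFS.
    return [max(-INT16_MAX - 1, min(INT16_MAX, round(s * gain))) for s in samples]


if __name__ == "__main__":
    faint_tts = [0, 1000, -2000, 1500]
    boosted = normalize_peak(faint_tts)
    print(f"before: {peak_dbfs(faint_tts):.1f} dBFS, after: {peak_dbfs(boosted):.1f} dBFS")
```

Peak normalization is the simplest option; matching perceived loudness across response types would need an RMS or loudness-based measure instead.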

@hornej

hornej commented Dec 15, 2023

Would it be terribly difficult to send the audio to something like an AirPlay speaker? I have some speakers running HiFiBerryOS, which supports AirPlay among other things like Snapcast, DLNA, etc.

This is probably out of scope for Willow's vision, but what I would really love is to be able to manage all my audio inputs (mics) and outputs (speakers) so that when the wake word is detected, the volume of the associated speaker(s) goes down, and the TTS or confirmation is played through the relevant speakers as well. This could also enable functionality like an intercom. It sounds like a lot of work given the challenges around IP audio, but it would be pretty awesome. (I believe Apple HomePods have these capabilities.)
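
As a rough sketch of the ducking behaviour described above (lower the associated speakers when the wake word fires, restore them afterwards), assuming each speaker exposes a simple 0-100 volume. The `Speaker` and `Ducker` classes are hypothetical and stand in for whatever AirPlay, Snapcast, or DLNA control layer would actually be used.

```python
from dataclasses import dataclass, field


@dataclass
class Speaker:
    """Hypothetical stand-in for a network speaker with a 0-100 volume."""
    name: str
    volume: int


@dataclass
class Ducker:
    """Lowers speaker volume on wake and restores it when the command ends."""
    duck_to: int = 20  # assumed ducked volume level
    _saved: dict = field(default_factory=dict)

    def on_wake(self, speakers):
        for sp in speakers:
            # Save the original volume only once, so a repeated wake event
            # does not overwrite it with the already-ducked value.
            if sp.name not in self._saved:
                self._saved[sp.name] = sp.volume
                sp.volume = min(sp.volume, self.duck_to)

    def on_done(self, speakers):
        for sp in speakers:
            if sp.name in self._saved:
                sp.volume = self._saved.pop(sp.name)


if __name__ == "__main__":
    kitchen = Speaker("kitchen", 60)
    ducker = Ducker()
    ducker.on_wake([kitchen])
    print("during command:", kitchen.volume)
    ducker.on_done([kitchen])
    print("after command:", kitchen.volume)
```

Mapping each mic to its "associated" speakers, and handling the latency of IP audio protocols, are the hard parts this sketch leaves out.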
