Bounty: Implement noise cancellation on RPi-3 based hardware devices (Mark 1 and Picroft) #1478

Open
KathyReid opened this issue Mar 14, 2018 · 20 comments
Labels
Difficulty: medium · help wanted · Type: Enhancement - proposed (new proposal for a feature that is not currently a priority on the roadmap)

Comments

@KathyReid
Contributor

KathyReid commented Mar 14, 2018

NOTE: This issue supersedes Issue #57

Problem statement

The current audio bus on the Mark 1 and Picroft images does not eliminate the speaker audio from the microphone. This leads to undesirable device behavior, most noticeably when an audio stream is playing and the user is unable to “barge in” easily with a Hey Mycroft.

The device knows what audio it is sending to the speaker. The core idea is to subtract that speaker output from the microphone input using an appropriate approach, such as time-shifting the outbound audio and matching it against the audio coming in from the microphone.
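The time-shift-and-subtract idea described above can be sketched roughly as follows. This is a deliberately simplified illustration, not a proposed solution: it assumes the echo is a single delayed, scaled copy of the speaker signal, whereas a real enclosure adds reverberation and distortion that call for an adaptive filter. The function names (`estimate_delay`, `estimate_gain`, `subtract_echo`, `demo`) and the synthetic signals are made up for this example.

```python
import math
import random


def estimate_delay(reference, mic, max_delay):
    """Return the lag (in samples) at which `reference` best matches `mic`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        # Cross-correlation of reference[n] with mic[n + lag].
        score = sum(r * m for r, m in zip(reference, mic[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def estimate_gain(reference, mic, lag):
    """Least-squares gain g minimising sum((mic[n + lag] - g * reference[n])**2)."""
    pairs = list(zip(reference, mic[lag:]))
    energy = sum(r * r for r, _ in pairs)
    if energy == 0:
        return 0.0
    return sum(r * m for r, m in pairs) / energy


def subtract_echo(reference, mic, max_delay=64):
    """Align the speaker signal with the mic signal and subtract it."""
    lag = estimate_delay(reference, mic, max_delay)
    gain = estimate_gain(reference, mic, lag)
    cleaned = list(mic)
    for n, r in enumerate(reference):
        if n + lag < len(cleaned):
            cleaned[n + lag] -= gain * r
    return lag, gain, cleaned


def demo():
    """Synthetic check: noise-like 'music' echoed into the mic plus a quiet 'voice'."""
    rng = random.Random(0)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(400)]  # speaker output
    true_lag, true_gain = 5, 0.5                               # assumed echo path
    echo = [0.0] * true_lag + [true_gain * r for r in reference]
    voice = [0.1 * math.sin(0.05 * n) for n in range(len(echo))]
    mic = [e + v for e, v in zip(echo, voice)]
    return subtract_echo(reference, mic, max_delay=64)
```

Calling `demo()` returns the estimated lag, the estimated gain, and the residual signal, which should be close to the synthetic voice. On real hardware the delay and gain change over time, so the estimate would have to be refreshed continuously.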

Acceptance criteria

  • The solution must work on a Mark 1 reference hardware device. Picroft is OK for testing or proof of concept, but the solution must work in a Mark 1 enclosure acoustic environment
  • The solution must work with an audio stream that is being played at 3/4 volume, such as Pandora, Spotify, Mopidy or other streaming audio
  • The solution must work with the default Precise Wake Word detection software.
  • A user must be able to interrupt the audio input/output stream by speaking the Wake Word (i.e. ‘Hey Mycroft’) at normal volume (i.e. not shouting).
  • The solution must work within the CPU limitations of RPi 3 hardware (the hardware used for both Mark 1 and Picroft). Namely, not exceeding a 3.0 load average when running the top command.
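The CPU criterion above could be checked from a script rather than by watching `top`. The sketch below is one possible way to do that; it assumes the criterion refers to the 1-minute load average (the issue does not say which of the three values is meant), and the names `LOAD_LIMIT` and `within_load_budget` are invented for this example.

```python
import os

LOAD_LIMIT = 3.0  # threshold taken from the acceptance criteria


def within_load_budget(load_averages=None, limit=LOAD_LIMIT):
    """Return True if the 1-minute load average does not exceed `limit`.

    `load_averages` is a (1 min, 5 min, 15 min) tuple; if omitted it is read
    from the operating system (os.getloadavg is available on Unix only).
    Using the 1-minute value is an assumption, not part of the issue text.
    """
    if load_averages is None:
        load_averages = os.getloadavg()
    return load_averages[0] <= limit
```

Running `within_load_budget()` on the device while music is playing and echo cancellation is active gives a quick pass/fail reading against the 3.0 limit.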

Useful information

Key technical contact - Steve Penrod (@penrods) (@steve-mycroft at https://chat.mycroft.ai)

Bounty

The bounty for this feature request is USD $1,000, as well as a free Mark 1 and a Gold Mycroft Challenge Coin.

@stephanelpaul

I'm going to take a look at this shortly

@pcwii

pcwii commented Mar 19, 2018

More helpful information:
PulseAudio provides an echo cancellation module (module-echo-cancel).
More information here: https://arunraghavan.net/2016/05/improvements-to-pulseaudios-echo-cancellation/

@el-tocino
Contributor

Some hopefully useful links about the pulse module:
https://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/User/Modules/#index45h3
https://wiki.archlinux.org/index.php/PulseAudio/Troubleshooting#Enable_Echo.2FNoise-Cancelation
The echo cancellation module can also do beamforming...

@pcwii

pcwii commented Mar 19, 2018

@KathyReid @penrods
Has anyone explored this option (PulseAudio echo cancellation) previously? I am willing to give it a go, although I only have a Picroft to work with.

@forslund
Collaborator

I believe it was tried a couple of years ago, but the CPU strain was quite high. (This is what I've heard; I have no personal experience with it on the Pi.) PulseAudio echo cancellation works great on my workstation, so it'd be cool if it could work on the Pi as well. If it's too intensive for the hardware, maybe there are tweaks that can be made.

Give it a try, and see what the result is!

@roadriverrail

I've worked on projects using a Broadcom chipset not unlike that of the BCM2837 (which is used in RPi3) and we'd seen good success using the Opus echo canceler. It does take CPU to do, but it wasn't particularly bad. Unfortunately, I don't have the necessary free time to contribute to the bounty hunt, but I thought perhaps suggesting this would help someone else.

@KathyReid
Contributor Author

Thanks for your feedback, @roadriverrail - great suggestion!

@el-tocino
Contributor

el-tocino commented Apr 21, 2018

Potentially interesting:
https://github.com/xiph/rnnoise
and based on that:
https://github.com/werman/noise-suppression-for-voice
(the above are significantly slower than viable, alas: ~8:1 increase in processing)

@tlc

tlc commented Apr 24, 2018

@forslund, When working on a workstation with the mycroft source, does pulse echo cancellation get loaded automatically or do we have to do that ourselves?

Do USB speakerphone devices such as the Jabra 410 (popular in the forums) do echo cancellation? I'm using one with a RPi 3B+ and "Hey Mycroft, stop" seems to work. Although, I'm not sure if it works "well" at "normal volume".

@el-tocino
Contributor

Currently, no distros load the PulseAudio echo cancellation module by default (that I know of).
Per https://www.jabra.com/business/speakerphones/jabra-speak-series/jabra-speak-410: "Digital Signal Processing (DSP) technology: Crystal clear sound without echoes or distorted sounds even at max volume level", which sounds a lot like it has some sort of echo cancelling.

@forslund
Collaborator

@tlc, as @el-tocino states, the echo cancellation isn't loaded by default. Loading it creates a virtual microphone that you need to set as the default source to use it with Mycroft (basically by selecting it in the PulseAudio volume control).
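For anyone who prefers the command line to the volume control GUI, the same steps can be scripted. The sketch below only builds the `pactl` command lines; it assumes `pactl` is installed and a PulseAudio server is running. The source and sink names are arbitrary placeholders chosen for this example, and making the virtual sink the default as well is an assumption about a typical setup rather than something stated in this thread.

```python
def echo_cancel_commands(aec_method="webrtc",
                         source_name="echo_cancelled_mic",
                         sink_name="echo_cancelled_speaker"):
    """Build pactl commands that load module-echo-cancel and select its devices.

    The device names are placeholders; any unused names should work.
    """
    load_module = [
        "pactl", "load-module", "module-echo-cancel",
        f"aec_method={aec_method}",    # e.g. "webrtc" or "speex"
        f"source_name={source_name}",  # name of the virtual microphone
        f"sink_name={sink_name}",      # name of the virtual speaker
    ]
    # Make the virtual microphone the default capture device.
    set_source = ["pactl", "set-default-source", source_name]
    # Assumption: playback is also routed through the virtual sink.
    set_sink = ["pactl", "set-default-sink", sink_name]
    return [load_module, set_source, set_sink]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine with PulseAudio running. The module is not loaded persistently this way, so the commands would need to be repeated after PulseAudio restarts.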

@KathyReid
Contributor Author

How are we all going with this one - any questions? Any information we could provide to help?

@j1nx

j1nx commented Aug 23, 2018

Not my work, but I just ran into it:

https://github.com/voice-engine/ec

It looks interesting and seems to tick the boxes.

@domcross
Contributor

I have experimented with voice-engine/ec (which is basically a wrapper for speex) and PulseAudio's echo-cancel module (you have to install PA 7.1 from the Debian-Jessie-Backports for that) using algorithms "webrtc" and "speex" (adrian is not usable at all) but had no luck so far. I see mainly two reasons:

  1. When music is played over the Mark-I speaker, the Mark-I mic picks up almost only the music (this is caused by the physical construction). In addition, the mic/preamp picks up a lot of electrical/radio noise. This makes it really tough for any noise/echo-cancel algorithm.
  2. The timing of the RPi 3's internal clock is not stable enough for this kind of realtime processing; the permanent time drift confuses the echo-cancel algorithms as well.

I will give "rnnoise" a try shortly (I have already compiled it for the RPi but have some problems configuring it for PA), but I don't have too high expectations, for the above reasons.

@penrods
Contributor

penrods commented Aug 26, 2018

I'd be willing to consider a solution that requires a minor and cheap add-on or modification to the Mark 1, e.g. acoustic foam separating the mic and speaker or wire rerouting. But not board level changes.

@el-tocino
Contributor

Beamforming based on the mic position plus a cheap USB mic might be an option. One or two of these mini mics (search "overfly portable usb 2.0 mic") set in the ports, combined with the audio from the existing mic and run through a beamformer, should be able to do AEC and improve listening. I haven't tried it myself yet, alas.

@domcross
Contributor

After some more experimenting I have a configuration with the PulseAudio echo-cancel module that works reasonably* with volume levels up to 5 (Mark-1's maximum is 11) within a distance of approx. 4 feet. There is some more room for tweaking parameters that might increase reliability.
I didn't try the hardware tweaking (acoustic foam) yet. In addition I am considering changes in Mycroft Audioservices, e.g. duck/mute music as soon as wake-word is detected in order to get a clean utterance...

*depends on the music material, the more compressed (see "loudness war") the less reliable it works.
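The ducking idea mentioned above (lower or mute the music once the wake word is detected, then restore it after the utterance) could look roughly like the sketch below. Everything here is hypothetical: `MusicDucker`, its method names, and the `set_volume` callback are invented for illustration and are not part of the Mycroft audio service API.

```python
class MusicDucker:
    """Lower playback volume while the assistant listens, then restore it."""

    def __init__(self, set_volume, normal=1.0, ducked=0.2):
        # `set_volume` is a hypothetical callback taking a float in [0, 1].
        self._set_volume = set_volume
        self._normal = normal
        self._ducked = ducked
        self.is_ducked = False

    def on_wake_word(self):
        """Call when the wake word has been detected."""
        if not self.is_ducked:
            self._set_volume(self._ducked)
            self.is_ducked = True

    def on_utterance_end(self):
        """Call when recording of the user's utterance has finished."""
        if self.is_ducked:
            self._set_volume(self._normal)
            self.is_ducked = False
```

Ducking only helps after the wake word has been heard, so it complements echo cancellation rather than replacing it: the wake word itself still has to be detected over the music.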

@j1nx

j1nx commented Aug 31, 2018

I believe @forslund already did some work on the ducking part. I think it is already in the PR / Issue section somewhere.

I agree with you that AEC has to be combined with audio ducking.

@el-tocino
Contributor

I used some door/window insulating foam (similar: https://www.homedepot.com/p/Frost-King-3-4-in-x-5-16-in-x-10-ft-Black-Rubber-Foam-Weatherseal-Tape-R534H/202262324) to make a barrier around the front of the mic, between the face circuit board and the faceplate. In addition, I covered the back of the speaker with foam as well.

@krisgesling added the "Type: Enhancement - proposed" label Sep 24, 2020