Echo cancellation doesn't work #1243

Open
3 tasks done
zli18 opened this issue Nov 25, 2019 · 37 comments

Comments

@zli18

zli18 commented Nov 25, 2019

Please read first!

Please use discuss-webrtc for general technical discussions and questions.

  • I have provided steps to reproduce
  • I have provided browser name and version
  • I have provided a link to the sample here or a modified version thereof

Note: If the checkboxes above are not checked (which you do after the issue is posted), the issue will be closed.

Browser affected

Browser name including version (e.g. Chrome 64.0.3282.119)
Chrome on MacOS: Version 78.0.3904.108 (Official Build) (64-bit)
Actually, I haven't gotten it to work in any browser, on either Mac or PC.

Description

I am trying to use this sample for echo cancellation, but it doesn't work.
https://webrtc.github.io/samples/src/content/getusermedia/record/

Steps to reproduce

  1. On any laptop, unplug any headphones so that audio plays from the speakers. (I am using an MBP.)
  2. Open https://webrtc.github.io/samples/src/content/getusermedia/record/ in Chrome, and check the "echo cancellation" box.
  3. Start playing any YouTube video in another Chrome tab. Make sure the audio is played from the speakers.
  4. Start recording video using this sample, and accept any permission prompts for the camera and microphone.

Expected results

I expect the recorded video should cancel the audio from YouTube significantly when the "echo cancellation" is checked.

Actual results

The audio from YouTube still got recorded and played back in original volume, regardless of whether "echo cancellation" is checked or not. I haven't made it work in any browser, either on Mac or PC.
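
For reference, the checkbox in the steps above maps to a getUserMedia constraint. The following is a minimal sketch of that request, not the sample's actual source: the function names and the video dimensions are assumptions made for illustration. `openMicrophone` only runs in a browser.

```javascript
// Build getUserMedia constraints for a given checkbox state.
// `exact` makes the request fail rather than silently ignore the setting.
// The video dimensions are an assumption for illustration only.
function buildConstraints(echoCancellation) {
  return {
    audio: { echoCancellation: { exact: echoCancellation } },
    video: { width: 1280, height: 720 },
  };
}

// Browser-only helper (hypothetical name): request the stream and report
// what the audio track says was actually applied.
async function openMicrophone(echoCancellation) {
  const stream = await navigator.mediaDevices.getUserMedia(
    buildConstraints(echoCancellation)
  );
  const [track] = stream.getAudioTracks();
  return { stream, applied: track.getSettings().echoCancellation };
}
```

Note that a track reporting the setting as applied only says the canceller is switched on; it says nothing about which playback sources the canceller uses as its reference signal.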

@mattemoore

+1 on this.

@ahmadhassanch

+1 on this

@rikzin

rikzin commented Apr 26, 2020 via email

@ptesavol

ptesavol commented May 4, 2020

  1. Start playing any YouTube in another Chrome tab. Make sure the audio is played from speaker.

I can confirm this problem, it would be nice to find a solution.

The problem is there even if the video plays in the same tab in a video tag, so it is not specific to the YouTube video running in another tab.

I came across this problem in my own code when recording audio with MediaRecorder while playing video/audio using MSE in the same tab. Could this have to do with the video being played back using MSE? One could assume YouTube uses MSE as well?

@stephenlb

stephenlb commented May 7, 2020

Also confirmed. Using echoCancellation: true constraint. This is labeled as supported in Chrome, etc. However it does not reduce echo.
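
To see what the browser itself claims here, one option is to compare `navigator.mediaDevices.getSupportedConstraints()` with the `getSettings()` of the captured audio track. The helper below is a sketch with a hypothetical name; it only interprets those two objects and does not measure whether any echo is actually removed.

```javascript
// Summarize what the browser reports about echo cancellation.
// supported: result of navigator.mediaDevices.getSupportedConstraints()
// settings:  result of audioTrack.getSettings()
function describeEchoCancellation(supported, settings) {
  if (!supported.echoCancellation) {
    return "constraint not recognized by this browser";
  }
  if (settings.echoCancellation === undefined) {
    return "constraint recognized, but the track does not report a value";
  }
  return settings.echoCancellation
    ? "track reports echo cancellation enabled"
    : "track reports echo cancellation disabled";
}

// Browser usage (not executed here):
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
//   const track = stream.getAudioTracks()[0];
//   console.log(describeEchoCancellation(
//     navigator.mediaDevices.getSupportedConstraints(), track.getSettings()));
```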

@rikzin

rikzin commented May 7, 2020 via email

@stephenlb

stephenlb commented May 8, 2020

Why is it supposed to mute audio from another source? It was not developed as subtractive noise cancellation of YouTube audio. Echo cancellation is for the feedback loop between the microphone input and the speaker output.

On Thu, May 7, 2020, 10:53 Stephen Blum wrote: Also confirmed. Using echoCancellation: true constraint. This is labeled as supported in Chrome, etc.

Agreed. This seems like how it should work, right? Audio from the speakers is being sent back through the microphone, echoing on the recipient's device. In a voice conversation, each call party member hears their own voice echo. "Echo Cancellation" should remove the echo.

@chrbsg

chrbsg commented May 11, 2020

Chrome (both desktop and Android) does not support echo cancellation of any non-webrtc audio - e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=687574 - the only audio that is cancelled is audio received on the RTCPeerConnection audio track.

Safari (both iOS and MacOS) and Android Firefox do cancel non-webrtc audio. Desktop Firefox does not cancel non-webrtc audio (at least on my machine).

Update: apparently Firefox does echo cancellation if the microphone and playback are part of the same audio graph at the native frequency of the audio output - the implicit conversion done by an AudioContext does not count for this purpose - so to get echo cancellation working fully requires explicitly converting playback samples to match the sample rate of the audio output device in Javascript before playback, e.g. by compiling Xiph's resample.c to WASM and using that.
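
As a rough illustration of "explicitly converting playback samples" to the output rate, here is a minimal linear-interpolation resampler. It is a stand-in chosen for brevity, not Xiph's resampler and not what Firefox uses; linear interpolation is noticeably lower quality, and the function name is hypothetical.

```javascript
// Resample mono PCM from `fromRate` to `toRate` by linear interpolation.
// Illustration only: a production path would use a proper band-limited
// resampler (such as the Xiph one mentioned above, compiled to WASM).
function resampleLinear(input, fromRate, toRate) {
  if (fromRate === toRate) return Float32Array.from(input);
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1); // clamp at the end
    const frac = pos - left;
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}
```

In a browser, the target rate would typically be read from `new AudioContext().sampleRate`, which defaults to the output device's preferred rate.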

@stephenlb

Excellent. @chrbsg 👍 Thank you

@stephenlb

Just thinking that the browser should either not report "echoCancellation" as true or have a way to distinguish the level of scope.

@GerryWilko

Does anybody have a solution to this issue at the moment? It kind of makes WebRTC a bit pointless as a concept if we don't have a proper way to remove echo. I am experiencing severe spikes of screeching when testing our solution with two MacBooks.

Firefox, Safari and Chrome seem to have the same issue.

@chrbsg

chrbsg commented Sep 7, 2020

@GerryWilko The original bug report here was about cancelling echo from audio being played in other tabs (Youtube). If that is your use case, then there is no working, cross-platform solution. But echo cancellation works ok for cancelling the audio in calls (in general - there are some Android phones where the hardware or drivers didn't implement it properly). Try comparing https://appr.tc/?debug=loopback&audio=echoCancellation=true and https://appr.tc/?debug=loopback&audio=echoCancellation=false.

Just a thought, but if your test devices are in the same room, it's likely that there will be feedback between them. This won't be cancelled, and is one of the weaknesses of modern video conferencing (imagine several people around the same table entering the same chat at the same time).

@GerryWilko

@chrbsg thanks for that. I have been looking into our echo cancellation issue and it appears it was perhaps related to the audio sample rate. It seems that we could hack in a fix for the significant screeching we were experiencing by limiting the Opus bitrate to 8000 through manipulating the SDP sent and received by the clients.

Not ideal as the audio quality suffers but it solves it for now. My next task is to begin experimenting with the sample rates and work out how I can properly clear off that issue.

I think this was exacerbated by both sides using the same MacBook which both had default selected very high sample rates.
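
The comment does not say how the SDP was changed. One plausible way to cap the Opus bitrate is the `maxaveragebitrate` fmtp parameter (in bits per second) defined for the Opus RTP payload format; the sketch below assumes that approach, CRLF line endings, and a hypothetical function name.

```javascript
// Add or replace maxaveragebitrate on the Opus fmtp line of an SDP blob.
// Assumption: the commenter's exact change is unknown; this is one option.
function limitOpusBitrate(sdp, maxBitrate) {
  // Find the payload type mapped to Opus, e.g. "a=rtpmap:111 opus/48000/2".
  const rtpmap = sdp.match(/^a=rtpmap:(\d+) opus\/48000.*$/im);
  if (!rtpmap) return sdp; // no Opus in this SDP, leave it untouched
  const payloadType = rtpmap[1];
  const param = `maxaveragebitrate=${maxBitrate}`;
  const fmtpLine = new RegExp(`^(a=fmtp:${payloadType} )(.*)$`, "m");
  if (fmtpLine.test(sdp)) {
    // Keep the existing parameters, dropping any previous bitrate cap.
    return sdp.replace(fmtpLine, (_match, prefix, params) => {
      const kept = params
        .split(";")
        .map((p) => p.trim())
        .filter((p) => p !== "" && !p.startsWith("maxaveragebitrate="));
      return prefix + [...kept, param].join(";");
    });
  }
  // No fmtp line for Opus yet: add one right after the rtpmap line.
  return sdp.replace(
    rtpmap[0],
    () => `${rtpmap[0]}\r\na=fmtp:${payloadType} ${param}`
  );
}
```

It would be applied to the offer or answer SDP string before passing it to `setLocalDescription` or `setRemoteDescription`.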

@chrbsg apologies if I dropped into the wrong issue. I was struggling to find much out there about this issue and my initial thoughts were this was related to the echo cancellation.

Disclaimer: I'm new to much of this stuff so apologies if I am using the wrong terms for things :)

@FullstackJack

FullstackJack commented Oct 10, 2020

Since browsers don't seem to implement echoCancellation as meaning to cancel audio coming out from the speakers, does anyone know of source code, white papers, libraries, etc to explain how to cancel the audio from the speakers (i.e. YouTube) in the JavaScript layer? Is this even possible? In latest versions of WebRTC, we can get audio from desktop when requesting displays, perhaps this audio track can be used as source in cancellation?

It should be noted what Google says about their use of echoCancellation: "An echo canceller tries to remove any sound played out on the speakers from the audio signal that's picked up by the microphone." Does it remove the speaker signal from the mic input or does it remove the mic signal from the speaker? This is so confusing.

https://developers.google.com/web/updates/2017/12/disabling-hardware-noise-suppression

@chrbsg

chrbsg commented Dec 4, 2020

@FullstackJack it is not possible to read the audio that is being played by other arbitrary tabs in the browser. It would be a security risk if javascript code could snoop on a voice call in a completely different tab.

@gutmann-dev

@chrbsg thanks for that. I have been looking into our echo cancellation issue and it appears it was perhaps related to the audio sample rate. It seems that we could hack in a fix for the significant screeching we were experiencing by limiting the Opus bitrate to 8000 through manipulating the SDP sent and received by the clients.

Not ideal as the audio quality suffers but it solves it for now. My next task is to begin experimenting with the sample rates and work out how I can properly clear off that issue.

I think this was exacerbated by both sides using the same MacBook which both had default selected very high sample rates.

@chrbsg apologies if I dropped into the wrong issue. I was struggling to find much out there about this issue and my initial thoughts were this was related to the echo cancellation.

Disclaimer: I'm new to much of this stuff so apologies if I am using the wrong terms for things :)

Thanks! This is a great comment.

@rikzin

rikzin commented Dec 11, 2020 via email

@keichenblat

keichenblat commented Jan 11, 2021

What about echo cancellation of sounds played by <audio> (or even <video>) elements from the very same page where WebRTC is running?

For example: I have an element that plays sound effects. The sound effects are simple .ogg files, and I only want the user of the app to hear them (echoing them is annoying to the other participants)

@keval101

Just add video.volume = 0 when you access the camera and also when you start recording. Thank me later! It works for me.
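
A minimal sketch of this suggestion, assuming the element in question is the local preview of the user's own camera (the function name is hypothetical). Silencing the element only stops it from playing through the speakers; the stream's audio track, and anything recording from it, is unaffected. It does not help when the sound has to stay audible, as with a remote participant or a YouTube tab.

```javascript
// Attach a captured stream to a preview element without playing its audio
// through the speakers. `video` is an HTMLVideoElement in the browser.
function attachSilentPreview(video, stream) {
  video.srcObject = stream;
  video.muted = true; // silences playback only; the audio track is untouched
  video.volume = 0;   // the setting suggested above; redundant once muted
  return video;
}
```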

@Llorx

Llorx commented Oct 24, 2021

Just add video.volume = 0 when you access the camera and also when you start recording. Thank me later! It works for me.

How do you hear the audio if you set the volume to 0?

@peterzanetti

Here is another valid scenario: I am trying to do transcription (voice-to-text) during a WebRTC call. Works fine except for the fact that there is no AEC outside of the call itself. Despite using getUserMedia() with audio constraints for echoCancellation and noiseSuppression, it has no effect on the transcription, so the remote party's voice coming through the speakers is picked up and transcribed as having come from the recipient instead of the sender (actually both at the same time).

@has-n

has-n commented Feb 2, 2022

Chrome (both desktop and Android) does not support echo cancellation of any non-webrtc audio - e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=687574 - the only audio that is cancelled is audio received on the RTCPeerConnection audio track.

Safari (both iOS and MacOS) and Android Firefox do cancel non-webrtc audio. Desktop Firefox does not cancel non-webrtc audio (at least on my machine).

Update: apparently Firefox does echo cancellation if the microphone and playback are part of the same audio graph at the native frequency of the audio output - the implicit conversion done by an AudioContext does not count for this purpose - so to get echo cancellation working fully requires explicitly converting playback samples to match the sample rate of the audio output device in Javascript before playback, e.g. by compiling Xiph's resample.c to WASM and using that.

@chrbsg sorry for necroing an old thread but wanted to pick your brain. Do you think the WASM method would work for AEC instead of resampling?

@chrbsg

chrbsg commented Feb 2, 2022

@has-n Do you mean implementing AEC in WASM? I'm not convinced it would do anything more than the existing desktop Chrome AEC. Your code has no access to audio produced by other tabs, so the microphone will still pick it up and send it to the other side (which was the original problem that this issue was created for - playing YouTube in another tab, and expecting the audio to be cancelled after being picked up by the microphone). Also there's the potential issue that AEC is computationally intensive, at least libwebrtc AEC3 is.

The WASM resampling was specifically recommended by Paul Adenot of Mozilla, to work around the fact that Firefox's AEC only works if the input and output audio are part of the same audio graph, using a single sample rate:

Effectively what I think is happening is that you're ending up setting up two audio graphs inside Firefox by asking for a specific sample-rate on one side and doing a getUserMedia on the other side, and never connecting them. If you try to connect them, there will probably be an error. We're NOT mixing those unrelated graphs yet, so the AEC has a silent reverse stream and it fails completely.

This is being fixed as we speak in multiple ways:

  • We're implementing adaptive resampling and drift compensation to be able to connect unrelated graphs. This will have latency and performance implications. It takes a long time because it's non trivial at the latency figures and quality we need to not regress: Firefox is being used in production by music professionals and we can't justify increasing audio latencies to bring new features.
  • We're investigating using an OS-provided reverse stream (bug is old but the priority has been increased internally because everybody is using webrtc these days) on OSes where it's possible (available and partly implemented on Windows and Linux Desktop).

That said, and without trying to minimize the fact that you're clearly facing a Firefox limitation, it's more efficient (and always will be) to decode and resample the incoming Opus stream into a web worker and to play this out using an AudioWorklet (regardless of our changes). If that's of interest to you I have production ready javascript code that implements a lot of what you need for this (very liberally licensed, that shouldn't be a problem).

If the default sample-rate of the audio output device you're using is 48000Hz, it should work as is. This might be why you hear it working sometimes. You can check the default sample-rate on about:support, in the Media section.

Is it possible to resample a continuous stream with something like an OfflineAudioContext? The examples I've found only show resampling of fixed length audio buffers.

No, get yourself a resampler and use that, from a worker. AudioBufferSourceNode and OfflineAudioContext are well suited for non-live audio. You want to do playback using an AudioWorkletNode. I'd say, compile this to WASM (this is the resampler we use inside Firefox).

I'd be very interested in seeing your code for the opus decode Web Worker / AudioWorklet - we do want to get decoding done on another thread.

My code is about playback of audio content from a worker, and doesn't decode opus per se (it seems you have this part done). https://github.com/padenot/ringbuf.js has lots of docs and examples. It works in Chrome stable, and will work in Firefox 78 (released on the 30th of june) but you can try in Nightly (we're reenabling SharedArrayBuffer). With this, you can have one end of the ringbuffer in the AudioWorklet (basically copy/paste the example, it's dequeuing audio and playing it out), and the writing end is in a worker, that does Fetch calls, and decodes the opus, and resamples to the native audio rate.

If you need something working today, you can write a second path for when SharedArrayBuffer is not supported, using postMessage.
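
To make the worker-to-AudioWorklet arrangement described above more concrete, here is a minimal single-producer, single-consumer ring buffer over SharedArrayBuffer. It is a sketch of the idea only and does not reproduce the ringbuf.js API; the class and method names are hypothetical. In a real setup the two SharedArrayBuffers would be sent to the worker and the worklet with postMessage, and each side would wrap them in its own typed arrays.

```javascript
// Wait-free ring buffer for exactly one writer thread and one reader thread.
class FloatRingBuffer {
  constructor(capacity) {
    // One slot stays empty so that "full" and "empty" are distinguishable.
    this.capacity = capacity + 1;
    this.indices = new Uint32Array(new SharedArrayBuffer(8)); // [read, write]
    this.data = new Float32Array(new SharedArrayBuffer(4 * this.capacity));
  }

  availableRead() {
    const r = Atomics.load(this.indices, 0);
    const w = Atomics.load(this.indices, 1);
    return (w - r + this.capacity) % this.capacity;
  }

  availableWrite() {
    return this.capacity - 1 - this.availableRead();
  }

  // Writer side (e.g. the decoding worker). Returns samples actually written.
  push(samples) {
    const n = Math.min(samples.length, this.availableWrite());
    let w = Atomics.load(this.indices, 1);
    for (let i = 0; i < n; i++) {
      this.data[w] = samples[i];
      w = (w + 1) % this.capacity;
    }
    Atomics.store(this.indices, 1, w); // publish only after the data is in place
    return n;
  }

  // Reader side (e.g. AudioWorkletProcessor.process). Returns samples read.
  pop(out) {
    const n = Math.min(out.length, this.availableRead());
    let r = Atomics.load(this.indices, 0);
    for (let i = 0; i < n; i++) {
      out[i] = this.data[r];
      r = (r + 1) % this.capacity;
    }
    Atomics.store(this.indices, 0, r);
    return n;
  }
}
```

When `pop` returns fewer samples than requested, the reader would normally fill the rest of its output block with silence rather than wait.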

(The original problem that I had was to play an Opus stream with precise control over timing, which meant I couldn't just use the browser to play an RTP audio track. I solved this by compiling libopus to WASM, sending the Opus over a WebRTC datachannel, using libopus to decode the Opus frames, and playing out the resulting PCM using Web Audio. The problem was that Chrome does AEC on the WebRTC RTP tracks and not Web Audio. This was solved by adapting Alex Ciarlillo's loopback hack. This is inefficient, especially for Firefox, where the above solution would be better, but it works as a last resort on both Chrome and Firefox.)

@rafalsk

rafalsk commented Apr 20, 2022

Chrome (both desktop and Android) does not support echo cancellation of any non-webrtc audio - e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=687574 - the only audio that is cancelled is audio received on the RTCPeerConnection audio track.
Safari (both iOS and MacOS) and Android Firefox do cancel non-webrtc audio. Desktop Firefox does not cancel non-webrtc audio (at least on my machine).
Update: apparently Firefox does echo cancellation if the microphone and playback are part of the same audio graph at the native frequency of the audio output - the implicit conversion done by an AudioContext does not count for this purpose - so to get echo cancellation working fully requires explicitly converting playback samples to match the sample rate of the audio output device in Javascript before playback, e.g. by compiling Xiph's resample.c to WASM and using that.

@chrbsg sorry for necroing an old thread but wanted to pick your brain. Do you think the WASM method would work for AEC instead of resampling?

(..) noise cancellation for WebRTC streams used to work fine for us, but that is no longer the case in recent versions of Chromium.

@phsultan

phsultan commented Jun 9, 2022

This Chrome flag seems to address the issue of extending the scope of audio sources for AEC:
chrome://flags/#chrome-wide-echo-cancellation

Run WebRTC capture audio processing in the audio process instead of the renderer processes, thereby cancelling echoes from more audio sources. – Mac, Windows, Linux, Lacros

Testing a similar use case as yours @peterzanetti, a WebRTC call with the audio from participants being processed by the Chrome's WebSpeech API (where I assume AEC is activated). Without the flag enabled, audio from remote participants is fed back to the local WebSpeech API, which results in two transcripts for the same audio from two participants. And this is fixed if the flag is enabled.

@peterzanetti

That's very interesting, I'm going to test this. Strange that this flag even exists, and isn't enabled by default. I can't imagine the value of it being disabled.

@peterzanetti

Tested this today and unfortunately did not work for me.

@theicfire

fwiw using chrome://flags/#chrome-wide-echo-cancellation does work for me. I'm on MacOS. Seems this thread is related to it: https://bugs.chromium.org/p/chromium/issues/detail?id=1215049

@theicfire

theicfire commented Jul 28, 2022

This is perhaps a naive question but how does one test echo cancellation working at all? That is, without having two computers. Even in the case that browsers support.

There's a great Firefox blog post that links to this fiddle. In Firefox, the echo cancellation modification is significant. It infrequently gets caught in a feedback cycle. In Chrome, feedback cycles happen constantly.

Note that the checkboxes don't work in Chrome b/c Chrome doesn't support the applyConstraints API. But by default echo cancellation is on, so I would think it could work as well as Firefox.

But maybe audio feedback cycles are different than echo cancellation?

Here's another example with echoCancellation on. It works far better on Firefox and seems to have no effect on Chrome.
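
One rough way to put a number on this with a single computer is to record the same speaker playback twice, once with echoCancellation on and once with it off, and compare the signal level of the two recordings. The helpers below are a sketch under that assumption (function names are hypothetical); the level difference is only a crude proxy and says nothing about how speech quality is affected.

```javascript
// RMS level of PCM samples in dB relative to full scale (1.0 = 0 dBFS).
function rmsDbfs(samples) {
  if (samples.length === 0) return -Infinity;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  const rms = Math.sqrt(sumSquares / samples.length);
  return 20 * Math.log10(rms); // -Infinity for pure silence
}

// Positive result: the recording made with echo cancellation is quieter.
function echoReductionDb(withoutAec, withAec) {
  return rmsDbfs(withoutAec) - rmsDbfs(withAec);
}
```

In a browser, the samples could come from decoding each recording with `AudioContext.decodeAudioData` and reading `getChannelData(0)` from the resulting buffer.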

@peterzanetti

I tested it in my video conferencing app using 2 computers using Chrome in different locations. Aside from video and audio, there is transcription being done with web speech API.

The simple way to test whether this flag "works" or not is to make sure the audio devices (input and output) are separate devices (not a headset or a hardware device with its own echo cancellation), like a webcam's mic + computer speakers. With this configuration, if there is no native AEC happening, the transcript will get messed up because User A's microphone will pick up User B's voice coming through the speakers and transcribe the voice that it is hearing. So the effect is that User B's speech is transcribed twice, once from User B's mic and once from User A's mic. The only way to avoid this would be for AEC to actually cancel the audio that the microphone is picking up from the computer speakers, just as it does for WebRTC audio.

If there was no AEC for the WebRTC audio channel, then it would be worse of course because the users would hear echo. But of course there is AEC for the WebRTC audio.

@theicfire

theicfire commented Jul 28, 2022

Yeah I was specifically curious how to test it without having two computers.

But indeed, with two computers I found a random app to use: https://p2p.mirotalk.com. The source is not webpack'd or anything, so it's easy to go into the source and change echoCancellation to false (using Chrome local overrides)

@sungongwei

fwiw using chrome://flags/#chrome-wide-echo-cancellation does work for me. I'm on MacOS. Seems this thread is related to it: https://bugs.chromium.org/p/chromium/issues/detail?id=1215049

This Chrome flag seems to address the issue of extending the scope of audio sources for AEC: chrome://flags/#chrome-wide-echo-cancellation

Run WebRTC capture audio processing in the audio process instead of the renderer processes, thereby cancelling echoes from more audio sources. – Mac, Windows, Linux, Lacros

Testing a similar use case as yours @peterzanetti, a WebRTC call with the audio from participants being processed by the Chrome's WebSpeech API (where I assume AEC is activated). Without the flag enabled, audio from remote participants is fed back to the local WebSpeech API, which results in two transcripts for the same audio from two participants. And this is fixed if the flag is enabled.

Works for me.

@peterzanetti

This Chrome flag seems to address the issue of extending the scope of audio sources for AEC: chrome://flags/#chrome-wide-echo-cancellation

Run WebRTC capture audio processing in the audio process instead of the renderer processes, thereby cancelling echoes from more audio sources. – Mac, Windows, Linux, Lacros

Testing a similar use case as yours @peterzanetti, a WebRTC call with the audio from participants being processed by the Chrome's WebSpeech API (where I assume AEC is activated). Without the flag enabled, audio from remote participants is fed back to the local WebSpeech API, which results in two transcripts for the same audio from two participants. And this is fixed if the flag is enabled.

Is there anything special you had to do to get it working? I don't understand why it doesn't work for me.

@ianido

ianido commented Sep 8, 2022

@peterzanetti did you find a way to cancel incoming audio? I am doing conversational transcription and I am having the same issue. The flag chrome://flags/#chrome-wide-echo-cancellation works if my audio is running in another Chrome tab, but I need to use an iPad, and I have no idea how to disable this on an iPad.

@peterzanetti

No, I have yet to see this flag have any impact on my described use case.

@fmonterogit

Just add video.volume = 0 when you access the camera and also when you start recording. Thank me later! It works for me.

Simple and great idea. If you are recording a video of yourself, the volume doesn't matter.

@tiennguyen1293

So interested!
