What is the difference between offline and online? #40

Open
yingfenging opened this issue Dec 31, 2019 · 5 comments

Comments

@yingfenging

Hi, what is the difference between the offline and online implementations? In which scenarios should each type of code be used? Thank you.

@LukasDrude
Member

Dear Yingfenging,

If you need a dereverberation result as early as possible, e.g., a result for the beginning of the utterance before the speaker stops talking, you want to use the online implementation. A good example is the Google Home, where, according to their paper, an online WPE implementation is used.

However, if you can wait for the entire utterance to be recorded and you have time to perform dereverberation afterwards, you want the offline implementation. A typical example may be meeting annotation after the meeting is over.
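
The practical difference can be sketched with a toy example. The code below is not WPE and not code from this repository: it uses a simple power normalization as a stand-in, and the function names and the smoothing factor `alpha` are assumptions made for illustration. The offline variant needs every frame before it can produce any output, while the online variant updates its statistics recursively and emits each frame as soon as it arrives.

```python
import numpy as np


def normalize_offline(frames):
    """Offline: statistics are computed from the whole utterance at once."""
    frames = np.asarray(frames, dtype=float)
    mean_power = np.mean(frames ** 2)  # requires all frames
    return frames / np.sqrt(mean_power + 1e-10)


def normalize_online(frame_stream, alpha=0.9):
    """Online: statistics are updated recursively, one frame at a time."""
    power = None
    for frame in frame_stream:
        frame = np.asarray(frame, dtype=float)
        frame_power = np.mean(frame ** 2)
        if power is None:
            power = frame_power  # initialize with the first frame
        else:
            power = alpha * power + (1 - alpha) * frame_power
        # The output for this frame is available before later frames exist.
        yield frame / np.sqrt(power + 1e-10)
```

The online variant trades estimation quality for latency: early frames are processed with statistics based on very little data, which is one reason an online method may gain less than its offline counterpart.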

@yingfenging
Author

OK, thank you. I noticed that the paper (https://pdfs.semanticscholar.org/eac4/1f1625bd1cf2e4981e565664a5a3b1e015a7.pdf) reports the performance for 2 to 8 input channels but does not mention a single channel. Does this program support single-channel mode (one input, one output)? What is the performance with a single channel?
Thank you.

@LukasDrude
Member

Dear Yingfenging,

WPE supports single-channel processing. However, WPE performs better the more channels you have available. In [1], Table 1 compares WPE for 1 up to 8 channels on primarily reverberated data, which may give you more insight. Table 2 shows the same analysis on primarily noisy data. There, single-channel WPE hardly helps anymore, and additional denoising becomes mandatory.

[1] https://groups.uni-paderborn.de/nt/pubs/2018/INTERSPEECH_2018_Drude_Paper.pdf
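
To make the single-channel case concrete, here is a minimal NumPy sketch of offline WPE for one frequency bin. It is not the implementation from this repository: the function name `wpe_one_bin`, its default values for `taps`, `delay`, and `iterations`, and the regularization constants are assumptions made for illustration. The channel dimension can be 1, in which case the filter predicts the late reverberation from delayed frames of that one channel only.

```python
import numpy as np


def wpe_one_bin(Y, taps=5, delay=2, iterations=3, eps=1e-10):
    """Offline WPE sketch for a single frequency bin.

    Y: STFT observations of shape (channels, frames).
    Returns the dereverberated estimate with the same shape.
    """
    D, T = Y.shape
    if T <= delay + taps:
        raise ValueError("need more frames than delay + taps")

    # Stack delayed copies of the observation: shape (D * taps, T).
    Y_tilde = np.zeros((D * taps, T), dtype=Y.dtype)
    for k in range(taps):
        shift = delay + k
        Y_tilde[k * D:(k + 1) * D, shift:] = Y[:, :T - shift]

    X = Y.copy()
    for _ in range(iterations):
        # Time-varying power of the current estimate, averaged over channels.
        power = np.maximum(np.mean(np.abs(X) ** 2, axis=0), eps)
        weighted = Y_tilde / power
        R = weighted @ Y_tilde.conj().T  # (D * taps, D * taps)
        P = weighted @ Y.conj().T        # (D * taps, D)
        # Small diagonal loading keeps the linear system well conditioned.
        R = R + 1e-8 * np.trace(R).real / (D * taps) * np.eye(D * taps)
        G = np.linalg.solve(R, P)
        # Subtract the predicted late reverberation from the observation.
        X = Y - G.conj().T @ Y_tilde
    return X
```

With one channel, `Y` has shape `(1, frames)` and the stacked vector holds only `taps` entries per frame, so the filter has far fewer degrees of freedom than in the multi-channel case.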

@yingfenging
Author

Hi,
Thank you very much for your answer. I have a few more questions.
1. In paper [1], Tables 1 and 2: is the WPE used there offline WPE or frame-online WPE? According to Section 3 (Baseline: WPE), I think it is offline WPE. Is that right?
2. In paper [2], Table 3, the 2-channel online WPE has a relative WER reduction of 1.14% over the unprocessed system. Does this mean that the performance of 1-channel or 2-channel online dereverberation is poor?
3. In paper [2], Figure 3: why is the real-time factor higher the more channels are used? I would have thought that more channels mean more computation and therefore a lower real-time factor.

Thank you.

[1] https://groups.uni-paderborn.de/nt/pubs/2018/INTERSPEECH_2018_Drude_Paper.pdf
[2] https://pdfs.semanticscholar.org/eac4/1f1625bd1cf2e4981e565664a5a3b1e015a7.pdf

@LukasDrude
Member

Dear @yingfenging

Re: 1.
Indeed, all systems in [1] are offline systems. I referenced [1] to illustrate the importance of using more than one channel and how much you should expect to lose with just one channel. However, the same trend holds for online WPE: the more channels you use, the higher the expected gains.

Re: 2.
Yes. When using the simple "smoothing" strategy, 1-channel and 2-channel online dereverberation have a very limited effect. Compare [3]: there, the authors argue that online WPE has a limited but consistent effect on the WER of their Google Home device.

Re: 3.
In [4], Table 1, the real-time factor is defined as follows: Real-time Factor (CPU Time / Audio Time).
Consequently, the real-time factor is higher when more computation is needed, which aligns with our figure in [2].
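
As a small illustration of that definition, the sketch below computes the real-time factor from a CPU time and an audio duration. The helper names `real_time_factor` and `measure_rtf` are made up for this example and are not part of this repository; `time.process_time` is used here as one possible way to measure CPU time.

```python
import time


def real_time_factor(cpu_seconds, audio_seconds):
    """Real-time factor: CPU time divided by audio time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return cpu_seconds / audio_seconds


def measure_rtf(process, audio, sample_rate):
    """Time `process(audio)` and relate it to the duration of `audio`."""
    start = time.process_time()
    process(audio)
    cpu_seconds = time.process_time() - start
    return real_time_factor(cpu_seconds, len(audio) / sample_rate)
```

For example, 2 seconds of CPU time for 4 seconds of audio gives a factor of 0.5, while 8 seconds of CPU time for the same audio gives 2.0. More computation, for instance from more channels, pushes the factor up, and a value below 1 means processing is faster than real time.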

Best wishes
Lukas

[1] https://groups.uni-paderborn.de/nt/pubs/2018/INTERSPEECH_2018_Drude_Paper.pdf
[2] https://pdfs.semanticscholar.org/eac4/1f1625bd1cf2e4981e565664a5a3b1e015a7.pdf
[3] https://pdfs.semanticscholar.org/d07b/7131bc7690e5da61e0bc806def6ffd713968.pdf
[4] https://www.isca-speech.org/archive/archive_papers/icslp_2002/i02_1741.pdf
