VAD in example doesn't seem to work #70

Open
p-e-w opened this issue May 9, 2022 · 0 comments

Comments

p-e-w commented May 9, 2022

When running full_example.py, the speech recognition itself works fine, but the VAD iterator completely fails to detect voice activity, distinguishing only between "sound" and "silence".

My understanding is that audio_iterator should yield a block of audio data if the input contains voice, and None otherwise. If so, this doesn't work on my system. As long as there is any sound being recorded by the microphone at all, the iterator yields audio blocks. I have tested this with snapping my fingers, scratching on the desk, even the background noise of a ceiling fan running – they all cause the iterator to produce blocks. Only virtually total silence produces None.
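To make the contract I expect concrete, here is a minimal sketch of how I read the API. This is illustrative only, not the actual code from full_example.py; the consume helper, the block type, and the prints are mine:

```python
from typing import Iterator, Optional

import numpy as np


def consume(audio_iterator: Iterator[Optional[np.ndarray]]) -> None:
    """Illustrates the contract I expect from the example's VAD iterator."""
    for block in audio_iterator:
        if block is None:
            # No voice in this chunk: silence *or* non-speech sound
            # (finger snaps, desk scratching, fan noise, ...).
            print("no voice")
        else:
            # Voice detected: block carries the audio data for this chunk.
            print(f"voice: {len(block)} samples")
```

On my system, the else branch fires for any sound at all, and the None branch only in near-total silence.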

As a result, the end of a phrase isn't detected unless the room is very, very quiet. I have made multiple test recordings with the same microphone setup and found them to be clear and free of additional noise. Yet as soon as there is any input above a certain threshold, even if it is obviously non-human in origin, it is classified as voice. A modern VAD should be able to do much better.
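For comparison, here is roughly what I would expect from a modern neural VAD, using Silero VAD as a stand-in (this follows the snakers4/silero-vad torch.hub usage pattern and is not this project's code; 'test.wav' is a placeholder for any 16 kHz mono recording):

```python
import torch

# Load the Silero VAD model and its helper functions from torch.hub
# (network access is required on the first run).
model, utils = torch.hub.load('snakers4/silero-vad', 'silero_vad')
get_speech_timestamps, save_audio, read_audio, VADIterator, collect_chunks = utils

# Read a 16 kHz mono recording.
wav = read_audio('test.wav', sampling_rate=16000)

# Returns only the spans classified as speech, so a recording containing
# nothing but finger snaps or fan noise should come back empty.
speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=16000)
print(speech_timestamps)
```

That is the kind of speech/non-speech discrimination I would expect the example's VAD to provide as well.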

Is this actually working for you? What could be the reason for the VAD to fail so completely?
