[FR] Preprocess voice in the client and only send data when spoken #15

HarikalarKutusu · 2022-04-05T07:24:57Z

Currently, all audio is streamed continuously, which also hogs the server. A normal chess game has 30-50 moves per player (60-100 in total) and lasts 20-120 minutes. Assuming a 30-minute game with 40 moves per player and 4 seconds per spoken command, each player speaks for about 160 of 1,800 seconds, so less than 10% of the streamed audio is relevant.

If we can implement this:

- Silence, background noise, etc. will be filtered out.
- Only relevant data will be sent, so there is less traffic on both sides.
- We can use the same server for multiple connections by queuing incoming packets. In this case, a server can become a "language server" with a capacity of (say) 5-10 connections/inferences, which could be adaptive. This is the third option described in [FR] Increase server resources - the scaling problem #4.
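The queuing idea in the last point could look roughly like the sketch below. It is only an illustration under assumptions: `serve`, `Packet` and the `infer` callback are hypothetical names, not part of the existing server, and a real implementation would receive packets from network connections rather than from a list.

```python
import queue
import threading
from typing import Callable, Iterable, List, Tuple

# Hypothetical packet shape: (client id, audio bytes for one spoken command).
Packet = Tuple[str, bytes]


def serve(
    packets: Iterable[Packet],
    infer: Callable[[bytes], str],
    workers: int = 2,
) -> List[Tuple[str, str]]:
    """Run infer() on every packet with a fixed number of worker threads.

    `workers` plays the role of the server's inference capacity: packets
    from any number of clients wait in one queue until a worker is free.
    """
    inbox: "queue.Queue" = queue.Queue()
    results: List[Tuple[str, str]] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            item = inbox.get()
            if item is None:  # sentinel: no more work for this worker
                break
            client_id, audio = item
            text = infer(audio)
            with lock:
                results.append((client_id, text))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for packet in packets:
        inbox.put(packet)
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Because clients only send short utterances, a small fixed pool can serve several games at once; making `workers` adaptive would be a later step.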

This can be achieved by measuring the sound level/energy with pre-buffering.
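A minimal sketch of such an energy gate follows, written in Python for readability even though the real client may use another language. Everything in it is an assumption for illustration: the function names, the RMS threshold of 0.02, and the frame counts are placeholders that would need tuning against real microphone input. The pre-buffer keeps the last few quiet frames so the start of a command is not cut off, and a short "hang" period keeps the gate open across brief pauses.

```python
import math
from collections import deque
from typing import Deque, Iterable, Iterator, List, Sequence

# One short chunk of mono samples, assumed to be floats in [-1.0, 1.0].
Frame = Sequence[float]


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def gate_speech(
    frames: Iterable[Frame],
    threshold: float = 0.02,  # assumed value, needs tuning
    pre_frames: int = 3,      # quiet frames kept before speech starts
    hang_frames: int = 5,     # quiet frames tolerated before closing
) -> Iterator[List[Frame]]:
    """Yield one list of frames per detected utterance; drop the rest."""
    pre_buffer: Deque[Frame] = deque(maxlen=pre_frames)
    utterance: List[Frame] = []
    quiet_run = 0
    for frame in frames:
        loud = rms(frame) >= threshold
        if utterance:
            # Gate is open: keep recording until enough quiet frames pass.
            utterance.append(frame)
            quiet_run = 0 if loud else quiet_run + 1
            if quiet_run >= hang_frames:
                yield utterance
                utterance = []
                quiet_run = 0
        elif loud:
            # Gate opens: prepend the buffered frames before the onset.
            utterance = list(pre_buffer) + [frame]
            pre_buffer.clear()
        else:
            # Gate is closed: remember only the most recent quiet frames.
            pre_buffer.append(frame)
    if utterance:
        yield utterance  # stream ended while the gate was still open
```

Only the yielded utterances would be sent to the server. A plain energy threshold will also pass loud non-speech noise, so it is a starting point rather than a full voice activity detector.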

One downside: the client will be processing audio continuously, which could hurt battery life for future mobile users.

HarikalarKutusu added the enhancement (New feature or request), backend (Server related), and frontend (Client related) labels Apr 5, 2022
