Rhasspy (2.4) 'no' word recognition problems especially with female voices #242

michelepanegrossi · 2020-06-19T00:47:20Z

Hello

I am experiencing an issue where Rhasspy has a really hard time recognizing the word 'no', especially when it is spoken by a female voice (sometimes a 0% recognition rate over 10 utterances).

My sentences.ini file:

[yesAnswer]
yes
yep
yeah
yes please

[noAnswer]
no
no thanks
no thank you

'no thanks' and 'no thank you' work a lot better, even with female voices.

My setup is as follows: Rhasspy is running as a service on my Raspberry Pi 4. I have a separate Python 3 script that controls Rhasspy via the HTTP API, receives the transcript, and forwards it to another machine. This script also runs at startup as a service.

I send messages to that script and ask it to query Rhasspy over HTTP.

This is the log of one of my requests (newest entries first):

[INFO:304795] quart.serving: 127.0.0.1:39554 POST /api/stop-recording 1.1 200 292 215383
[DEBUG:304792] InboxActor: -> stopped
[DEBUG:304789] main: {"intent": {"name": "noAnswer", "confidence": 1.0}, "entities": [], "text": "no", "raw_text": "no", "recognize_seconds": 0.00034929599996758043, "tokens": ["no"], "raw_tokens": ["no"], "wav_seconds": 0.0, "transcribe_seconds": 0.0, "speech_confidence": 0.1044066508421162, "slots": {}, "wakeId": "", "siteId": "default"}
[DEBUG:304788] InboxActor: -> stopped
[DEBUG:304785] main: no
[DEBUG:304784] InboxActor: -> stopped
[DEBUG:304782] PocketsphinxDecoder: no
[DEBUG:304781] PocketsphinxDecoder: Transcription confidence: 0.1044066508421162
[DEBUG:304780] PocketsphinxDecoder: Decoded WAV in 0.18922734260559082 second(s)
[DEBUG:304589] PocketsphinxDecoder: rate=16000, width=2, channels=1.
[DEBUG:304585] main: Recorded 137324 byte(s) of audio data
[DEBUG:304584] InboxActor: -> stopped
[INFO:300307] quart.serving: 127.0.0.1:39550 POST /api/start-recording 1.1 200 2 7190
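
For reference, the request cycle in the log above (a POST to /api/start-recording followed by a POST to /api/stop-recording) could be driven by a script along these lines. This is only a minimal sketch, not the actual script from this setup. The base URL assumes Rhasspy's default web port on the same machine, the helper names `post`, `summarize` and `record_and_recognize` are made up for illustration, and the 4 second default simply approximates the recording in the log (137324 bytes at 16000 Hz, 2 bytes per sample, mono is roughly 4.3 seconds). It also assumes that the stop-recording response body is the same intent JSON that `main` logs.

```python
import json
import time
import urllib.request

# Assumption: Rhasspy's web server is reachable on its default port locally.
RHASSPY_URL = "http://localhost:12101"


def post(path, base_url=RHASSPY_URL, timeout=10.0):
    """Send an empty-bodied POST to Rhasspy and return the raw response body."""
    request = urllib.request.Request(base_url + path, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def summarize(body):
    """Pick out the fields of the intent JSON that matter for this issue."""
    result = json.loads(body)
    return {
        "intent": result.get("intent", {}).get("name", ""),
        "text": result.get("text", ""),
        "speech_confidence": result.get("speech_confidence"),
    }


def record_and_recognize(seconds=4.0):
    """Record for a fixed time, then ask Rhasspy to transcribe and recognize."""
    post("/api/start-recording")
    time.sleep(seconds)  # hypothetical fixed recording window
    return summarize(post("/api/stop-recording"))
```

Logging `speech_confidence` next to `text` for each attempt makes it easier to compare recognizers or speakers over a batch of utterances.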

What can I do to solve this issue?

At the moment I am using Pocketsphinx. I noticed that Rhasspy 2.5 now also supports DeepSpeech. Would I get a better result by switching to a different recogniser such as Kaldi or DeepSpeech?


keith721 · 2020-06-28T15:02:11Z

When I had word recognition problems in Rhasspy 2.4.x, I switched from Pocketsphinx to Kaldi, and things got much better. At the time, Rhasspy was having trouble differentiating between 'on' and 'off', which was quite a bother.

michelepanegrossi · 2020-06-28T15:42:52Z

Yes, I had the same experience! I switched to Rhasspy 2.5 too. I started with DeepSpeech, hoping that would be the answer, but after trying Kaldi I now think that engine is the best solution so far. The issue with 'yes' and 'no' seems solved.

However, I am now looking for a way to avoid triggering an intent when a random word is spoken. In my experience, Rhasspy tries to force an intent whenever any word is spoken, not just words that appear in the sentences.ini file. As a result, intents are triggered by random words.

Do you know if there is a way of having some kind of 'default' or 'fallback' intent to handle this kind of situation?

synesthesiam · 2020-07-07T19:37:32Z

See https://community.rhasspy.org/t/is-there-a-way-to-setup-a-default-or-fallback-intent/1198 for some ideas
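
One client-side option, which is not necessarily what the linked thread proposes, is to treat low-confidence results as a fallback in the script that receives the intent JSON. The sketch below assumes the JSON has the shape shown in the log earlier in this issue (`intent.name` and `speech_confidence`). The function name `resolve_intent`, the `"fallback"` label and the 0.5 threshold are hypothetical. Note that in that log a correctly recognized 'no' scored only about 0.10 with Pocketsphinx, so any threshold would need tuning against real recordings, and it is an open question whether other recognizers report comparable values.

```python
def resolve_intent(result, min_speech_confidence=0.5, fallback="fallback"):
    """Return the recognized intent name, or `fallback` if it looks unreliable."""
    intent = result.get("intent") or {}
    name = intent.get("name") or ""
    speech_confidence = result.get("speech_confidence")

    # No intent was matched at all.
    if not name:
        return fallback

    # The transcription itself was low confidence, so distrust the forced match.
    if speech_confidence is not None and speech_confidence < min_speech_confidence:
        return fallback

    return name
```

With the default threshold, the result from the log above would be rejected, which shows why the value has to be chosen empirically rather than copied from this sketch.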
