Text pre-intent parsing #3112

Open
NeonDaniel opened this issue Jun 3, 2022 · 2 comments

Comments

@NeonDaniel
Member

NeonDaniel commented Jun 3, 2022

Is your feature request related to a problem? Please describe.
Adding a plugin-based method for manipulating transcriptions from STT before passing them to the intent service would allow for co-reference resolution, number normalization, expanding contractions, translation, and any other parsing to help intent engines.

Describe the solution you'd like
This is implemented in Neon and the plugin base class is defined in neon-transformers. I think the simplest implementation is the one in Neon.
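
To make the idea concrete, here is a minimal sketch of a plugin chain that rewrites utterances before they reach the intent service. All names (`UtteranceTransformer`, `ContractionExpander`, `NumberNormalizer`, `run_transformers`) and the tiny lookup tables are hypothetical and for illustration only; this is not the neon-transformers API, just one way such a chain could look.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class UtteranceTransformer(ABC):
    """Hypothetical plugin base class; NOT the neon-transformers API."""

    priority = 50  # assumption: lower values run first

    @abstractmethod
    def transform(self, utterances: List[str], context: Dict) -> Tuple[List[str], Dict]:
        """Return the (possibly modified) utterances plus extra context."""


class ContractionExpander(UtteranceTransformer):
    priority = 10
    # Toy lookup table, for illustration only.
    CONTRACTIONS = {"what's": "what is", "don't": "do not", "i'm": "i am"}

    def transform(self, utterances, context):
        expanded = [
            " ".join(self.CONTRACTIONS.get(word, word) for word in u.lower().split())
            for u in utterances
        ]
        return expanded, {}


class NumberNormalizer(UtteranceTransformer):
    priority = 20
    # Toy lookup table, for illustration only.
    NUMBERS = {"one": "1", "two": "2", "three": "3", "ten": "10"}

    def transform(self, utterances, context):
        normalized = [
            " ".join(self.NUMBERS.get(word, word) for word in u.split())
            for u in utterances
        ]
        return normalized, {"numbers_normalized": True}


def run_transformers(utterances, plugins, context=None):
    """Run every plugin in priority order and merge the context they return."""
    context = dict(context or {})
    for plugin in sorted(plugins, key=lambda p: p.priority):
        utterances, extra = plugin.transform(utterances, context)
        context.update(extra)
    return utterances, context


utterances, context = run_transformers(
    ["What's ten plus two"], [NumberNormalizer(), ContractionExpander()]
)
print(utterances, context)
```

The intent service would then receive the transformed utterances and the merged context instead of the raw STT output.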

Describe alternatives you've considered
It might be more logical to have the parser service handle recognizer_loop:utterance and emit the result to the intent service (mycroft.utterance, mycroft.parsed_utterance?). This would allow for Messages to bypass text parsing if there was a reason to go straight to the intent service.
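
A rough sketch of this alternative, using a tiny in-process bus in place of the real message bus. `TinyBus`, `start_parser_service`, and `start_intent_service` are hypothetical names, and `mycroft.parsed_utterance` is only the tentative message type floated above, not an existing one.

```python
from collections import defaultdict


class TinyBus:
    """Hypothetical in-process stand-in for the message bus."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, msg_type, handler):
        self._handlers[msg_type].append(handler)

    def emit(self, msg_type, data):
        for handler in list(self._handlers[msg_type]):
            handler(data)


def start_parser_service(bus, parse):
    """Listen for raw STT output and re-emit it after parsing."""

    def handle_utterance(data):
        parsed = dict(data, utterances=[parse(u) for u in data["utterances"]])
        bus.emit("mycroft.parsed_utterance", parsed)

    bus.on("recognizer_loop:utterance", handle_utterance)


def start_intent_service(bus, received):
    """The intent service only listens for already-parsed utterances."""
    bus.on("mycroft.parsed_utterance", received.append)


bus = TinyBus()
seen = []
start_parser_service(bus, parse=str.lower)  # str.lower stands in for real parsing
start_intent_service(bus, seen)

# Normal path: STT output goes through the parser service first.
bus.emit("recognizer_loop:utterance", {"utterances": ["Turn ON the lights"]})
# Bypass path: a Message emitted directly to the intent service skips parsing.
bus.emit("mycroft.parsed_utterance", {"utterances": ["Already Parsed"]})
print(seen)
```

The second `emit` shows the bypass case: anything sent straight to the parsed-utterance message type reaches the intent service untouched.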

Additional context
Potential partial solution to #1221
This was discussed briefly in the forum https://community.mycroft.ai/t/proposal-for-organizing-functionality-in-mycroft-core/11519/7

@JarbasAl
Contributor

JarbasAl commented Jun 3, 2022

Unrelated to #1221; that's just Google STT blocking words because of the implementation on the Selene side. With the Chromium plugin the filtering can be disabled, so that code could be ported to Selene if desired.

This suggestion could be used for the opposite, though: censoring curse words in clean text. But it's hard to go from **** back to the source text.

@NeonDaniel
Member Author

> Unrelated to #1221; that's just Google STT blocking words because of the implementation on the Selene side. With the Chromium plugin the filtering can be disabled, so that code could be ported to Selene if desired.
>
> This suggestion could be used for the opposite, though: censoring curse words in clean text. But it's hard to go from **** back to the source text.

I should have elaborated: I meant that the filtering could be disabled in Selene and censoring implemented as a plugin instead. I assumed the rationale for filtering in Selene is to prevent Mycroft from transcribing curse words.
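
As a rough illustration of censoring done on the plugin side, assuming a placeholder word list and a hypothetical `censor` function (neither comes from Selene or Neon):

```python
import re

# Placeholder word list, for illustration only.
BLOCKED_WORDS = {"darn", "heck"}

_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, sorted(BLOCKED_WORDS))) + r")\b",
    re.IGNORECASE,
)


def censor(utterance):
    """Mask blocked words and return the original text alongside the result."""
    censored = _PATTERN.sub(lambda m: "*" * len(m.group(0)), utterance)
    # Keeping the uncensored text in the returned context is an assumption;
    # it avoids having to recover the source text from the asterisks later.
    return censored, {"original_utterance": utterance}


print(censor("Well heck, that is darn loud"))
```

Because the censoring happens after transcription, the uncensored text is still available to anything that needs it.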
