Mycroft's blocking of curse words `****` interferes with searches and other functionality #1221

InconsolableCellist · 2017-11-11T22:06:34Z

(For this bug report #### will indicate my own self-selected censorship, as I don't know the policies of this project regarding cursing. **** will indicate Mycroft censoring words.)

While testing a skill I realized that somewhere in the parsing of my input it’s turning detected curse words into asterisks, such as the query "#### you" being interpreted as “**** you”. This may be a reasonable default, but I want to play albums that contain explicit titles, and this feature breaks that functionality.

This seems to affect the core, not just a third party skill:

Steps to reproduce:

Say a curse word after the wake word. E.g., "hey Mycroft, #### you." (If you were searching for a song title you could be saying something like, "hey Mycroft, play #### the police by NWA")

Observed behavior:
Mycroft reports and interprets "#### you" as "**** you."

Expected behavior:
Mycroft doesn't censor curse words, as they're necessary for playing songs with explicit titles. Optionally, this should be a configurable and documented behavior.

16:50:56.682 - mycroft.client.speech.listener:transcribe:144 - DEBUG - STT: f*** you                               
16:50:56.682 - __main__:handle_utterance:55 - INFO - Utterance: [u'f*** you']


InconsolableCellist · 2017-11-11T22:07:19Z

(This is actually a real bug despite my newness to the project and its unusual nature.)

tjoen · 2017-11-13T11:21:55Z

I ran into this too. Would be much better to have it just as an option instead of as a default.

forslund · 2017-11-13T11:24:02Z

This was up for discussion last week, I think the conclusion was to make changes to allow this to be turned off. @matheuslima can you comment on this?

TodaysITSolutions · 2017-11-22T16:42:51Z

I am from Jersey and swear a lot, this is a problem for me too.

megaeverything · 2019-08-31T04:07:42Z

any progress on this issue? the censoring is really annoying.

krisgesling · 2019-09-02T04:28:24Z

Hey I wasn't around when this issue first got raised so wasn't part of those discussions, but this is actually the Google STT service that we use doing the censoring. Would need to see if there's a flag we can set on the requests to turn it off. If anyone knows already, please chime in.

KathyReid · 2020-10-04T01:22:02Z

From a very brief skim of this issue I've been able to determine the following:

Most STT services that Mycroft supports (with Google STT currently being the default) have a profanity_filter flag which is passed to the API.
In Mycroft's STT classes, this is set to false for the IBMWatson STT class as per: this line of code; however, this parameter does not appear to be set for the GoogleSTT class.
In the GoogleSTT class, this parameter does not appear to be set, and I think this is the root cause of this Issue. These are the docs for Google's STT - the parameter is called ProfanityFilter.
However, I don't think the answer is just to set profanity_filter to be false in the GoogleSTT class. I think that we should give users the ability to set this on a per-device basis, just as Wake Words and STT engines and TTS voices can be set on a per-device basis at: https://account.mycroft.ai/devices/
Therefore I think this requires changes to the Mycroft Home backend to have an ideal implementation.
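
For reference, here is a minimal sketch of where such a flag sits in a request to Google's Cloud Speech-to-Text v1 REST API, where the `RecognitionConfig` field is spelled `profanityFilter`. The helper name `build_recognize_request` and the encoding and sample-rate values are assumptions for illustration only. This is not Mycroft's code, and it only builds the request body without sending anything.

```python
import base64
import json


def build_recognize_request(audio_bytes, language="en-US", profanity_filter=False):
    """Build a JSON-serializable body for a v1 speech:recognize call.

    Hypothetical helper: the name and default values are illustrative.
    """
    return {
        "config": {
            "encoding": "LINEAR16",        # assumed audio encoding
            "sampleRateHertz": 16000,      # assumed sample rate
            "languageCode": language,
            # False asks the service not to mask profanities with asterisks.
            "profanityFilter": profanity_filter,
        },
        # The REST API expects the raw audio as a base64 string.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


body = build_recognize_request(b"\x00\x01" * 4, profanity_filter=False)
print(json.dumps(body["config"], indent=2))
```

A per-device setting, as suggested above, would then only need to decide which boolean is passed in as `profanity_filter`.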

What I tried to do as a workaround was implement a new self.config variable in mycroft.conf:

  // Profanity filter
  "profanity_filter": false,

This then requires support in the STT classes, i.e. this is what I tried in the STT base class, but it didn't work:

class STT(metaclass=ABCMeta):
    """ STT Base class, all  STT backends derives from this one. """
    def __init__(self):
        config_core = Configuration.get()
        self.lang = str(self.init_language(config_core))
        config_stt = config_core.get("stt", {})
        self.config = config_stt.get(config_stt.get("module"), {})
        self.credential = self.config.get("credential", {})
        self.recognizer = Recognizer()
        self.can_stream = False
        # set profanity filter
        self.profanity_filter = self.config.get('profanity_filter')

    @staticmethod
    def init_language(config_core):
        lang = config_core.get("lang", "en-US")
        langs = lang.split("-")
        if len(langs) == 2:
            return langs[0].lower() + "-" + langs[1].upper()
        return lang

    @abstractmethod
    def execute(self, audio, language=None, ProfanityFilter=self.profanity_filter):
        pass

(At this point my microphone stopped working with Mycroft for some strange reason, and nothing I did could get it to pick up the microphone again, so I couldn't continue testing)

This didn't work - the ProfanityFilter is still set to True, and the *** remain. But, this might be a clue for others who want to tackle this.
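
One likely problem with the attempt above is that Python evaluates default argument values when the function is defined, so `ProfanityFilter=self.profanity_filter` cannot see `self` at that point. Separately, `self.config` is the section for the active STT module, so a `profanity_filter` key placed elsewhere in mycroft.conf would come back as `None`. Below is a minimal sketch of an alternative, assuming the setting is read once in `__init__` and each backend consults `self.profanity_filter` inside `execute`. A plain dict stands in for Mycroft's configuration, and `EchoSTT` is a hypothetical backend that only reports what it would send.

```python
from abc import ABCMeta, abstractmethod


class STT(metaclass=ABCMeta):
    """Simplified stand-in for the STT base class (not Mycroft's actual code)."""

    def __init__(self, config=None):
        # Assumption: `config` is the already-resolved section for the
        # active STT module; a plain dict is used here for illustration.
        self.config = config or {}
        # Default to False (no censoring) when the key is absent.
        self.profanity_filter = bool(self.config.get("profanity_filter", False))

    @abstractmethod
    def execute(self, audio, language=None):
        """Backends read self.profanity_filter when building their request."""


class EchoSTT(STT):
    """Hypothetical backend that returns the options it would send."""

    def execute(self, audio, language=None):
        return {
            "language": language or "en-US",
            "profanity_filter": self.profanity_filter,
        }


print(EchoSTT({"profanity_filter": False}).execute(b""))
```

Keeping the flag on the instance leaves the `execute` signature unchanged, so existing callers would not need to pass anything new.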

forslund · 2020-10-04T15:49:53Z

I tested the Google STT module and the profanity filter seems to be off by default, but using it requires having your own Google Cloud account.

The API used by the Mycroft backend (which is not the google cloud Speech to Text service, but another of Google's older APIs) always has it enabled and doesn't allow turning it off if I recall correctly.

A config setting is probably a good idea though. Default should be off in my opinion.

JarbasAl · 2021-11-17T01:21:43Z

this issue should probably be moved to the selene repo, since that's where STT happens

this setting is not supported by the speech recognition package, but can be enabled/disabled if you use the api directly

to disable profanity filter see https://github.com/OpenVoiceOS/ovos-stt-plugin-chromium , this is what we use in the ovos local backend (no mycroft needed for ovos plugins)

krisgesling added the labels help wanted, Type: Bug - quick, and hacktoberfest on Sep 24, 2020

NeonDaniel mentioned this issue Jun 3, 2022

Text pre-intent parsing #3112

Open
