Mycroft's blocking of curse words **** interfere with searches and other functionality #1221

Open
InconsolableCellist opened this issue Nov 11, 2017 · 9 comments
Labels: hacktoberfest, help wanted, Type: Bug - quick (bug fixes that are quick to review and the implications of the change are clear and contained)

Comments

@InconsolableCellist
Contributor

(For this bug report #### will indicate my own self-selected censorship, as I don't know the policies of this project regarding cursing. **** will indicate Mycroft censoring words.)

While testing a skill I realized that somewhere in the parsing of my input it’s turning detected curse words into asterisks, such as the query "#### you" being interpreted as “**** you”. This may be a reasonable default, but I want to play albums that contain explicit titles, and this feature breaks that functionality.

This seems to affect the core, not just a third-party skill:

Steps to reproduce:

  1. Say a curse word after the wake word. E.g., "hey Mycroft, #### you." (If you were searching for a song title you could be saying something like, "hey Mycroft, play #### the police by NWA")

Observed behavior:
Mycroft reports and interprets "#### you" as "**** you."

Expected behavior:
Mycroft doesn't censor curse words, as they're necessary for playing songs with explicit titles. Optionally, this should be a configurable and documented behavior.

16:50:56.682 - mycroft.client.speech.listener:transcribe:144 - DEBUG - STT: f*** you
16:50:56.682 - __main__:handle_utterance:55 - INFO - Utterance: [u'f*** you']
@InconsolableCellist
Contributor Author

(This is actually a real bug despite my newness to the project and its unusual nature.)

@tjoen

tjoen commented Nov 13, 2017

I ran into this too. Would be much better to have it just as an option instead of as a default.

@forslund
Collaborator

This was up for discussion last week; I think the conclusion was to make changes to allow this to be turned off. @matheuslima can you comment on this?

@TodaysITSolutions

I am from Jersey and swear a lot, this is a problem for me too.

@megaeverything

any progress on this issue? the censoring is really annoying.

@krisgesling
Contributor

Hey I wasn't around when this issue first got raised so wasn't part of those discussions, but this is actually the Google STT service that we use doing the censoring. Would need to see if there's a flag we can set on the requests to turn it off. If anyone knows already, please chime in.

@krisgesling added the help wanted, Type: Bug - quick, and hacktoberfest labels on Sep 24, 2020
@KathyReid
Contributor

From a very brief skim of this issue I've been able to determine the following:

  • Most STT services that Mycroft supports (with Google STT currently being the default) have a profanity_filter flag which is passed to the API.

  • In Mycroft's STT classes, this is set to false for the IBMWatson STT class as per: this line of code.

  • In the GoogleSTT class, this parameter does not appear to be set, and I think this is the root cause of this issue. These are the docs for Google's STT - the parameter is called ProfanityFilter.

  • However, I don't think the answer is just to set profanity_filter to be false in the GoogleSTT class. I think that we should give users the ability to set this on a per-device basis, just as Wake Words and STT engines and TTS voices can be set on a per-device basis at: https://account.mycroft.ai/devices/

  • Therefore I think this requires changes to the Mycroft Home backend to have an ideal implementation.
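
To make the flag mentioned above concrete, here is a minimal sketch of a request body that carries it. It assumes the JSON field names of the Google Cloud Speech-to-Text v1 REST API (`languageCode`, `profanityFilter`); the helper name `build_recognize_body` and the placeholder audio string are hypothetical, and nothing here is taken from Mycroft's code. Whether the service Mycroft actually calls honours this field is a separate question.

```python
import json


def build_recognize_body(audio_b64, language="en-US", profanity_filter=False):
    """Build a JSON-serialisable body for a recognize-style request.

    Field names are assumed from the Cloud Speech-to-Text v1 REST API.
    """
    return {
        "config": {
            "languageCode": language,
            # False asks the service to return words uncensored.
            "profanityFilter": bool(profanity_filter),
        },
        "audio": {"content": audio_b64},
    }


# Placeholder string stands in for real base64-encoded audio.
body = build_recognize_body("BASE64_AUDIO_PLACEHOLDER")
print(json.dumps(body, indent=2))
```

The point of the sketch is only that the filter is a per-request setting, so a value read from configuration can be threaded through to wherever the request is assembled.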

What I tried to do as a workaround was implement a new self.config variable in mycroft.conf:

  // Profanity filter
  "profanity_filter": false,

This then requires support in the STT classes, i.e. this is what I tried in the STT base class, but it didn't work:

class STT(metaclass=ABCMeta):
    """ STT Base class, all  STT backends derives from this one. """
    def __init__(self):
        config_core = Configuration.get()
        self.lang = str(self.init_language(config_core))
        config_stt = config_core.get("stt", {})
        self.config = config_stt.get(config_stt.get("module"), {})
        self.credential = self.config.get("credential", {})
        self.recognizer = Recognizer()
        self.can_stream = False
        # set profanity filter
        self.profanity_filter = self.config.get('profanity_filter')

    @staticmethod
    def init_language(config_core):
        lang = config_core.get("lang", "en-US")
        langs = lang.split("-")
        if len(langs) == 2:
            return langs[0].lower() + "-" + langs[1].upper()
        return lang

    @abstractmethod
    def execute(self, audio, language=None, ProfanityFilter=self.profanity_filter):
        pass

(At this point my microphone stopped working with Mycroft for some strange reason, and nothing I did could get it to pick up the microphone again, so I couldn't continue testing)

This didn't work - the ProfanityFilter is still set to True, and the *** remain. But, this might be a clue for others who want to tackle this.
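
One likely reason the attempt above cannot work as written: a default argument such as `ProfanityFilter=self.profanity_filter` is evaluated when the method is defined, where no `self` exists, so Python raises a NameError instead of reading the config. Also, `self.config` in that class is the section for the active STT module, so a `profanity_filter` key placed elsewhere in mycroft.conf would not be seen by `self.config.get`. Below is a minimal sketch of one alternative, using a simplified stand-in class with no Mycroft imports. The `config_core` argument, `resolve_profanity_filter`, and `EchoSTT` are hypothetical names for illustration, and the default of off is an assumption.

```python
from abc import ABCMeta, abstractmethod


class STT(metaclass=ABCMeta):
    """Simplified stand-in for the STT base class (not Mycroft's real one)."""

    def __init__(self, config_core=None):
        # Hypothetical: config is passed in instead of Configuration.get().
        config_core = config_core or {}
        config_stt = config_core.get("stt", {})
        # Same lookup as the attempt: only the active module's section is read.
        self.config = config_stt.get(config_stt.get("module"), {})
        # Explicit default (assumed off) so a missing key is not None.
        self.profanity_filter = bool(self.config.get("profanity_filter", False))

    def resolve_profanity_filter(self, override=None):
        """Return the per-call override if given, else the configured value."""
        return self.profanity_filter if override is None else bool(override)

    @abstractmethod
    def execute(self, audio, language=None, profanity_filter=None):
        """Transcribe audio; None means 'use the configured value'."""


class EchoSTT(STT):
    """Hypothetical backend that only reports the setting it would send."""

    def execute(self, audio, language=None, profanity_filter=None):
        return {"profanity_filter": self.resolve_profanity_filter(profanity_filter)}


nested = {"stt": {"module": "google", "google": {"profanity_filter": True}}}
top_level = {"profanity_filter": True, "stt": {"module": "google", "google": {}}}

print(EchoSTT(nested).execute(b""))     # key inside the module section is read
print(EchoSTT(top_level).execute(b""))  # top-level key is ignored by this lookup
```

Using `None` as the sentinel keeps the signature valid at definition time and still lets a caller override the configured value for a single request. Each concrete backend would still need to pass the resolved value on to its service for the setting to have any effect.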

@forslund
Collaborator

forslund commented Oct 4, 2020

I tested the Google STT module and the profanity filter seems to be off by default, but using it requires having your own Google Cloud account.

The API used by the Mycroft backend (which is not the google cloud Speech to Text service, but another of Google's older APIs) always has it enabled and doesn't allow turning it off if I recall correctly.

A config setting is probably a good idea though. Default should be off in my opinion.

@JarbasAl
Contributor

this issue probably should be moved to the selene repo since that's where STT happens

this setting is not supported by the speech recognition package, but can be enabled/disabled if you use the api directly

to disable profanity filter see https://github.com/OpenVoiceOS/ovos-stt-plugin-chromium , this is what we use in the ovos local backend (no mycroft needed for ovos plugins)
