Adds Parameter use_enhanced and model to GoogleCloudSpeech #735

HideyoshiNakazone · 2024-02-15T19:56:48Z

Adds the parameters use_enhanced and model to the recognize_google_cloud method for more customizable options for the user and better results in specific cases
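As a rough illustration of the idea (not the actual diff in this pull request), the two new options can be thought of as optional entries in the recognition config. The helper name `build_config_explicit`, the defaults, and the plain-string `"FLAC"` placeholder are assumptions made for this sketch; the library itself uses the `speech.RecognitionConfig.AudioEncoding.FLAC` enum.

```python
def build_config_explicit(sample_rate, language="en-US", use_enhanced=False, model=None):
    """Hypothetical helper: build a recognition config as a plain dict."""
    config = {
        "encoding": "FLAC",  # placeholder string; the library uses an enum value
        "sample_rate_hertz": sample_rate,
        "language_code": language,
    }
    # Only forward the new options when the caller actually set them (assumed behavior).
    if use_enhanced:
        config["use_enhanced"] = True
    if model is not None:
        config["model"] = model
    return config


# "phone_call" is only an illustrative model name.
enhanced = build_config_explicit(16000, use_enhanced=True, model="phone_call")
plain = build_config_explicit(16000)
```

With the defaults left untouched, the resulting config carries no extra keys, which is why the change only extends the existing behavior.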

HideyoshiNakazone · 2024-02-22T00:36:13Z

Hello @Uberi and @ftnext, I was wondering whether someone could review my merge request.

Thank you very much,
Vitor Hideyoshi.

HideyoshiNakazone · 2024-04-22T20:18:16Z

Hello @ftnext, is there any interest in this feature? It doesn't break any of the existing GoogleCloudSpeech Python API; it only extends it. I'm already using this implementation at the company I work for, but I would love to have this feature merged.
If anything is blocking the merge, please tell me :)

Uberi · 2024-04-26T18:39:31Z

Hi @HideyoshiNakazone!

Looks good overall, but would it be possible to document these parameters in the docs for that function? If so, happy to merge this!


HideyoshiNakazone · 2024-04-26T19:40:41Z

@Uberi, thanks a lot! I added the parameters to the docstring of the method Recognizer.recognize_google_cloud and to the library reference file.
If there are any other places where you'd like me to add documentation, I'll be happy to :)

ftnext · 2024-04-29T15:16:53Z

reference/library-reference.rst

@@ -238,6 +238,10 @@ The recognition language is determined by ``language``, which is a BCP-47 langua

 If ``preferred_phrases`` is an iterable of phrase strings, those given phrases will be more likely to be recognized over similar-sounding alternatives. This is useful for things like keyword/command recognition or adding new phrases that aren't in Google's vocabulary. Note that the API imposes certain `restrictions on the list of phrase strings <https://cloud.google.com/speech/limits#content>`__.

+The ``use_enhanced`` is a boolean option that sets a flag with the same name on the Google Cloud Speech API, it will make the API uses the enhanced version of the model. More information can be found in the `Google Cloud Speech API documentation <https://cloud.google.com/speech-to-text/docs/enhanced-models>` __.


@HideyoshiNakazone Thanks! Would you mind removing the space between the closing backtick and the trailing underscores?

-<https://cloud.google.com/speech-to-text/docs/enhanced-models>` __
+<https://cloud.google.com/speech-to-text/docs/enhanced-models>`__
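A stray space like this is easy to miss by eye. As a minimal sketch (the regular expression and the helper name `find_broken_links` are assumptions made for illustration, not part of the project's tooling), a small check could flag reStructuredText links whose closing backtick is separated from the underscores by whitespace:

```python
import re

# Matches "`text <url>`" followed by whitespace and then underscores,
# which is the broken form of an RST link such as "`text <url>`__".
BROKEN_LINK = re.compile(r"`[^`<]*<[^>`]+>`\s+__?")


def find_broken_links(rst_text):
    """Return every link snippet with a gap before the trailing underscores."""
    return [match.group(0) for match in BROKEN_LINK.finditer(rst_text)]


url = "https://cloud.google.com/speech-to-text/docs/enhanced-models"
before = f"found in the `Google Cloud Speech API documentation <{url}>` __."
after = f"found in the `Google Cloud Speech API documentation <{url}>`__."

broken_before = find_broken_links(before)
broken_after = find_broken_links(after)
```

Here `broken_before` contains one entry and `broken_after` is empty.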

ftnext · 2024-04-29T15:35:11Z

@HideyoshiNakazone Thank you very much for this pull request! I'm very sorry for the late response.
@Uberi Thanks for your comment!

In my opinion, it would be better to introduce variadic keyword arguments (a.k.a. **kwargs)
https://docs.python.org/3/tutorial/controlflow.html#keyword-arguments

Certainly, adding use_enhanced and model as arguments would implement this feature.
However, if more arguments need to be added in the future, the method signature would have to change again each time, which makes it hard to extend.

I think it would be preferable for Cloud Speech API-specific arguments to be passed as variadic keyword arguments.

def recognize_google_cloud(self, audio_data, credentials_json=None, language="en-US", preferred_phrases=None, show_all=False, **api_params):
    """
    If ``preferred_phrases`` is an iterable of phrase strings, ...

    api_params: Cloud Speech API-specific parameters as dict (optional)

        The ``use_enhanced`` is a boolean option ...

        Furthermore, you can use the option ``model`` to set your desired model,

    Returns the most likely transcription if ``show_all`` is False (the default).
    """

    config = {
        'encoding': speech.RecognitionConfig.AudioEncoding.FLAC,
        'sample_rate_hertz': audio_data.sample_rate,
        'language_code': language,
        **api_params,
    }

(It seems that preferred_phrases could be moved into api_params too, but that is a separate issue.)
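To make the trade-off concrete, here is a self-contained sketch of the `**api_params` approach, written without the Google client library so it can run anywhere. The helper name `build_config_kwargs` and the `"FLAC"` placeholder string are assumptions for illustration only.

```python
def build_config_kwargs(sample_rate, language="en-US", **api_params):
    """Hypothetical helper: merge caller-supplied API options into the config."""
    return {
        "encoding": "FLAC",  # placeholder string; the library uses an enum value
        "sample_rate_hertz": sample_rate,
        "language_code": language,
        # Unpacked last, so these entries win over the keys listed above.
        **api_params,
    }


# New options need no signature change ("phone_call" is an illustrative value).
config = build_config_kwargs(16000, use_enhanced=True, model="phone_call")

# Side effect of unpacking last: a caller can override a base key.
overridden = build_config_kwargs(16000, language_code="pt-BR")
```

One design point worth deciding explicitly: because `**api_params` is unpacked last, a caller-supplied `language_code` silently replaces the value derived from `language`. Unpacking it first instead would make the named arguments take precedence.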

Adds Parameter use_enhanced and model to GoogleCloudSpeech

c845904

Adds the parameters use_enhanced and model to the recognize_google_cloud method for more customizable options for the user and better results in specific cases

HideyoshiNakazone mentioned this pull request Feb 15, 2024

Feature Request: GoogleCloudSpeech - Add method parameters to set use_enhanced and model options #734

Open

Adds Parameters use_enhanced and model to GoogleSpeechAPI docstring

8e0fa40

HideyoshiNakazone force-pushed the add-parameters-google-cloud branch from 052dec3 to 8e0fa40 Compare April 26, 2024 19:13

HideyoshiNakazone added 2 commits April 26, 2024 19:27

Adds Missing Models to Docstring and Adds Missing Parameters to Library Reference File

daca000

Fixes Broken Formatting

abb35fe

ftnext reviewed Apr 29, 2024
