Skip to content

Commit

Permalink
feat(speech): update the API
Browse files Browse the repository at this point in the history
#### speech:v1p1beta1

The following keys were added:
- schemas.RecognitionConfig.properties.enableSpokenEmojis.description
- schemas.RecognitionConfig.properties.enableSpokenEmojis.type
- schemas.RecognitionConfig.properties.enableSpokenPunctuation.description
- schemas.RecognitionConfig.properties.enableSpokenPunctuation.type

The following keys were changed:
- schemas.RecognitionConfig.properties.model.description

#### speech:v1

The following keys were changed:
- schemas.RecognitionConfig.properties.model.description
  • Loading branch information
yoshi-automation authored and sofisl committed Apr 20, 2021
1 parent 265463c commit 39db014
Show file tree
Hide file tree
Showing 4 changed files with 22 additions and 6 deletions.
4 changes: 2 additions & 2 deletions discovery/speech-v1.json
Expand Up @@ -212,7 +212,7 @@
}
}
},
"revision": "20210325",
"revision": "20210407",
"rootUrl": "https://speech.googleapis.com/",
"schemas": {
"ListOperationsResponse": {
Expand Down Expand Up @@ -403,7 +403,7 @@
"description": "Metadata regarding this request."
},
"model": {
"description": "Which model to select for the given request. Select the model best suited to your domain to get best results. If a model is not explicitly specified, then we auto-select a model based on the parameters in the RecognitionConfig. *Model* *Description* command_and_search Best for short queries such as voice commands or voice search. phone_call Best for audio that originated from a phone call (typically recorded at an 8khz sampling rate). video Best for audio that originated from from video or includes multiple speakers. Ideally the audio is recorded at a 16khz or greater sampling rate. This is a premium model that costs more than the standard rate. default Best for audio that is not one of the specific audio models. For example, long-form audio. Ideally the audio is high-fidelity, recorded at a 16khz or greater sampling rate. ",
"description": "Which model to select for the given request. Select the model best suited to your domain to get best results. If a model is not explicitly specified, then we auto-select a model based on the parameters in the RecognitionConfig. *Model* *Description* command_and_search Best for short queries such as voice commands or voice search. phone_call Best for audio that originated from a phone call (typically recorded at an 8khz sampling rate). video Best for audio that originated from video or includes multiple speakers. Ideally the audio is recorded at a 16khz or greater sampling rate. This is a premium model that costs more than the standard rate. default Best for audio that is not one of the specific audio models. For example, long-form audio. Ideally the audio is high-fidelity, recorded at a 16khz or greater sampling rate. ",
"type": "string"
},
"profanityFilter": {
Expand Down
12 changes: 10 additions & 2 deletions discovery/speech-v1p1beta1.json
Expand Up @@ -524,7 +524,7 @@
}
}
},
"revision": "20210325",
"revision": "20210407",
"rootUrl": "https://speech.googleapis.com/",
"schemas": {
"ClassItem": {
Expand Down Expand Up @@ -854,6 +854,14 @@
"description": "If 'true', enables speaker detection for each recognized word in the top alternative of the recognition result using a speaker_tag provided in the WordInfo. Note: Use diarization_config instead.",
"type": "boolean"
},
"enableSpokenEmojis": {
"description": "The spoken emoji behavior for the call If not set, uses default behavior based on model of choice If 'true', adds spoken emoji formatting for the request. This will replace spoken emojis with the corresponding Unicode symbols in the final transcript. If 'false', spoken emojis are not replaced.",
"type": "boolean"
},
"enableSpokenPunctuation": {
"description": "The spoken punctuation behavior for the call If not set, uses default behavior based on model of choice e.g. command_and_search will enable spoken punctuation by default If 'true', replaces spoken punctuation with the corresponding symbols in the request. For example, \"how are you question mark\" becomes \"how are you?\". See https://cloud.google.com/speech-to-text/docs/spoken-punctuation for support. If 'false', spoken punctuation is not replaced.",
"type": "boolean"
},
"enableWordConfidence": {
"description": "If `true`, the top result includes a list of words and the confidence for those words. If `false`, no word-level confidence information is returned. The default is `false`.",
"type": "boolean"
Expand Down Expand Up @@ -902,7 +910,7 @@
"description": "Metadata regarding this request."
},
"model": {
"description": "Which model to select for the given request. Select the model best suited to your domain to get best results. If a model is not explicitly specified, then we auto-select a model based on the parameters in the RecognitionConfig. *Model* *Description* command_and_search Best for short queries such as voice commands or voice search. phone_call Best for audio that originated from a phone call (typically recorded at an 8khz sampling rate). video Best for audio that originated from from video or includes multiple speakers. Ideally the audio is recorded at a 16khz or greater sampling rate. This is a premium model that costs more than the standard rate. default Best for audio that is not one of the specific audio models. For example, long-form audio. Ideally the audio is high-fidelity, recorded at a 16khz or greater sampling rate. ",
"description": "Which model to select for the given request. Select the model best suited to your domain to get best results. If a model is not explicitly specified, then we auto-select a model based on the parameters in the RecognitionConfig. *Model* *Description* command_and_search Best for short queries such as voice commands or voice search. phone_call Best for audio that originated from a phone call (typically recorded at an 8khz sampling rate). video Best for audio that originated from video or includes multiple speakers. Ideally the audio is recorded at a 16khz or greater sampling rate. This is a premium model that costs more than the standard rate. default Best for audio that is not one of the specific audio models. For example, long-form audio. Ideally the audio is high-fidelity, recorded at a 16khz or greater sampling rate. ",
"type": "string"
},
"profanityFilter": {
Expand Down
2 changes: 1 addition & 1 deletion src/apis/speech/v1.ts
Expand Up @@ -262,7 +262,7 @@ export namespace speech_v1 {
*/
metadata?: Schema$RecognitionMetadata;
/**
* Which model to select for the given request. Select the model best suited to your domain to get best results. If a model is not explicitly specified, then we auto-select a model based on the parameters in the RecognitionConfig. *Model* *Description* command_and_search Best for short queries such as voice commands or voice search. phone_call Best for audio that originated from a phone call (typically recorded at an 8khz sampling rate). video Best for audio that originated from from video or includes multiple speakers. Ideally the audio is recorded at a 16khz or greater sampling rate. This is a premium model that costs more than the standard rate. default Best for audio that is not one of the specific audio models. For example, long-form audio. Ideally the audio is high-fidelity, recorded at a 16khz or greater sampling rate.
* Which model to select for the given request. Select the model best suited to your domain to get best results. If a model is not explicitly specified, then we auto-select a model based on the parameters in the RecognitionConfig. *Model* *Description* command_and_search Best for short queries such as voice commands or voice search. phone_call Best for audio that originated from a phone call (typically recorded at an 8khz sampling rate). video Best for audio that originated from video or includes multiple speakers. Ideally the audio is recorded at a 16khz or greater sampling rate. This is a premium model that costs more than the standard rate. default Best for audio that is not one of the specific audio models. For example, long-form audio. Ideally the audio is high-fidelity, recorded at a 16khz or greater sampling rate.
*/
model?: string | null;
/**
Expand Down
10 changes: 9 additions & 1 deletion src/apis/speech/v1p1beta1.ts
Expand Up @@ -387,6 +387,14 @@ export namespace speech_v1p1beta1 {
* If 'true', enables speaker detection for each recognized word in the top alternative of the recognition result using a speaker_tag provided in the WordInfo. Note: Use diarization_config instead.
*/
enableSpeakerDiarization?: boolean | null;
/**
* The spoken emoji behavior for the call If not set, uses default behavior based on model of choice If 'true', adds spoken emoji formatting for the request. This will replace spoken emojis with the corresponding Unicode symbols in the final transcript. If 'false', spoken emojis are not replaced.
*/
enableSpokenEmojis?: boolean | null;
/**
* The spoken punctuation behavior for the call If not set, uses default behavior based on model of choice e.g. command_and_search will enable spoken punctuation by default If 'true', replaces spoken punctuation with the corresponding symbols in the request. For example, "how are you question mark" becomes "how are you?". See https://cloud.google.com/speech-to-text/docs/spoken-punctuation for support. If 'false', spoken punctuation is not replaced.
*/
enableSpokenPunctuation?: boolean | null;
/**
* If `true`, the top result includes a list of words and the confidence for those words. If `false`, no word-level confidence information is returned. The default is `false`.
*/
Expand All @@ -412,7 +420,7 @@ export namespace speech_v1p1beta1 {
*/
metadata?: Schema$RecognitionMetadata;
/**
* Which model to select for the given request. Select the model best suited to your domain to get best results. If a model is not explicitly specified, then we auto-select a model based on the parameters in the RecognitionConfig. *Model* *Description* command_and_search Best for short queries such as voice commands or voice search. phone_call Best for audio that originated from a phone call (typically recorded at an 8khz sampling rate). video Best for audio that originated from from video or includes multiple speakers. Ideally the audio is recorded at a 16khz or greater sampling rate. This is a premium model that costs more than the standard rate. default Best for audio that is not one of the specific audio models. For example, long-form audio. Ideally the audio is high-fidelity, recorded at a 16khz or greater sampling rate.
* Which model to select for the given request. Select the model best suited to your domain to get best results. If a model is not explicitly specified, then we auto-select a model based on the parameters in the RecognitionConfig. *Model* *Description* command_and_search Best for short queries such as voice commands or voice search. phone_call Best for audio that originated from a phone call (typically recorded at an 8khz sampling rate). video Best for audio that originated from video or includes multiple speakers. Ideally the audio is recorded at a 16khz or greater sampling rate. This is a premium model that costs more than the standard rate. default Best for audio that is not one of the specific audio models. For example, long-form audio. Ideally the audio is high-fidelity, recorded at a 16khz or greater sampling rate.
*/
model?: string | null;
/**
Expand Down

0 comments on commit 39db014

Please sign in to comment.