This repository has been archived by the owner on Feb 23, 2024. It is now read-only.

Commit

Using glossaries with tts and vision tutorial sample code (GoogleCloudPlatform/python-docs-samples#2325)

* fixing translate-with-glossary bug

* initial commit

* adding resources

* adding more resources

* glossary accommodates upper case words

* finished hybrid glossaries tutorial sample code

* Revert "fixing translate-with-glossary bug"

This reverts commit 6a9f7ca3f68239a862106fcbcd9c73649ce36c77.

* lint fix for tests. TODO src lint fix

* lint

* it's the final lint-down

* adding README

* implementing @nnegrey's feedback

* lint

* lint

* extracting files from cloud-client

* lint comment test

* fixing comments per @beccasaurus

* removing redundant directory

* implementing @nnegrey's feedback

* lint

* lint

* handling glossary-already-exists exception

* lint

* adding ssml functionality

* fixing imports per @nnegrey

* fixed import comment

* more specific exceptions import

* removing period from copyright
crowdus authored and danoscarmike committed Jul 31, 2020
1 parent dd8ef9e commit 1a161e7
Showing 12 changed files with 538 additions and 0 deletions.
97 changes: 97 additions & 0 deletions samples/snippets/hybrid_glossaries/README.rst
@@ -0,0 +1,97 @@
.. This file is automatically generated. Do not edit this file directly.

Google Translation API Python Samples
===============================================================================

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=/README.rst


This directory contains samples for Google Translation API. With `Google Translation API`_, you can dynamically translate text between thousands of language pairs.




.. _Google Translation API: https://cloud.google.com/translate/docs

Setup
-------------------------------------------------------------------------------


Authentication
++++++++++++++

This sample requires you to have authentication set up. Refer to the
`Authentication Getting Started Guide`_ for instructions on setting up
credentials for applications.

.. _Authentication Getting Started Guide:
    https://cloud.google.com/docs/authentication/getting-started
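
For example, if you use a service account key (one of the options described in
that guide), you would typically point the client libraries at the key file
before running the sample (the path below is a placeholder):

.. code-block:: bash

    $ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json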

Install Dependencies
++++++++++++++++++++

#. Clone python-docs-samples and change directory to the sample directory you want to use.

    .. code-block:: bash

        $ git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git

#. Install `pip`_ and `virtualenv`_ if you do not already have them. You may want to refer to the `Python Development Environment Setup Guide`_ for Google Cloud Platform for instructions.

   .. _Python Development Environment Setup Guide:
       https://cloud.google.com/python/setup

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

    .. code-block:: bash

        $ virtualenv env
        $ source env/bin/activate

#. Install the dependencies needed to run the samples.

    .. code-block:: bash

        $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/

Samples
-------------------------------------------------------------------------------

Using glossaries with vision and text-to-speech
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=/hybrid_tutorial.py,/README.rst




To run this sample:

.. code-block:: bash

    $ python hybrid_tutorial.py
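
The sample reads your Google Cloud project ID from the ``GCLOUD_PROJECT``
environment variable (see ``hybrid_tutorial.py``), so set it before running,
for example:

.. code-block:: bash

    $ export GCLOUD_PROJECT=your-project-id
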
The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
    https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
    https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
22 changes: 22 additions & 0 deletions samples/snippets/hybrid_glossaries/README.rst.in
@@ -0,0 +1,22 @@


# This file is used to generate README.rst

product:
  name: Google Translation API
  short_name: Translation API
  url: https://cloud.google.com/translate/docs
  description: >
    With `Google Translation API`_, you can dynamically translate text between
    thousands of language pairs.

setup:
- auth
- install_deps

samples:
- name: Using glossaries with vision and text-to-speech
  file: hybrid_tutorial.py

cloud_client_library: true

249 changes: 249 additions & 0 deletions samples/snippets/hybrid_glossaries/hybrid_tutorial.py
@@ -0,0 +1,249 @@
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


# [START translate_hybrid_imports]
import io
import os
import html

# Imports the Google Cloud client libraries
from google.api_core.exceptions import AlreadyExists
from google.cloud import translate_v3beta1 as translate
from google.cloud import vision
from google.cloud import texttospeech
# [END translate_hybrid_imports]


# [START translate_hybrid_project_id]
# extract GCP project id
PROJECT_ID = os.environ['GCLOUD_PROJECT']
# [END translate_hybrid_project_id]


# [START translate_hybrid_vision]
def pic_to_text(infile):
    """Detects text in an image file
    ARGS
    infile: path to image file
    RETURNS
    String of text detected in image
    """

    # Instantiates a client
    client = vision.ImageAnnotatorClient()

    # Opens the input image file
    with io.open(infile, 'rb') as image_file:
        content = image_file.read()

    image = vision.types.Image(content=content)

    # For dense text, use document_text_detection
    # For less dense text, use text_detection
    response = client.document_text_detection(image=image)
    text = response.full_text_annotation.text

    return text
# [END translate_hybrid_vision]


# [START translate_hybrid_create_glossary]
def create_glossary(languages, project_id, glossary_name, glossary_uri):
    """Creates a GCP glossary resource
    Assumes you've already manually uploaded a glossary to Cloud Storage
    ARGS
    languages: list of languages in the glossary
    project_id: GCP project id
    glossary_name: name you want to give this glossary resource
    glossary_uri: the uri of the glossary you uploaded to Cloud Storage
    RETURNS
    nothing
    """

    # Instantiates a client
    client = translate.TranslationServiceClient()

    # Designates the data center location that you want to use
    location = 'us-central1'

    # Set glossary resource name
    name = client.glossary_path(
        project_id,
        location,
        glossary_name)

    # Set language codes
    language_codes_set = translate.types.Glossary.LanguageCodesSet(
        language_codes=languages)

    gcs_source = translate.types.GcsSource(
        input_uri=glossary_uri)

    input_config = translate.types.GlossaryInputConfig(
        gcs_source=gcs_source)

    # Set glossary resource information
    glossary = translate.types.Glossary(
        name=name,
        language_codes_set=language_codes_set,
        input_config=input_config)

    parent = client.location_path(project_id, location)

    # Create glossary resource
    # Handle exception for case in which a glossary
    # with glossary_name already exists
    try:
        operation = client.create_glossary(parent=parent, glossary=glossary)
        operation.result(timeout=90)
        print('Created glossary ' + glossary_name + '.')
    except AlreadyExists:
        print('The glossary ' + glossary_name +
              ' already exists. No new glossary was created.')
# [END translate_hybrid_create_glossary]


# [START translate_hybrid_translate]
def translate_text(text, source_language_code, target_language_code,
                   project_id, glossary_name):
    """Translates text to a given language using a glossary
    ARGS
    text: String of text to translate
    source_language_code: language of input text
    target_language_code: language of output text
    project_id: GCP project id
    glossary_name: name you gave your project's glossary
        resource when you created it
    RETURNS
    String of translated text
    """

    # Instantiates a client
    client = translate.TranslationServiceClient()

    # Designates the data center location that you want to use
    location = 'us-central1'

    glossary = client.glossary_path(
        project_id,
        location,
        glossary_name)

    glossary_config = translate.types.TranslateTextGlossaryConfig(
        glossary=glossary)

    parent = client.location_path(project_id, location)

    result = client.translate_text(
        parent=parent,
        contents=[text],
        mime_type='text/plain',  # mime types: text/plain, text/html
        source_language_code=source_language_code,
        target_language_code=target_language_code,
        glossary_config=glossary_config)

    # Extract translated text from API response
    return result.glossary_translations[0].translated_text
# [END translate_hybrid_translate]


# [START translate_hybrid_tts]
def text_to_speech(text, outfile):
    """Converts plaintext to SSML and
    generates synthetic audio from SSML
    ARGS
    text: text to synthesize
    outfile: filename to use to store synthetic audio
    RETURNS
    nothing
    """

    # Replace special characters with HTML Ampersand Character Codes
    # These Codes prevent the API from confusing text with
    # SSML commands
    # For example, '<' --> '&lt;' and '&' --> '&amp;'
    escaped_lines = html.escape(text)

    # Convert plaintext to SSML in order to wait two seconds
    # between each line in synthetic speech
    ssml = '<speak>{}</speak>'.format(
        escaped_lines.replace('\n', '\n<break time="2s"/>'))
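    # For example, the two-line input 'Bonjour\nAu revoir' becomes
    # '<speak>Bonjour\n<break time="2s"/>Au revoir</speak>'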

    # Instantiates a client
    client = texttospeech.TextToSpeechClient()

    # Sets the text input to be synthesized
    synthesis_input = texttospeech.types.SynthesisInput(ssml=ssml)

    # Builds the voice request, selects the language code ("en-US") and
    # the SSML voice gender ("MALE")
    voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-US',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.MALE)

    # Selects the type of audio file to return
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)

    # Performs the text-to-speech request on the text input with the selected
    # voice parameters and audio file type
    response = client.synthesize_speech(synthesis_input, voice, audio_config)

    # Writes the synthetic audio to the output file.
    with open(outfile, 'wb') as out:
        out.write(response.audio_content)
        print('Audio content written to file ' + outfile)
# [END translate_hybrid_tts]


# [START translate_hybrid_integration]
def main():

    # Photo from which to extract text
    infile = 'resources/example.png'
    # Name of file that will hold synthetic speech
    outfile = 'resources/example.mp3'

    # Defines the languages in the glossary
    # This list must match the languages in the glossary
    # Here, the glossary includes French and English
    glossary_langs = ['fr', 'en']
    # Name that will be assigned to your project's glossary resource
    glossary_name = 'bistro-glossary'
    # uri of .csv file uploaded to Cloud Storage
    glossary_uri = 'gs://cloud-samples-data/translation/bistro_glossary.csv'
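    # The glossary CSV pairs terms across the languages listed above: by
    # convention an equivalent-term-set glossary starts with a header row of
    # language codes (e.g. 'fr,en') followed by one matching term per
    # language in each row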

    create_glossary(glossary_langs, PROJECT_ID, glossary_name, glossary_uri)

    # photo -> detected text
    text_to_translate = pic_to_text(infile)
    # detected text -> translated text
    text_to_speak = translate_text(text_to_translate, 'fr', 'en',
                                   PROJECT_ID, glossary_name)
    # translated text -> synthetic audio
    text_to_speech(text_to_speak, outfile)
# [END translate_hybrid_integration]


if __name__ == '__main__':
    main()
