Cloud image processing workflow:

Image archive, analysis, and report generation with Google Workspace (formerly G Suite) & GCP

In this intermediate codelab tutorial, developers build a cloud-based image processing workflow in Python using Google Cloud REST APIs from GCP and Google Workspace (formerly G Suite). The exercise imagines an enterprise scenario in which an organization backs up data (image files, for example) to the cloud, analyzes it with machine learning, and reports the results in a format suitable for management. This repo provides code solutions for each step of the tutorial, plus alternate versions featuring other libraries and/or authorization schemes.

This is an intermediate codelab. If you're new to using Google APIs, specifically Google Workspace (formerly G Suite) and/or GCP APIs, we recommend completing the introductory codelabs (listed at the bottom of this page) first. You can read more about this code sample and codelab in this Google Developers blog post or this equivalent post on the Google Cloud blog.

Prerequisites

A Google account (Google Workspace/G Suite accounts may require administrator approval)
A Google Cloud Platform project with an active billing account
Familiarity with operating system terminal/shell commands
Basic skills in Python 2 or 3 (other languages supported)
Experience using Google APIs is helpful but not required

NOTE for GCP developers: The codelab uses neither GCP product client libraries nor service account authorization. Instead, it uses the lower-level platform client libraries (because non-Cloud APIs don't have product libraries yet) and user account authorization (because the target file starts out in Google Drive). However, solutions featuring GCP product client libraries as well as service accounts are available as alternatives in the alt folder.

Description

The primary objective is to analyze images stored in Google Workspace; everything else (archiving, report generation) is a bonus. The workflow starts with an image file on Google Drive, archives it to Google Cloud Storage, analyzes it with Cloud Vision, and writes a "results" row into a Google Sheet. Each step of the tutorial builds on the previous one, adding one feature at a time. Each of the step* directories represents the state the application should be in upon successful completion of the corresponding step in the codelab, culminating in a refactor step that arrives at the final version.

Download image from Google Drive: The first step uses the Google Drive API to search for the image file and download the first match. Along with the filename and binary payload, the file's MIME type, last modification timestamp, and size in bytes are also returned.
Back up image to Google Cloud Storage: The next step uploads the image as a "blob" object to Google Cloud Storage (GCS), performing an "insert" into the given bucket. Once data is in GCS, it can be used by other GCP tools. GCS also supports cheaper, "colder" storage classes: the less often you access objects, the lower the storage cost, as described on the storage class page. NOTE: "/" in GCS filenames is merely a visual cue, as GCS doesn't support "folders." Our solution features an optional PARENT folder to help organize images in the destination bucket. (The GCP product client libraries prepare the data for GCS on your behalf; because this solution uses the platform client library, it needs the MediaIoBaseUpload convenience object to help with the upload.)
Send image to Cloud Vision for analysis: Since we have the image binary data, let's also send it to Cloud Vision for analysis. Using its API, request object detection/identification (called label annotation), but ask only for the top 5 labels for a faster response. Each label returned includes a confidence score indicating how likely it is that the label applies to the image.
Add results to Google Sheets: The last new feature is report generation: add a spreadsheet row to visualize results via the Google Sheets API. The row includes the Cloud Vision output and a hyperlink to the file's GCS archive location.
Refactor: The final, optional step applies best practices: move the "main" body into a separate function and support command-line options to give users flexibility.
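The data each step hands to its API can be sketched in plain Python. This is a minimal illustration, not the repo's actual code: every function name, the command-line flags, and the storage.cloud.google.com link format are assumptions chosen for this example. The API calls themselves (Drive files.list, GCS objects.insert, Vision images.annotate, Sheets spreadsheets.values.append) are left out so the block runs without credentials or third-party packages.

```python
import argparse
import base64


def drive_query(fname):
    """Drive API v3 search expression matching a file by exact name."""
    # Backslashes and single quotes must be escaped inside the query string.
    safe = fname.replace('\\', '\\\\').replace("'", "\\'")
    return "name='%s'" % safe


def gcs_object_name(fname, parent=None):
    """Object name with an optional PARENT prefix; '/' is only a visual cue."""
    return '%s/%s' % (parent, fname) if parent else fname


def vision_label_request(binary, top=5):
    """Request body asking Cloud Vision for the top label annotations."""
    return {'requests': [{
        # The REST API expects image bytes as a base64-encoded string.
        'image': {'content': base64.b64encode(binary).decode('utf-8')},
        'features': [{'type': 'LABEL_DETECTION', 'maxResults': top}],
    }]}


def format_labels(annotations):
    """Flatten label annotations into one cell: '(score%) description, ...'."""
    return ', '.join('(%.2f%%) %s' % (a['score'] * 100.0, a['description'])
                     for a in annotations)


def sheet_row(bucket, obj_name, mimetype, mtime, size, labels):
    """One report row; the first cell links to the archived object (assumed URL form)."""
    link = '=HYPERLINK("https://storage.cloud.google.com/%s/%s", "%s")' % (
        bucket, obj_name, obj_name)
    return [link, mimetype, mtime, size, labels]


def parse_args(argv=None):
    """Hypothetical command-line options for the refactored application."""
    parser = argparse.ArgumentParser(
        description='Archive, analyze, and report on a Google Drive image')
    parser.add_argument('-i', '--imgfile', required=True, help='image name on Drive')
    parser.add_argument('-b', '--bucket', required=True, help='destination GCS bucket')
    parser.add_argument('-f', '--folder', default=None, help='optional PARENT prefix')
    parser.add_argument('-s', '--sheet_id', required=True, help='target Sheet ID')
    parser.add_argument('-t', '--top', type=int, default=5, help='number of labels')
    return parser.parse_args(argv)
```

Keeping the payload construction in small pure functions like these makes each step easy to check in isolation before wiring it to an authorized service endpoint. Note that the HYPERLINK formula is only evaluated if the row is written with a value input option that parses formulas.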

Authorization scheme and alternative versions

We chose user account authorization (instead of service account authorization), platform client libraries (instead of product client libraries, since those aren't available for Google Workspace (formerly G Suite) APIs), and older auth libraries for readability, consistency, greater Python 2-3 compatibility, and automated OAuth2 token management. We hope this provides the least complex user experience. Alternative versions of the final application that use service accounts, product client libraries, and newer, currently supported auth libraries are in the alt subdirectory. See its README for more information.
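As a rough sketch of what this combination can look like, the block below pairs the older oauth2client auth library with the platform client library (googleapiclient) under user account authorization. It is an illustration under stated assumptions rather than the repo's code: the function name build_services, the file names client_secret.json and storage.json, and the exact scope list are all assumed for this example. The third-party imports sit inside the function so the module loads even when those packages are not installed.

```python
# Assumed OAuth2 scopes, one per API used in the workflow.
SCOPES = (
    'https://www.googleapis.com/auth/drive.readonly',
    'https://www.googleapis.com/auth/devstorage.full_control',
    'https://www.googleapis.com/auth/cloud-vision',
    'https://www.googleapis.com/auth/spreadsheets',
)


def build_services(secrets='client_secret.json', token_store='storage.json'):
    """Run the user-account OAuth2 flow and return service endpoints (sketch)."""
    # Local imports: requires httplib2, google-api-python-client, oauth2client.
    import httplib2
    from googleapiclient import discovery
    from oauth2client import client, file, tools

    # Reuse stored tokens when present; otherwise prompt the user in a browser.
    store = file.Storage(token_store)
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets(secrets, SCOPES)
        creds = tools.run_flow(flow, store)

    # One authorized HTTP object is shared by all four platform-library endpoints.
    http = creds.authorize(httplib2.Http())
    return {
        'drive': discovery.build('drive', 'v3', http=http),
        'gcs': discovery.build('storage', 'v1', http=http),
        'vision': discovery.build('vision', 'v1', http=http),
        'sheets': discovery.build('sheets', 'v4', http=http),
    }
```

The token store is what provides the automated OAuth2 token management mentioned above: after the first interactive consent, later runs read and refresh the saved credentials instead of prompting again.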

Summary and further study

The goal of the codelab sample app is to help developers envision possible business scenarios. A secondary goal is showing how to use GCP and Google Workspace (formerly G Suite) APIs together for one solution. Problems with either the codelab or code in this repo? File an issue (do a search first).

References

Codelabs
- Intro to Workspace APIs (Google Drive API) (Python)
- Using Cloud Vision with Python (Python)
- Build customized reporting tools (Google Sheets API) (JS/Node)
- Upload objects to Google Cloud Storage (no coding required)
General
- Google APIs client library for Python
- Google APIs client libraries
Google Workspace
Google Cloud Platform (GCP)
