From 86b005403ba677c15c90cbd52b960ee24e3f063a Mon Sep 17 00:00:00 2001
From: Sita Lakshmi Sangameswaran
Date: Wed, 1 Mar 2023 23:56:19 +0530
Subject: [PATCH] migrate code from googleapis/python-dlp (#9091)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* chore: move samples from python-docs-samples (#33)
* Add samples for DLP API v2beta1 [(#1369)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1369)
* Auto-update dependencies. [(#1377)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1377)
  * Auto-update dependencies.
  * Update requirements.txt
* Update DLP samples for release [(#1415)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1415)
* fix DLP region tags, and add @flaky to pub/sub sample tests [(#1418)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1418)
* Auto-update dependencies.
* Regenerate the README files and fix the Open in Cloud Shell link for some samples [(#1441)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1441)
* Update README for DLP GA [(#1426)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1426)
* Update READMEs to fix numbering and add git clone [(#1464)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1464)
* DLP: Add auto_populate_timespan option for create job trigger. [(#1543)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1543)
* Add DLP code samples for custom info types [(#1524)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1524)
  * Add custom info type samples to inspect_content.py. Use flags to indicate dictionary word lists and regex patterns, then parse them into custom info types (see the sketch after this list).
  * Make code compatible with python 2.7
  * Add missing commas
  * Remove bad import
  * Add tests for custom info types
  * Add info_types parameter to deid.py
  * Update deid tests to use info_types parameter
  * Fix indentation
  * Add blank lines
  * Share logic for building custom info types
  * Fix line too long
  * Fix typo.
  * Revert "Fix typo." This reverts commit b4ffea6eef1fc2ccd2a4f17adb6e9492e54f1b76, so that the sharing of the custom info type logic can be reverted as well to make the code samples more readable.
  * Revert "Share logic for building custom info types" This reverts commit 47fc04f74c77db3bd5397459cf9242dc11521c37. This makes the code samples more readable.
  * Switch from indexes to using enumerate.
  * Updated help message for custom dictionaries.
  * Fix enumerate syntax error.
* upgrade DLP version and fix tests [(#1784)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1784)
  * upgrade DLP version and fix tests
  * bump dlp version again
* Auto-update dependencies. [(#1846)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1846) ACK, merging.
* Per internal documentation complaint, fix the naming. [(#1933)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1933) The documentation for DLP uses 'dlp' as the instance name. As this is also the name of the python package, it could be confusing for people new to the API object model, so switch to dlp_client.
* Add inspect table code sample for DLP and some nit fixes [(#1921)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1921)
  * Remove claim that redact.py operates on strings. Reflect in the comments that this particular code sample does not support text redaction.
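The custom infoType work above comes down to building dictionary and regex custom infoTypes and passing them in the inspect config. A minimal sketch of that structure, assuming the google-cloud-dlp client; the names, words, and pattern are illustrative, not taken from the samples themselves:

```python
# Sketch: dictionary and regex custom infoTypes as plain dicts, the shape
# an inspect_content request accepts. All values are illustrative.
custom_info_types = [
    {
        "info_type": {"name": "CUSTOM_DICTIONARY_0"},
        "dictionary": {"word_list": {"words": ["jenny", "becca"]}},
    },
    {
        "info_type": {"name": "CUSTOM_REGEX_0"},
        "regex": {"pattern": r"\(\d{3}\) \d{3}-\d{4}"},
    },
]
inspect_config = {"custom_info_types": custom_info_types}
```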
* Add code sample for inspecting table, fix requirements for running tests, quickstart example refactor
  * Remove newline, if -> elif
  * formatting
  * More formatting
* Update DLP redact image code sample region to include mimetype import [(#1928)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1928) In response to feedback where a user was confused that the mimetype import was missing from the code sample in the documentation.
* Update to use new subscribe() syntax [(#1989)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1989) (see the sketch after this list)
  * Update to use new subscribe() syntax
  * Missed two subscribe() call changes before
  * Cancel subscription when processed
  * Update risk.py
  * Fix waiting for message
  * Unneeded try/except removed
* Auto-update dependencies. [(#1980)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/1980)
  * Auto-update dependencies.
  * Update requirements.txt
  * Update requirements.txt
* Convert append -> nargs, so arguments are not additive [(#2191)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/2191)
* increase test timeout [(#2351)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/2351)
* Adds updates including compute [(#2436)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/2436)
  * Adds updates including compute
  * Python 2 compat pytest
  * Fixing weird \r\n issue from GH merge
  * Put asset tests back in
  * Re-add pod operator test
  * Hack parameter for k8s pod operator
* Update DLP samples to use dlp_v2 client. [(#2580)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/2580)
* fix: correct dataset name, use env var for project [(#2621)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/2621)
  * fix: correct dataset name, use env var for project
  * Add uuids to tests
  * add uuids and fixtures for bq
  * Add logic to delete job
  * ran black
  * Run black with line length
  * Add utf encoding for python 2 tests
  * Add skips for now
  * Ran black
  * Remove skips, adjust job tests
  * fix lint and skips
  * Cleanup commented things
  Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
* Remove param to reduce latency (per docs) [(#2853)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/2853)
* chore(deps): update dependency google-cloud-storage to v1.26.0 [(#3046)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3046)
  * chore(deps): update dependency google-cloud-storage to v1.26.0
  * chore(deps): specify dependencies by python version
  * chore: up other deps to try to remove errors
  Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>
  Co-authored-by: Leah Cole
* Fix dlp tests [(#3058)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3058) Since the tests are flaky and timing out, I'm proposing we do the ML API approach of creating an operation then canceling it. It would fix #2809 fix #2810 fix #2811 fix #2812
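For the subscribe() syntax change referenced above: the newer google-cloud-pubsub API returns a streaming-pull future from subscribe(), and the caller cancels it once messages are processed. A minimal sketch, assuming illustrative project and subscription names (the real tests build these from env vars):

```python
from concurrent.futures import TimeoutError as FutureTimeoutError

from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
# Illustrative names, not from the samples.
subscription_path = subscriber.subscription_path("my-project", "my-subscription")

def callback(message):
    print(f"Received: {message.data}")
    message.ack()

future = subscriber.subscribe(subscription_path, callback=callback)
try:
    future.result(timeout=30)  # wait for messages, then give up
except FutureTimeoutError:
    future.cancel()  # cancel the subscription when processed, as the fix above does
```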
* Simplify noxfile setup. [(#2806)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/2806)
  * chore(deps): update dependency requests to v2.23.0
  * Simplify noxfile and add version control.
  * Configure appengine/standard to only test Python 2.7.
  * Update Kokoro configs to match noxfile.
  * Add requirements-test to each folder.
  * Remove Py2 versions from everything except appengine/standard.
  * Remove conftest.py.
  * Remove appengine/standard/conftest.py
  * Remove 'no-sucess-flaky-report' from pytest.ini.
  * Add GAE SDK back to appengine/standard tests.
  * Fix typo.
  * Roll pytest to python 2 version.
  * Add a bunch of testing requirements.
  * Remove typo.
  * Add appengine lib directory back in.
  * Add some additional requirements.
  * Fix issue with flake8 args.
  * Even more requirements.
  * Readd appengine conftest.py.
  * Add a few more requirements.
  * Even more Appengine requirements.
  * Add webtest for appengine/standard/mailgun.
  * Add some additional requirements.
  * Add workaround for issue with mailjet-rest.
  * Add responses for appengine/standard/mailjet.
  Co-authored-by: Renovate Bot
* [dlp] fix: fix periodic builds timeout [(#3420)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3420)
  * [dlp] fix: remove gcp-devrel-py-tools. Fixes #3375, fixes #3416, fixes #3417.
  * remove wrong usage of `eventually_consistent.call`
  * only test if the operation has been started
  * shorter timeout for polling
  * correct use of `pytest.mark.flaky`
  * use try-finally
  * use uuid for job_id
  * add a filter to allow state = DONE
* chore(deps): update dependency google-cloud-dlp to v0.14.0 [(#3431)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3431) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-dlp](https://togithub.com/googleapis/python-dlp) | minor | `==0.13.0` -> `==0.14.0` |

---

### Release Notes
googleapis/python-dlp

### [`v0.14.0`](https://togithub.com/googleapis/python-dlp/blob/master/CHANGELOG.md#0140-httpswwwgithubcomgoogleapispython-dlpcomparev0130v0140-2020-02-21)

[Compare Source](https://togithub.com/googleapis/python-dlp/compare/v0.13.0...v0.14.0)

##### Features

- **dlp:** undeprecate resource name helper methods, add 2.7 deprecation warning (via synth) ([#10040](https://www.github.com/googleapis/python-dlp/issues/10040)) ([b30d7c1](https://www.github.com/googleapis/python-dlp/commit/b30d7c1cd48fba47fdddb7b9232e421261108a52))
* Update dependency google-cloud-datastore to v1.12.0 [(#3296)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3296) Co-authored-by: gcf-merge-on-green[bot] <60162190+gcf-merge-on-green[bot]@users.noreply.github.com>
* Update dependency google-cloud-pubsub to v1.4.2 [(#3340)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3340) Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>
* chore(deps): update dependency google-cloud-storage to v1.28.0 [(#3260)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3260) Co-authored-by: Takashi Matsuo
* [dlp] fix: increase the number of retries for some tests [(#3685)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3685) fixes #3673
* chore: some lint fixes [(#3744)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3744)
* chore(deps): update dependency google-cloud-pubsub to v1.4.3 [(#3725)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3725) Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com> Co-authored-by: Takashi Matsuo
* chore(deps): update dependency google-cloud-dlp to v0.15.0 [(#3780)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3780)
* chore(deps): update dependency google-cloud-storage to v1.28.1 [(#3785)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3785)
  * chore(deps): update dependency google-cloud-storage to v1.28.1
  * [asset] testing: use uuid instead of time
  Co-authored-by: Takashi Matsuo
* chore(deps): update dependency google-cloud-pubsub to v1.5.0 [(#3781)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3781) Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
* [dlp] fix: mitigate flakiness [(#3919)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3919)
  * [dlp] fix: mitigate flakiness
  * make the Pub/Sub fixture function level
  * shorten the timeout for the tests from 300 secs to 30 secs
  * retrying all the tests in risk_test.py 3 times (a flaky-marker sketch follows this chunk)
  fixes #3897 fixes #3896 fixes #3895 fixes #3894 fixes #3893 fixes #3892 fixes #3890 fixes #3889
  * more retries, comment
  * 30 seconds operation wait and 20 minutes retry delay
  * lint fix etc
  * limit the max retry wait time
* [dlp] testing: fix Pub/Sub notifications [(#3925)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3925)
  * re-generated README.rst with some more setup info
  * use parent with the global location attached
  * re-enabled some tests with Pub/Sub notification
  * stop waiting between test retries
* Add text redaction sample using DLP [(#3964)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3964)
  * Add text redaction sample using DLP
  * Update dlp/deid.py Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
  * Rename string parameter to item
  Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
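The retry bullets above rely on the flaky pytest plugin (pinned elsewhere in this log). A minimal sketch of the marker usage, assuming that plugin is installed; the test name and body are placeholders, not the real risk_test.py tests:

```python
import pytest

@pytest.mark.flaky(max_runs=3, min_passes=1)  # rerun up to 3 times before failing
def test_risk_analysis_placeholder():
    # Placeholder body; the real tests poll a DLP risk analysis job.
    assert True
```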
* testing: start using btlr [(#3959)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3959) The binary is at gs://cloud-devrel-kokoro-resources/btlr/v0.0.1/btlr
  * add period after DIFF_FROM
  * use array for btlr args
  * fix websocket tests
  * add debug message
  * wait longer for the server to spin up
  * dlp: bump the wait timeout to 10 minutes
  * [run] copy noxfile.py to child directory to avoid gcloud issue
  * [iam] fix: only display description when the key exists
  * use uuid4 instead of uuid1
  * [iot] testing: use the same format for registry id
  * Stop asserting Out of memory not in the output
  * fix missing imports
  * [dns] testing: more retries with delay
  * [dlp] testing: longer timeout
  * use the max-concurrency flag
  * use 30 workers
  * [monitoring] use multiple projects
  * [dlp] testing: longer timeout
* Add code sample for string replacement based deidentification. [(#3956)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3956) Adds a code sample corresponding to the replacement-based deidentification in the Cloud DLP API. The detected sensitive value is replaced with a specified surrogate. (A sketch follows the release notes below.)
* Add custom infoType snippets to DLP samples [(#3991)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/3991)
* Replace GCLOUD_PROJECT with GOOGLE_CLOUD_PROJECT. [(#4022)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4022)
* Rename DLP code samples from 'redact' to 'replace' [(#4020)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4020) In the DLP API, redaction and replacement are two separate, named concepts. Code samples recently added by #3964 were named 'redact' but are actually examples of replacement. This change renames those samples for clarity.
* Add DLP sample for redacting all image text [(#4018)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4018) The sample shows how to remove all text found in an image with DLP. The sample is integrated into the existing redact.py CLI application.
* Add DLP sample code for inspecting with custom regex detector [(#4031)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4031)
  * code sample and test for medical record number custom regex detector
  * fix linter error
  * Using f-strings instead of string.format
  Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
  Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
  Co-authored-by: gcf-merge-on-green[bot] <60162190+gcf-merge-on-green[bot]@users.noreply.github.com>
* Update dependency google-cloud-dlp to v1 [(#4047)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4047)
* Update dependency google-cloud-bigquery to v1.25.0 [(#4024)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4024) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-bigquery](https://togithub.com/googleapis/python-bigquery) | minor | `==1.24.0` -> `==1.25.0` |

---

### Release Notes
googleapis/python-bigquery

### [`v1.25.0`](https://togithub.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1250-httpswwwgithubcomgoogleapispython-bigquerycomparev1240v1250-2020-06-06)

[Compare Source](https://togithub.com/googleapis/python-bigquery/compare/v1.24.0...v1.25.0)

##### Features

- add BigQuery storage client support to DB API ([#36](https://www.github.com/googleapis/python-bigquery/issues/36)) ([ba9b2f8](https://www.github.com/googleapis/python-bigquery/commit/ba9b2f87e36320d80f6f6460b77e6daddb0fa214))
- **bigquery:** add create job method ([#32](https://www.github.com/googleapis/python-bigquery/issues/32)) ([2abdef8](https://www.github.com/googleapis/python-bigquery/commit/2abdef82bed31601d1ca1aa92a10fea1e09f5297))
- **bigquery:** add support of model for extract job ([#71](https://www.github.com/googleapis/python-bigquery/issues/71)) ([4a7a514](https://www.github.com/googleapis/python-bigquery/commit/4a7a514659a9f6f9bbd8af46bab3f8782d6b4b98))
- add HOUR support for time partitioning interval ([#91](https://www.github.com/googleapis/python-bigquery/issues/91)) ([0dd90b9](https://www.github.com/googleapis/python-bigquery/commit/0dd90b90e3714c1d18f8a404917a9454870e338a))
- add support for policy tags ([#77](https://www.github.com/googleapis/python-bigquery/issues/77)) ([38a5c01](https://www.github.com/googleapis/python-bigquery/commit/38a5c01ca830daf165592357c45f2fb4016aad23))
- make AccessEntry objects hashable ([#93](https://www.github.com/googleapis/python-bigquery/issues/93)) ([23a173b](https://www.github.com/googleapis/python-bigquery/commit/23a173bc5a25c0c8200adc5af62eb05624c9099e))
- **bigquery:** expose start index parameter for query result ([#121](https://www.github.com/googleapis/python-bigquery/issues/121)) ([be86de3](https://www.github.com/googleapis/python-bigquery/commit/be86de330a3c3801653a0ccef90e3d9bdb3eee7a))
- **bigquery:** unit and system test for dataframe with int column with Nan values ([#39](https://www.github.com/googleapis/python-bigquery/issues/39)) ([5fd840e](https://www.github.com/googleapis/python-bigquery/commit/5fd840e9d4c592c4f736f2fd4792c9670ba6795e))

##### Bug Fixes

- allow partial streaming_buffer statistics ([#37](https://www.github.com/googleapis/python-bigquery/issues/37)) ([645f0fd](https://www.github.com/googleapis/python-bigquery/commit/645f0fdb35ee0e81ee70f7459e796a42a1f03210))
- distinguish server timeouts from transport timeouts ([#43](https://www.github.com/googleapis/python-bigquery/issues/43)) ([a17be5f](https://www.github.com/googleapis/python-bigquery/commit/a17be5f01043f32d9fbfb2ddf456031ea9205c8f))
- improve cell magic error message on missing query ([#58](https://www.github.com/googleapis/python-bigquery/issues/58)) ([6182cf4](https://www.github.com/googleapis/python-bigquery/commit/6182cf48aef8f463bb96891cfc44a96768121dbc))
- **bigquery:** fix repr of model reference ([#66](https://www.github.com/googleapis/python-bigquery/issues/66)) ([26c6204](https://www.github.com/googleapis/python-bigquery/commit/26c62046f4ec8880cf6561cc90a8b821dcc84ec5))
- **bigquery:** fix start index with page size for list rows ([#27](https://www.github.com/googleapis/python-bigquery/issues/27)) ([400673b](https://www.github.com/googleapis/python-bigquery/commit/400673b5d0f2a6a3d828fdaad9d222ca967ffeff))
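Relating to the string-replacement deidentification sample noted before these release notes: in the DLP API the primitive transformation is a replace_config carrying the new value. A hedged sketch of the config shape only, with an illustrative surrogate string:

```python
# Replace each detected finding with a fixed surrogate value.
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {
                "primitive_transformation": {
                    "replace_config": {"new_value": {"string_value": "[REDACTED]"}}
                }
            }
        ]
    }
}
```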
* Add code sample and tests for redaction [(#4037)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4037) Add a DLP code sample for redacting text. Code will be linked to this documentation: https://cloud.google.com/dlp/docs/deidentify-sensitive-data
* dlp: add inspect string sample, person_name w/ custom hotword certainty boosting [(#4081)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4081)
* Add a simplified inspect string example to DLP code samples [(#4069)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4069) (see the sketch after this chunk)
  * Add a simplified inspect string example
  * Remove unnecessary try-catch block - all findings in this example should have quotes.
* dlp: Add sample for reid w/ fpe using surrogate type and unwrapped security key [(#4051)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4051)
  * add code sample and test for reid w/ fpe using surrogate type and unwrapped security key
  * refactor reidentify_config
* add code sample and test for medical number custom detector with hotwords [(#4071)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4071) Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
* Add DLP code sample and test for de-id free text with surrogate [(#4085)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4085)

  ## Description
  Add DLP code sample and test for de-id free text with surrogate, meant for https://cloud.google.com/dlp/docs/pseudonymization#de-identification_in_free_text_code_example

  ## Checklist
  - [x] I have followed [Sample Guidelines from AUTHORING_GUIDE.MD](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/AUTHORING_GUIDE.md)
  - [ ] README is updated to include [all relevant information](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/AUTHORING_GUIDE.md#readme-file)
  - [x] **Tests** pass: `nox -s py-3.6` (see [Test Environment Setup](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/AUTHORING_GUIDE.md#test-environment-setup))
  - [x] **Lint** pass: `nox -s lint` (see [Test Environment Setup](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/AUTHORING_GUIDE.md#test-environment-setup))
  - [ ] These samples need a new **API enabled** in testing projects to pass (let us know which ones)
  - [ ] These samples need a new/updated **env vars** in testing projects set to pass (let us know which ones)
  - [x] Please **merge** this PR for me once it is approved.
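A minimal sketch of the simplified inspect-string flow referenced above, assuming the microgen (v2+) google-cloud-dlp client; the project id and text are illustrative:

```python
from google.cloud import dlp_v2

dlp_client = dlp_v2.DlpServiceClient()
parent = "projects/my-project/locations/global"  # illustrative project id
item = {"value": "My phone number is (415) 555-0100."}  # illustrative text
inspect_config = {
    "info_types": [{"name": "PHONE_NUMBER"}],
    "include_quote": True,  # so findings carry the matched quote
}
response = dlp_client.inspect_content(
    request={"parent": parent, "inspect_config": inspect_config, "item": item}
)
for finding in response.result.findings:
    print(finding.quote, finding.info_type.name, finding.likelihood)
```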
* chore(deps): update dependency google-cloud-storage to v1.29.0 [(#4040)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4040)
* Update dependency google-cloud-pubsub to v1.6.0 [(#4039)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4039) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-pubsub](https://togithub.com/googleapis/python-pubsub) | minor | `==1.5.0` -> `==1.6.0` |

---

### Release Notes
googleapis/python-pubsub

### [`v1.6.0`](https://togithub.com/googleapis/python-pubsub/blob/master/CHANGELOG.md#160-httpswwwgithubcomgoogleapispython-pubsubcomparev150v160-2020-06-09)

[Compare Source](https://togithub.com/googleapis/python-pubsub/compare/v1.5.0...v1.6.0)

##### Features

- Add flow control for message publishing ([#96](https://www.github.com/googleapis/python-pubsub/issues/96)) ([06085c4](https://www.github.com/googleapis/python-pubsub/commit/06085c4083b9dccdd50383257799904510bbf3a0))

##### Bug Fixes

- Fix PubSub incompatibility with api-core 1.17.0+ ([#103](https://www.github.com/googleapis/python-pubsub/issues/103)) ([c02060f](https://www.github.com/googleapis/python-pubsub/commit/c02060fbbe6e2ca4664bee08d2de10665d41dc0b))

##### Documentation

- Clarify that Schedulers shouldn't be used with multiple SubscriberClients ([#100](https://togithub.com/googleapis/python-pubsub/pull/100)) ([cf9e87c](https://togithub.com/googleapis/python-pubsub/commit/cf9e87c80c0771f3fa6ef784a8d76cb760ad37ef))
- Fix update subscription/snapshot/topic samples ([#113](https://togithub.com/googleapis/python-pubsub/pull/113)) ([e62c38b](https://togithub.com/googleapis/python-pubsub/commit/e62c38bb33de2434e32f866979de769382dea34a))

##### Internal / Testing Changes

- Re-generated service implementation using synth: removed experimental notes from the RetryPolicy and filtering features in anticipation of GA, added DetachSubscription (experimental) ([#114](https://togithub.com/googleapis/python-pubsub/pull/114)) ([0132a46](https://togithub.com/googleapis/python-pubsub/commit/0132a4680e0727ce45d5e27d98ffc9f3541a0962))
- Incorporate will_accept() checks into publish() ([#108](https://togithub.com/googleapis/python-pubsub/pull/108)) ([6c7677e](https://togithub.com/googleapis/python-pubsub/commit/6c7677ecb259672bbb9b6f7646919e602c698570))
* [dlp] fix: add retry count to mitigate the flake [(#4152)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4152) fixes #4100
* chore(deps): update dependency google-cloud-pubsub to v1.6.1 [(#4242)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4242) Co-authored-by: gcf-merge-on-green[bot] <60162190+gcf-merge-on-green[bot]@users.noreply.github.com>
* chore(deps): update dependency google-cloud-datastore to v1.13.0 [(#4273)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4273)
* chore(deps): update dependency pytest to v5.4.3 [(#4279)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4279)
  * chore(deps): update dependency pytest to v5.4.3
  * specify pytest for python 2 in appengine
  Co-authored-by: Leah Cole
* chore(deps): update dependency mock to v4 [(#4287)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4287)
  * chore(deps): update dependency mock to v4
  * specify mock version for appengine python 2
  Co-authored-by: Leah Cole
* chore(deps): update dependency google-cloud-pubsub to v1.7.0 [(#4290)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4290) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-pubsub](https://togithub.com/googleapis/python-pubsub) | minor | `==1.6.1` -> `==1.7.0` |

---

### Release Notes
googleapis/python-pubsub

### [`v1.7.0`](https://togithub.com/googleapis/python-pubsub/blob/master/CHANGELOG.md#170-httpswwwgithubcomgoogleapispython-pubsubcomparev161v170-2020-07-13)

[Compare Source](https://togithub.com/googleapis/python-pubsub/compare/v1.6.1...v1.7.0)

##### New Features

- Add support for server-side flow control. ([#143](https://togithub.com/googleapis/python-pubsub/pull/143)) ([04e261c](https://www.github.com/googleapis/python-pubsub/commit/04e261c602a2919cc75b3efa3dab099fb2cf704c))

##### Dependencies

- Update samples dependency `google-cloud-pubsub` to `v1.6.1`. ([#144](https://togithub.com/googleapis/python-pubsub/pull/144)) ([1cb6746](https://togithub.com/googleapis/python-pubsub/commit/1cb6746b00ebb23dbf1663bae301b32c3fc65a88))

##### Documentation

- Add pubsub/cloud-client samples from the common samples repo (with commit history). ([#151](https://togithub.com/googleapis/python-pubsub/pull/151))
- Add flow control section to publish overview. ([#129](https://togithub.com/googleapis/python-pubsub/pull/129)) ([acc19eb](https://www.github.com/googleapis/python-pubsub/commit/acc19eb048eef067d9818ef3e310b165d9c6307e))
- Add a link to Pub/Sub filtering language public documentation to `pubsub.proto`. ([#121](https://togithub.com/googleapis/python-pubsub/pull/121)) ([8802d81](https://www.github.com/googleapis/python-pubsub/commit/8802d8126247f22e26057e68a42f5b5a82dcbf0d))
* Update dependency flaky to v3.7.0 [(#4300)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4300)
* Update dependency google-cloud-datastore to v1.13.1 [(#4295)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4295) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-datastore](https://togithub.com/googleapis/python-datastore) | patch | `==1.13.0` -> `==1.13.1` |

---

### Release Notes
googleapis/python-datastore

### [`v1.13.1`](https://togithub.com/googleapis/python-datastore/blob/master/CHANGELOG.md#1131-httpswwwgithubcomgoogleapispython-datastorecomparev1130v1131-2020-07-13)

[Compare Source](https://togithub.com/googleapis/python-datastore/compare/v1.13.0...v1.13.1)
* chore(deps): update dependency google-cloud-datastore to v1.13.2 [(#4326)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4326)
* Update dependency google-cloud-storage to v1.30.0
* Update dependency pytest to v6 [(#4390)](https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4390)
* chore: update templates
* chore: update synth.py
* chore: update project env name

Co-authored-by: Andrew Gorcester
Co-authored-by: DPE bot
Co-authored-by: chenyumic
Co-authored-by: Frank Natividad
Co-authored-by: Mike DaCosta
Co-authored-by: michaelawyu
Co-authored-by: mwdaub
Co-authored-by: realjordanna <32629229+realjordanna@users.noreply.github.com>
Co-authored-by: Ace
Co-authored-by: djmailhot
Co-authored-by: Charles Engelke
Co-authored-by: Maximus
Co-authored-by: Averi Kitsch
Co-authored-by: Gus Class
Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>
Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
Co-authored-by: WhiteSource Renovate
Co-authored-by: Leah Cole
Co-authored-by: Takashi Matsuo
Co-authored-by: gcf-merge-on-green[bot] <60162190+gcf-merge-on-green[bot]@users.noreply.github.com>
Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
Co-authored-by: Seth Moore
Co-authored-by: Ace
Co-authored-by: Seth Moore
Co-authored-by: jlmwise <66651702+jlmwise@users.noreply.github.com>
Co-authored-by: Xiaohua (Victor) Liang
Co-authored-by: Xiaohua (Victor) Liang
Co-authored-by: Charles Engelke

* feat!: migrate to use microgen (#34) (a before/after sketch follows the bigquery release notes below)
  * feat!: migrate to use microgen
  * Update UPGRADING.md Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
  Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com>
* chore(deps): update dependency google-cloud-dlp to v2 (#43)
* chore(deps): update dependency google-cloud-bigquery to v1.27.2 (#36) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-bigquery](https://togithub.com/googleapis/python-bigquery) | minor | `==1.25.0` -> `==1.27.2` |

---

### Release Notes
googleapis/python-bigquery

### [`v1.27.2`](https://togithub.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1272-httpswwwgithubcomgoogleapispython-bigquerycomparev1271v1272-2020-08-18)

[Compare Source](https://togithub.com/googleapis/python-bigquery/compare/v1.26.1...v1.27.2)

### [`v1.26.1`](https://togithub.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1261-httpswwwgithubcomgoogleapispython-bigquerycomparev1260v1261-2020-07-25)

[Compare Source](https://togithub.com/googleapis/python-bigquery/compare/v1.26.0...v1.26.1)

### [`v1.26.0`](https://togithub.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1260-httpswwwgithubcomgoogleapispython-bigquerycomparev1250v1260-2020-07-20)

[Compare Source](https://togithub.com/googleapis/python-bigquery/compare/v1.25.0...v1.26.0)

##### Features

- use BigQuery Storage client by default (if dependencies available) ([#55](https://www.github.com/googleapis/python-bigquery/issues/55)) ([e75ff82](https://www.github.com/googleapis/python-bigquery/commit/e75ff8297c65981545b097f75a17cf9e78ac6772)), closes [#91](https://www.github.com/googleapis/python-bigquery/issues/91)
- **bigquery:** add **eq** method for class PartitionRange and RangePartitioning ([#162](https://www.github.com/googleapis/python-bigquery/issues/162)) ([0d2a88d](https://www.github.com/googleapis/python-bigquery/commit/0d2a88d8072154cfc9152afd6d26a60ddcdfbc73))
- **bigquery:** expose date_as_object parameter to users ([#150](https://www.github.com/googleapis/python-bigquery/issues/150)) ([a2d5ce9](https://www.github.com/googleapis/python-bigquery/commit/a2d5ce9e97992318d7dc85c51c053cab74e25a11))
- **bigquery:** expose date_as_object parameter to users ([#150](https://www.github.com/googleapis/python-bigquery/issues/150)) ([cbd831e](https://www.github.com/googleapis/python-bigquery/commit/cbd831e08024a67148723afd49e1db085e0a862c))

##### Bug Fixes

- dry run queries with DB API cursor ([#128](https://www.github.com/googleapis/python-bigquery/issues/128)) ([bc33a67](https://www.github.com/googleapis/python-bigquery/commit/bc33a678a765f0232615aa2038b8cc67c88468a0))
- omit `NaN` values when uploading from `insert_rows_from_dataframe` ([#170](https://www.github.com/googleapis/python-bigquery/issues/170)) ([f9f2f45](https://www.github.com/googleapis/python-bigquery/commit/f9f2f45bc009c03cd257441bd4b6beb1754e2177))

##### Documentation

- **bigquery:** add client thread-safety documentation ([#132](https://www.github.com/googleapis/python-bigquery/issues/132)) ([fce76b3](https://www.github.com/googleapis/python-bigquery/commit/fce76b3776472b1da798df862a3405e659e35bab))
- **bigquery:** add docstring for conflict exception ([#171](https://www.github.com/googleapis/python-bigquery/issues/171)) ([9c3409b](https://www.github.com/googleapis/python-bigquery/commit/9c3409bb06218bf499620544f8e92802df0cce47))
- **bigquery:** consistent use of optional keyword ([#153](https://www.github.com/googleapis/python-bigquery/issues/153)) ([79d8c61](https://www.github.com/googleapis/python-bigquery/commit/79d8c61064cca18b596a24b6f738c7611721dd5c))
- **bigquery:** fix the broken docs ([#139](https://www.github.com/googleapis/python-bigquery/issues/139)) ([3235255](https://www.github.com/googleapis/python-bigquery/commit/3235255cc5f483949f34d2e8ef13b372e8713782))
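The microgen migration flagged above (feat!: migrate to use microgen, #34) changed call signatures from positional and keyword arguments to a single request dict. A hedged before/after sketch with illustrative arguments; consult the repo's UPGRADING.md for the authoritative mapping:

```python
from google.cloud import dlp_v2

dlp_client = dlp_v2.DlpServiceClient()
parent = "projects/my-project"  # illustrative

# Pre-2.0 (GAPIC) style, roughly:
#   dlp_client.inspect_content(parent, inspect_config=config, item=item)
# 2.0+ (microgen) style, everything inside one request dict:
response = dlp_client.inspect_content(
    request={"parent": parent, "inspect_config": {}, "item": {"value": "test"}}
)
```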
* chore(deps): update dependency google-cloud-datastore to v1.15.0 (#42) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-datastore](https://togithub.com/googleapis/python-datastore) | minor | `==1.13.2` -> `==1.15.0` |

---

### Release Notes
googleapis/python-datastore

### [`v1.15.0`](https://togithub.com/googleapis/python-datastore/blob/master/CHANGELOG.md#1150-httpswwwgithubcomgoogleapispython-datastorecomparev1140v1150-2020-08-14)

[Compare Source](https://togithub.com/googleapis/python-datastore/compare/v1.14.0...v1.15.0)

##### Features

- add retry and timeout args to API methods ([#67](https://www.github.com/googleapis/python-datastore/issues/67)) ([f3283e1](https://www.github.com/googleapis/python-datastore/commit/f3283e14c34c36c8386e4cb6b43c109d469f118c)), closes [#3](https://www.github.com/googleapis/python-datastore/issues/3)
- supply anonymous credentials under emulator ([#71](https://www.github.com/googleapis/python-datastore/issues/71)) ([4db3c40](https://www.github.com/googleapis/python-datastore/commit/4db3c4048e53c220eee0aea2063c05292bbc5334)), closes [#70](https://www.github.com/googleapis/python-datastore/issues/70)

##### Bug Fixes

- smooth over system test bumps ([#66](https://www.github.com/googleapis/python-datastore/issues/66)) ([8bb17ea](https://www.github.com/googleapis/python-datastore/commit/8bb17ea30ed94c0a298a54cc75c031b67d0a576a))

##### Documentation

- add docs for admin client ([#63](https://www.github.com/googleapis/python-datastore/issues/63)) ([43ff64a](https://www.github.com/googleapis/python-datastore/commit/43ff64a5889aeac321fbead967ec527ede414fa2)), closes [#49](https://www.github.com/googleapis/python-datastore/issues/49)

### [`v1.14.0`](https://togithub.com/googleapis/python-datastore/blob/master/CHANGELOG.md#1140-httpswwwgithubcomgoogleapispython-datastorecomparev1132v1140-2020-08-05)

[Compare Source](https://togithub.com/googleapis/python-datastore/compare/v1.13.2...v1.14.0)

##### Features

- pass 'client_options' to base class ctor ([#60](https://www.github.com/googleapis/python-datastore/issues/60)) ([2575697](https://www.github.com/googleapis/python-datastore/commit/2575697380a2e57b210a37033f2558de582ec10e)), closes [#50](https://www.github.com/googleapis/python-datastore/issues/50)

##### Documentation

- correct semantics of 'complete_key' arg to 'Client.reserve_ids' ([#36](https://www.github.com/googleapis/python-datastore/issues/36)) ([50ed945](https://www.github.com/googleapis/python-datastore/commit/50ed94503da244434df0be58098a0ccf2da54b16))
- update docs build (via synth) ([#58](https://www.github.com/googleapis/python-datastore/issues/58)) ([5bdacd4](https://www.github.com/googleapis/python-datastore/commit/5bdacd4785f3d433e6e7302fc6839a3c5a3314b4)), closes [#700](https://www.github.com/googleapis/python-datastore/issues/700)

##### [1.13.2](https://www.github.com/googleapis/python-datastore/compare/v1.13.1...v1.13.2) (2020-07-17)

##### Bug Fixes

- modify admin pkg name in gapic ([#47](https://www.github.com/googleapis/python-datastore/issues/47)) ([5b5011d](https://www.github.com/googleapis/python-datastore/commit/5b5011daf74133ecdd579bf19bbcf356e6f40dad))

##### [1.13.1](https://www.github.com/googleapis/python-datastore/compare/v1.13.0...v1.13.1) (2020-07-13)

##### Bug Fixes

- add missing datastore admin client files ([#43](https://www.github.com/googleapis/python-datastore/issues/43)) ([0d40f87](https://www.github.com/googleapis/python-datastore/commit/0d40f87eeacd2a256d4b45ccb742599b5df93096))
* chore(deps): update dependency google-cloud-storage to v1.31.2 (#44)
* chore(deps): update dependency google-cloud-bigquery to v2 (#62)
* fix(sample-test): add backoff around the cleanup code (#65)
* chore(deps): update dependency google-cloud-datastore to v1.15.3 (#59)
* chore(deps): update dependency google-cloud-storage to v1.32.0 (#67)
* chore(deps): update dependency google-cloud-bigquery to v2.2.0 (#66) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-bigquery](https://togithub.com/googleapis/python-bigquery) | minor | `==2.1.0` -> `==2.2.0` |

---

### Release Notes
googleapis/python-bigquery

### [`v2.2.0`](https://togithub.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#220-httpswwwgithubcomgoogleapispython-bigquerycomparev210v220-2020-10-19)

[Compare Source](https://togithub.com/googleapis/python-bigquery/compare/v2.1.0...v2.2.0)

##### Features

- add method api_repr for table list item ([#299](https://www.github.com/googleapis/python-bigquery/issues/299)) ([07c70f0](https://www.github.com/googleapis/python-bigquery/commit/07c70f0292f9212f0c968cd5c9206e8b0409c0da))
- add support for listing arima, automl, boosted tree, DNN, and matrix factorization models ([#328](https://www.github.com/googleapis/python-bigquery/issues/328)) ([502a092](https://www.github.com/googleapis/python-bigquery/commit/502a0926018abf058cb84bd18043c25eba15a2cc))
- add timeout parameter to load_table_from_file and its dependent methods ([#327](https://www.github.com/googleapis/python-bigquery/issues/327)) ([b0dd892](https://www.github.com/googleapis/python-bigquery/commit/b0dd892176e31ac25fddd15554b5bfa054299d4d))
- add to_api_repr method to Model ([#326](https://www.github.com/googleapis/python-bigquery/issues/326)) ([fb401bd](https://www.github.com/googleapis/python-bigquery/commit/fb401bd94477323bba68cf252dd88166495daf54))
- allow client options to be set in magics context ([#322](https://www.github.com/googleapis/python-bigquery/issues/322)) ([5178b55](https://www.github.com/googleapis/python-bigquery/commit/5178b55682f5e264bfc082cde26acb1fdc953a18))

##### Bug Fixes

- make TimePartitioning repr evaluable ([#110](https://www.github.com/googleapis/python-bigquery/issues/110)) ([20f473b](https://www.github.com/googleapis/python-bigquery/commit/20f473bfff5ae98377f5d9cdf18bfe5554d86ff4)), closes [#109](https://www.github.com/googleapis/python-bigquery/issues/109)
- use version.py instead of pkg_resources.get_distribution ([#307](https://www.github.com/googleapis/python-bigquery/issues/307)) ([b8f502b](https://www.github.com/googleapis/python-bigquery/commit/b8f502b14f21d1815697e4d57cf1225dfb4a7c5e))

##### Performance Improvements

- add size parameter for load table from dataframe and json methods ([#280](https://www.github.com/googleapis/python-bigquery/issues/280)) ([3be78b7](https://www.github.com/googleapis/python-bigquery/commit/3be78b737add7111e24e912cd02fc6df75a07de6))

##### Documentation

- update clustering field docstrings ([#286](https://www.github.com/googleapis/python-bigquery/issues/286)) ([5ea1ece](https://www.github.com/googleapis/python-bigquery/commit/5ea1ece2d911cdd1f3d9549ee01559ce8ed8269a)), closes [#285](https://www.github.com/googleapis/python-bigquery/issues/285)
- update snippets samples to support version 2.0 ([#309](https://www.github.com/googleapis/python-bigquery/issues/309)) ([61634be](https://www.github.com/googleapis/python-bigquery/commit/61634be9bf9e3df7589fc1bfdbda87288859bb13))

##### Dependencies

- add protobuf dependency ([#306](https://www.github.com/googleapis/python-bigquery/issues/306)) ([cebb5e0](https://www.github.com/googleapis/python-bigquery/commit/cebb5e0e911e8c9059bc8c9e7fce4440e518bff3)), closes [#305](https://www.github.com/googleapis/python-bigquery/issues/305)
- require pyarrow for pandas support ([#314](https://www.github.com/googleapis/python-bigquery/issues/314)) ([801e4c0](https://www.github.com/googleapis/python-bigquery/commit/801e4c0574b7e421aa3a28cafec6fd6bcce940dd)), closes [#265](https://www.github.com/googleapis/python-bigquery/issues/265)
* docs(samples): fix README to accurately reflect the new repo after the move (#72) Fix README to accurately reflect the new repo after the move.
  - [X] Make sure to open an issue as a [bug/issue](https://github.com/googleapis/python-dlp/issues/new/choose) before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  - [ ] *N/A* ~Ensure the tests and linter pass~
  - [ ] *N/A* ~Code coverage does not decrease (if any source code was changed)~
  - [X] Appropriate docs were updated (if necessary)
  Fixes google internal bug b/173536792
* chore(deps): update dependency google-cloud-storage to v1.33.0 (#71) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-storage](https://togithub.com/googleapis/python-storage) | minor | `==1.32.0` -> `==1.33.0` |

---

### Release Notes
googleapis/python-storage

### [`v1.33.0`](https://togithub.com/googleapis/python-storage/blob/master/CHANGELOG.md#1330-httpswwwgithubcomgoogleapispython-storagecomparev1320v1330-2020-11-16)

[Compare Source](https://togithub.com/googleapis/python-storage/compare/v1.32.0...v1.33.0)

##### Features

- add classifiers for python3.9 and remove for python3.5 ([#295](https://www.github.com/googleapis/python-storage/issues/295)) ([f072825](https://www.github.com/googleapis/python-storage/commit/f072825ce03d774fd95d9fe3db95a8c7130b0e8a))
- add testing support for Python 3.9, drop Python 3.5 ([#313](https://www.github.com/googleapis/python-storage/issues/313)) ([fa14009](https://www.github.com/googleapis/python-storage/commit/fa140092877a277abbb23785657590a274a86d61))

##### Bug Fixes

- use passed-in `client` within `Blob.from_string` and helpers ([#290](https://www.github.com/googleapis/python-storage/issues/290)) ([d457ce3](https://www.github.com/googleapis/python-storage/commit/d457ce3e161555c9117ae288ec0c9cd5f8d5fe3a)), closes [#286](https://www.github.com/googleapis/python-storage/issues/286)
- preserve `metadata` value when uploading new file content ([#298](https://www.github.com/googleapis/python-storage/issues/298)) ([5ab6b0d](https://www.github.com/googleapis/python-storage/commit/5ab6b0d9a2b27ae830740a7a0226fc1e241e9ec4)), closes [#293](https://www.github.com/googleapis/python-storage/issues/293)
* chore(deps): update dependency google-cloud-bigquery to v2.4.0 (#68) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-bigquery](https://togithub.com/googleapis/python-bigquery) | minor | `==2.2.0` -> `==2.4.0` |

---

### Release Notes
googleapis/python-bigquery

### [`v2.4.0`](https://togithub.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#240-httpswwwgithubcomgoogleapispython-bigquerycomparev231v240-2020-11-16)

[Compare Source](https://togithub.com/googleapis/python-bigquery/compare/v2.3.1...v2.4.0)

##### Features

- add progress bar to `QueryJob.to_dataframe` and `to_arrow` ([#352](https://www.github.com/googleapis/python-bigquery/issues/352)) ([dc78edd](https://www.github.com/googleapis/python-bigquery/commit/dc78eddde7a6a312c8fed7bace7d64036837ab1a))
- allow routine references ([#378](https://www.github.com/googleapis/python-bigquery/issues/378)) ([f9480dc](https://www.github.com/googleapis/python-bigquery/commit/f9480dc2a1bc58367083176bd74725aa8b903301))

##### Bug Fixes

- **dbapi:** allow rows to be fetched from scripts ([#387](https://www.github.com/googleapis/python-bigquery/issues/387)) ([b899ad1](https://www.github.com/googleapis/python-bigquery/commit/b899ad12e17cb87c58d3ae46b4388d917c5743f2)), closes [#377](https://www.github.com/googleapis/python-bigquery/issues/377)

##### Performance Improvements

- avoid extra API calls from `to_dataframe` if all rows are cached ([#384](https://www.github.com/googleapis/python-bigquery/issues/384)) ([c52b317](https://www.github.com/googleapis/python-bigquery/commit/c52b31789998fc0dfde07c3296650c85104d719d))
- cache first page of `jobs.getQueryResults` rows ([#374](https://www.github.com/googleapis/python-bigquery/issues/374)) ([86f6a51](https://www.github.com/googleapis/python-bigquery/commit/86f6a516d1c7c5dc204ab085ea2578793e6561ff))
- use `getQueryResults` from DB-API ([#375](https://www.github.com/googleapis/python-bigquery/issues/375)) ([30de15f](https://www.github.com/googleapis/python-bigquery/commit/30de15f7255de5ea221df4e8db7991d279e0ea28))

##### Dependencies

- expand pyarrow dependencies to include version 2 ([#368](https://www.github.com/googleapis/python-bigquery/issues/368)) ([cd9febd](https://www.github.com/googleapis/python-bigquery/commit/cd9febd20c34983781386c3bf603e5fca7135695))

### [`v2.3.1`](https://togithub.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#231)

[Compare Source](https://togithub.com/googleapis/python-bigquery/compare/v2.2.0...v2.3.1) 11-05-2020 09:27 PST

##### Internal / Testing Changes

- update `google.cloud.bigquery.__version__`
* test(samples): retry flaky test 5 times (#77)
* chore(samples): add samples for custom infotype rules (#76) Add missing python samples for https://cloud.google.com/dlp/docs/creating-custom-infotypes-rules#dlp_inspect_string_without_overlap-java
  - [x] Make sure to open an issue as a [bug/issue](https://github.com/googleapis/python-dlp/issues/new/choose) before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  - [x] Ensure the tests and linter pass - Some tests in other files are already broken, but `pytest custom_infotype_test.py` succeeds.
  - [x] Code coverage does not decrease (if any source code was changed)
  - [x] Appropriate docs were updated (if necessary)
  Fixes internal bugs: b/156968601 b/156970772 b/156970547 b/156969966 b/156970576 b/156974339 b/156969968
* fix!: rename fields that collide with builtins; retrieve job config for risk analysis jobs (#75)
  fix: retrieve job config for risk analysis jobs
  fix!: rename fields that collide with builtins (a migration sketch follows the release notes below):
  * `ByteContentItem.type` -> `ByteContentItem.type_`
  * `MetadataLocation.type` -> `MetadataLocation.type_`
  * `Container.type` -> `Container.type_`
  * `Bucket.min` -> `Bucket.min_`
  * `Bucket.max` -> `Bucket.max_`
  * `DlpJob.type` -> `DlpJob.type_`
  * `GetDlpJobRequest.type` -> `GetDlpJobRequest.type_`
* chore(deps): update dependency google-cloud-dlp to v3 (#82) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-dlp](https://togithub.com/googleapis/python-dlp) | major | `==2.0.0` -> `==3.0.0` |

---

### Release Notes
googleapis/python-dlp

### [`v3.0.0`](https://togithub.com/googleapis/python-dlp/blob/master/CHANGELOG.md#300-httpswwwgithubcomgoogleapispython-dlpcomparev200v300-2020-12-02)

[Compare Source](https://togithub.com/googleapis/python-dlp/compare/v2.0.0...v3.0.0)

##### ⚠ BREAKING CHANGES

- rename fields that collide with builtins ([#75](https://togithub.com/googleapis/python-dlp/issues/75))
  - `ByteContentItem.type` -> `ByteContentItem.type_`
  - `MetadataLocation.type` -> `MetadataLocation.type_`
  - `Container.type` -> `Container.type_`
  - `Bucket.min` -> `Bucket.min_`
  - `Bucket.max` -> `Bucket.max_`
  - `DlpJob.type` -> `DlpJob.type_`
  - `GetDlpJobRequest.type` -> `GetDlpJobRequest.type_`

##### Bug Fixes

- rename fields that collide with builtins; retrieve job config for risk analysis jobs ([#75](https://www.github.com/googleapis/python-dlp/issues/75)) ([4f3148e](https://www.github.com/googleapis/python-dlp/commit/4f3148e93ec3dfc9395aa38a3afc62498500a055))

##### Documentation

- **samples:** fix README to accurately reflect the new repo after the move ([#72](https://www.github.com/googleapis/python-dlp/issues/72)) ([dc56806](https://www.github.com/googleapis/python-dlp/commit/dc56806b47f92227e396969d8a583b881aa41fd1))
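For the builtin-colliding field renames listed above, upgrading code swaps a trailing underscore onto the attribute. A sketch under the assumption of the v3 client, with an illustrative job name:

```python
from google.cloud import dlp_v2

dlp_client = dlp_v2.DlpServiceClient()
job = dlp_client.get_dlp_job(
    request={"name": "projects/my-project/dlpJobs/i-1234567890"}  # illustrative
)

# v2 and earlier: job.type
# v3 and later:   job.type_  (no longer shadows the `type` builtin)
print(job.type_)
```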
* chore(deps): update dependency google-cloud-bigquery to v2.5.0 (#83)
* chore(deps): update dependency google-cloud-bigquery to v2.6.0 (#85)
* chore(deps): update dependency google-cloud-bigquery to v2.6.1 (#86) This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-bigquery](https://togithub.com/googleapis/python-bigquery) | patch | `==2.6.0` -> `==2.6.1` |

---

### Release Notes
googleapis/python-bigquery

### [`v2.6.1`](https://togithub.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#261-httpswwwgithubcomgoogleapispython-bigquerycomparev260v261-2020-12-09)

[Compare Source](https://togithub.com/googleapis/python-bigquery/compare/v2.6.0...v2.6.1)
* chore(deps): update dependency google-cloud-storage to v1.35.0 (#88)
* fix: remove gRPC send/recv limits; add enums to `types/__init__.py` (#89)
* chore: update templates (#87)
* chore(deps): update dependency google-cloud-pubsub to v2 (#54)
  * chore(deps): update dependency google-cloud-pubsub to v2
  * pubsub v2 fix
  * run blacken
  Co-authored-by: Leah Cole
* chore(deps): update dependency google-cloud-datastore to v2 (#69)
* chore(deps): update dependency google-cloud-datastore to v2.1.0 (#93)
* chore(deps): update dependency google-cloud-bigquery to v2.6.2 (#97)
* chore(deps): update dependency google-cloud-bigquery to v2.7.0 (#100)
* chore(deps): update dependency google-cloud-dlp to v3.0.1 (#102)
* chore(deps): update dependency google-cloud-bigquery to v2.8.0 (#107)
* chore(deps): update dependency google-cloud-pubsub to v2.3.0 (#106)
* chore(deps): update dependency google-cloud-storage to v1.36.0 (#105)
* chore(deps): update dependency google-cloud-bigquery to v2.9.0 (#109)
* chore(deps): update dependency google-cloud-storage to v1.36.1 (#111)
* chore(deps): update dependency google-cloud-pubsub to v2.4.0 (#110) This PR contains the following updates:

| Package | Change |
|---|---|
| [google-cloud-pubsub](https://togithub.com/googleapis/python-pubsub) | `==2.3.0` -> `==2.4.0` |

---

### Release Notes
  Release notes: googleapis/python-pubsub [`v2.4.0`](https://togithub.com/googleapis/python-pubsub/compare/v2.3.0...v2.4.0) (2021-02-22)
  * feat: add graceful streaming pull shutdown ([#292](https://togithub.com/googleapis/python-pubsub/pull/292))
  * docs: update samples to use the subscriber client as a context manager ([#254](https://togithub.com/googleapis/python-pubsub/pull/254))
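The context-manager pattern those samples moved to pairs naturally with the graceful shutdown added in this release. A minimal sketch of the combined pattern; the project, subscription, and timeout are illustrative, not taken from the samples themselves:

```python
from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1

# Hypothetical names for illustration only.
subscription_path = "projects/my-project/subscriptions/my-sub"

def callback(message):
    print(f"Received {message.data!r}.")
    message.ack()

# The subscriber client can be used as a context manager; its underlying
# channel is closed when the block exits.
with pubsub_v1.SubscriberClient() as subscriber:
    streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
    try:
        streaming_pull_future.result(timeout=30)
    except TimeoutError:
        # cancel() stops pulling new messages; the follow-up result() blocks
        # until the graceful shutdown added in v2.4.0 finishes outstanding work.
        streaming_pull_future.cancel()
        streaming_pull_future.result()
```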
* chore(deps): update dependency google-cloud-bigquery to v2.10.0 (#112)
* chore(deps): update dependency google-cloud-bigquery to v2.11.0 (#113)
* chore(deps): update dependency google-cloud-storage to v1.36.2 (#114)
* chore(deps): update dependency google-cloud-bigquery to v2.12.0 (#116)
* chore(deps): update dependency google-cloud-bigquery to v2.13.0 (#117)
* chore(deps): update dependency google-cloud-bigquery to v2.13.1 (#118)
* chore(deps): update dependency google-cloud-storage to v1.37.0 (#120)
* chore(deps): update dependency google-cloud-pubsub to v2.4.1 (#121)
* feat: crypto_deterministic_config (#108) (#119)
  Example of Crypto Deterministic Config using the supported methods listed at https://cloud.google.com/dlp/docs/pseudonymization#supported-methods, to resolve https://github.com/googleapis/python-dlp/issues/108
* fix: use correct retry deadlines (#96)
  fix: require google-api-core>=1.22.2
* chore(deps): update dependency google-cloud-storage to v1.37.1 (#126)
* chore: add constraints file check for python samples (#128)
  This is the sibling PR to https://github.com/GoogleCloudPlatform/python-docs-samples/pull/5611, and this is the issue opened for it: https://github.com/GoogleCloudPlatform/python-docs-samples/issues/5549. If you look at the files in [this example repo](https://github.com/leahecole/testrepo-githubapp/pull/31/files), you'll see that Renovate successfully opened a PR on three constraints files in `samples` directories and subdirectories, and properly ignored `constraints` files at the root level. cc @tswast
  TODO:
  - [x] update renovate to check for samples/constraints.txt dependency updates
  - [x] run lint locally to double-check that I'm not introducing lint errors
  Source-Author: Leah E. Cole <6719667+leahecole@users.noreply.github.com>
  Source-Date: Fri Apr 9 22:50:04 2021 -0700
  Source-Repo: googleapis/synthtool
  Source-Sha: 0a071b3460344886297a304253bf924aa68ddb7e
  Source-Link: https://github.com/googleapis/synthtool/commit/0a071b3460344886297a304253bf924aa68ddb7e
* chore(deps): update dependency google-cloud-storage to v1.38.0 (#136)
* chore(deps): update dependency google-cloud-bigquery to v2.14.0 (#135)
* chore(deps): update dependency mock to v4.0.3 (#132)
* chore(deps): update dependency pytest to v6.2.3 (#133)
* chore(deps): update dependency google-cloud-pubsub to v2.4.2 (#148)
* chore(deps): update dependency pytest to v6.2.4 (#145)
* chore(deps): update dependency google-cloud-bigquery to v2.16.1 (#139)
* chore(deps): update dependency google-cloud-datastore to v2.1.2 (#144)
  Release notes: googleapis/python-datastore [`v2.1.2`](https://togithub.com/googleapis/python-datastore/compare/v2.1.1...v2.1.2) (2021-05-03) and [`v2.1.1`](https://togithub.com/googleapis/python-datastore/compare/v2.1.0...v2.1.1) (2021-04-20)
* chore: new owl bot post processor docker image (#155)
  gcr.io/repo-automation-bots/owlbot-python:latest@sha256:3c3a445b3ddc99ccd5d31edc4b4519729635d20693900db32c4f587ed51f7479
* chore(deps): update dependency google-cloud-datastore to v2.1.3 (#157)
* chore(deps): update dependency google-cloud-bigquery to v2.17.0 (#153)
  Release notes: googleapis/python-bigquery [`v2.17.0`](https://togithub.com/googleapis/python-bigquery/compare/v2.16.1...v2.17.0) (2021-05-21)
  * feat: detect obsolete BQ Storage extra at runtime ([#666](https://www.github.com/googleapis/python-bigquery/issues/666))
  * feat: support parameterized NUMERIC, BIGNUMERIC, STRING, and BYTES types ([#673](https://www.github.com/googleapis/python-bigquery/issues/673))
  * fix(tests): invalid path to strptime() ([#672](https://www.github.com/googleapis/python-bigquery/issues/672))
  Also includes v2.16.1 (2021-05-12):
  * fix: executemany rowcount only reflected the last execution ([#660](https://www.github.com/googleapis/python-bigquery/issues/660))
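For context on the parameterized-type feature from #673, a small sketch; the project, dataset, and table names are hypothetical, and this assumes `SchemaField` accepts the `max_length`/`precision`/`scale` keywords that change introduced:

```python
from google.cloud import bigquery

# Parameterized STRING and NUMERIC columns (assumed keyword names from #673).
schema = [
    bigquery.SchemaField("code", "STRING", max_length=16),
    bigquery.SchemaField("price", "NUMERIC", precision=10, scale=2),
]

client = bigquery.Client()
table = bigquery.Table("my-project.my_dataset.products", schema=schema)
table = client.create_table(table)  # creates the table with parameterized types
```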
* chore(deps): update dependency google-cloud-pubsub to v2.5.0 (#152)
* chore(deps): update dependency google-cloud-bigquery to v2.18.0 (#160)
  Release notes: googleapis/python-bigquery [`v2.18.0`](https://togithub.com/googleapis/python-bigquery/compare/v2.17.0...v2.18.0) (2021-06-02)
  * feat: add support for Parquet options ([#679](https://www.github.com/googleapis/python-bigquery/issues/679))
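A sketch of how the Parquet options surface on a load job, assuming the `ParquetOptions` object from #679; the bucket and table names are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()

parquet_options = bigquery.format_options.ParquetOptions()
parquet_options.enable_list_inference = True  # infer repeated fields from LIST logical type

job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.PARQUET)
job_config.parquet_options = parquet_options

# Hypothetical source URI and destination table.
load_job = client.load_table_from_uri(
    "gs://my-bucket/data.parquet",
    "my-project.my_dataset.my_table",
    job_config=job_config,
)
load_job.result()
```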
* chore(deps): update dependency google-cloud-dlp to v3.1.0 (#159)
  Release notes: googleapis/python-dlp [`v3.1.0`](https://togithub.com/googleapis/python-dlp/compare/v3.0.1...v3.1.0) (2021-05-28)
  * feat: crypto_deterministic_config ([#108](https://www.github.com/googleapis/python-dlp/issues/108)) ([#119](https://www.github.com/googleapis/python-dlp/issues/119))
  * feat: support self-signed JWT flow for service accounts
  * fix: add async client
  * fix: require google-api-core>=1.22.2
  * fix: use correct retry deadlines ([#96](https://www.github.com/googleapis/python-dlp/issues/96))
  Also includes v3.0.1 (2021-01-28):
  * fix: remove gRPC send/recv limits; add enums to `types/__init__.py` ([#89](https://www.github.com/googleapis/python-dlp/issues/89))
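As a sketch of what the crypto_deterministic_config feature enables (this is not the library's own sample; the project id, surrogate name, and the throwaway transient key are illustrative):

```python
from google.cloud import dlp_v2

dlp_client = dlp_v2.DlpServiceClient()
parent = "projects/my-project"  # hypothetical project

# Deterministic encryption: the same input always maps to the same surrogate,
# here using a transient key purely for demonstration.
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {
                "primitive_transformation": {
                    "crypto_deterministic_config": {
                        "crypto_key": {"transient": {"name": "example-key"}},
                        "surrogate_info_type": {"name": "SSN_TOKEN"},
                    }
                }
            }
        ]
    }
}

response = dlp_client.deidentify_content(
    request={
        "parent": parent,
        "deidentify_config": deidentify_config,
        "inspect_config": {"info_types": [{"name": "US_SOCIAL_SECURITY_NUMBER"}]},
        "item": {"value": "My SSN is 372819127"},
    }
)
print(response.item.value)  # the SSN is replaced by an SSN_TOKEN(...) surrogate
```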
* chore(deps): update dependency google-cloud-bigquery to v2.20.0 (#161)
* chore(deps): update dependency google-cloud-dlp to v3.1.1 (#165)
  Release notes: googleapis/python-dlp [`v3.1.1`](https://togithub.com/googleapis/python-dlp/compare/v3.1.0...v3.1.1) (2021-06-16)
* chore(deps): update dependency google-cloud-storage to v1.39.0 (#173)
* chore(deps): update dependency google-cloud-pubsub to v2.6.0 (#170)
* chore(deps): update dependency google-cloud-storage to v1.40.0 (#178)
  Release notes: googleapis/python-storage [`v1.40.0`](https://togithub.com/googleapis/python-storage/compare/v1.39.0...v1.40.0) (2021-06-30)
  * feat: add preconditions and retry configuration to blob.create_resumable_upload_session ([#484](https://www.github.com/googleapis/python-storage/issues/484))
  * feat: add public access prevention to bucket IAM configuration ([#304](https://www.github.com/googleapis/python-storage/issues/304))
  * fix: replace default retry for upload operations ([#480](https://www.github.com/googleapis/python-storage/issues/480))
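Of these, public access prevention is the most user-visible change. A minimal sketch of enforcing it on a bucket, assuming the `PUBLIC_ACCESS_PREVENTION_ENFORCED` constant this release introduced; the bucket name is hypothetical:

```python
from google.cloud import storage
from google.cloud.storage.constants import PUBLIC_ACCESS_PREVENTION_ENFORCED

client = storage.Client()
bucket = client.get_bucket("my-bucket")  # hypothetical bucket

# Set public access prevention on the bucket's IAM configuration, then
# persist the change with a PATCH request.
bucket.iam_configuration.public_access_prevention = PUBLIC_ACCESS_PREVENTION_ENFORCED
bucket.patch()
```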
* chore(deps): update dependency google-cloud-datastore to v2.1.4 (#181)
* chore(deps): update dependency google-cloud-pubsub to v2.6.1 (#182)
* chore(deps): update dependency backoff to v1.11.0 (#183)
  Release notes: litl/backoff [`v1.11.0`](https://togithub.com/litl/backoff/compare/v1.10.0...v1.11.0) (2021-07-12)
  * Configurable logging levels for backoff and giveup events
  * Minor documentation fixes
  * Note: this will be the final Python 2.7-compatible release.
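A sketch of the configurable logging levels, assuming they are exposed as the `backoff_log_level` and `giveup_log_level` keyword arguments (names inferred from the changelog entry, not verified against this release):

```python
import logging

import backoff
import requests

# Assumed kwargs from the v1.11.0 "configurable logging levels" change:
# demote per-retry messages to DEBUG, keep the final giveup at ERROR.
@backoff.on_exception(
    backoff.expo,
    requests.exceptions.RequestException,
    max_tries=5,
    backoff_log_level=logging.DEBUG,
    giveup_log_level=logging.ERROR,
)
def fetch(url):
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp
```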
* chore(deps): update dependency google-cloud-dlp to v3.2.0 (#184)
  Release notes: googleapis/python-dlp [`v3.2.0`](https://togithub.com/googleapis/python-dlp/compare/v3.1.1...v3.2.0) (2021-07-12)
  * feat: add always_use_jwt_access ([#172](https://www.github.com/googleapis/python-dlp/issues/172))
  * fix: disable always_use_jwt_access ([#177](https://www.github.com/googleapis/python-dlp/issues/177))
  * docs: omit mention of Python 2.7 in 'CONTRIBUTING.rst' ([#1127](https://www.github.com/googleapis/python-dlp/issues/1127)) ([#166](https://www.github.com/googleapis/python-dlp/issues/166))
  Also includes v3.1.1 (2021-06-16):
  * fix(deps): add packaging requirement ([#162](https://www.github.com/googleapis/python-dlp/issues/162))
* chore(deps): update dependency google-cloud-bigquery to v2.21.0 (#203)
  Release notes: googleapis/python-bigquery [`v2.21.0`](https://togithub.com/googleapis/python-bigquery/compare/v2.20.0...v2.21.0) (2021-07-12)
  * feat: add max_results parameter to some of the `QueryJob` methods ([#698](https://www.github.com/googleapis/python-bigquery/issues/698))
  * feat: add support for decimal target types ([#735](https://www.github.com/googleapis/python-bigquery/issues/735))
  * feat: add support for table snapshots ([#740](https://www.github.com/googleapis/python-bigquery/issues/740))
  * feat: enable unsetting policy tags on schema fields ([#703](https://www.github.com/googleapis/python-bigquery/issues/703))
  * feat: make it easier to disable best-effort deduplication with streaming inserts ([#734](https://www.github.com/googleapis/python-bigquery/issues/734))
  * feat: support passing struct data to the DB API ([#718](https://www.github.com/googleapis/python-bigquery/issues/718))
  * fix: inserting non-finite floats with `insert_rows()` ([#728](https://www.github.com/googleapis/python-bigquery/issues/728))
  * fix: use `pandas` function to check for `NaN` ([#750](https://www.github.com/googleapis/python-bigquery/issues/750))
  * docs: add docs for all enums in module ([#745](https://www.github.com/googleapis/python-bigquery/issues/745))
  * docs: omit mention of Python 2.7 in `CONTRIBUTING.rst` ([#706](https://www.github.com/googleapis/python-bigquery/issues/706))
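The max_results addition is handy when you only need a preview of a query's output. A minimal sketch against a public dataset; which `QueryJob` methods gained the parameter beyond `result()` is not spelled out here:

```python
from google.cloud import bigquery

client = bigquery.Client()
job = client.query(
    "SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` LIMIT 100"
)

# max_results caps how many rows are fetched from the finished job,
# regardless of how many the query produced.
for row in job.result(max_results=10):
    print(row["name"])
```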
* chore(deps): update dependency backoff to v1.11.1 (#205)
  Release notes: litl/backoff [`v1.11.1`](https://togithub.com/litl/backoff/compare/v1.11.0...v1.11.1) (2021-07-14)
  * Update `__version__` in backoff module
* chore(deps): update dependency google-cloud-storage to v1.41.0 (#204)
  Release notes: googleapis/python-storage [`v1.41.0`](https://togithub.com/googleapis/python-storage/compare/v1.40.0...v1.41.0) (2021-07-13)
  * feat: add support for Etag headers on reads ([#489](https://www.github.com/googleapis/python-storage/issues/489))
  * fix(deps): update minimum dependency versions to pick up bugfixes ([#496](https://www.github.com/googleapis/python-storage/issues/496), closes [#494](https://www.github.com/googleapis/python-storage/issues/494))
  * fix: populate etag / generation / metageneration properties during download ([#488](https://www.github.com/googleapis/python-storage/issues/488))
  * fix: revise and rename is_etag_in_json(data) ([#483](https://www.github.com/googleapis/python-storage/issues/483))
* chore(deps): update dependency google-cloud-bigquery to v2.22.0 (#208)
  Release notes: googleapis/python-bigquery [`v2.22.0`](https://togithub.com/googleapis/python-bigquery/compare/v2.21.0...v2.22.0) (2021-07-19)
  * feat: add `LoadJobConfig.projection_fields` to select DATASTORE_BACKUP fields ([#736](https://www.github.com/googleapis/python-bigquery/issues/736))
  * feat: add standard sql table type, update scalar type enums ([#777](https://www.github.com/googleapis/python-bigquery/issues/777))
  * feat: add support for more detailed DML stats ([#758](https://www.github.com/googleapis/python-bigquery/issues/758))
  * feat: add support for user defined Table View Functions ([#724](https://www.github.com/googleapis/python-bigquery/issues/724))
  * fix: avoid possible job already exists error ([#751](https://www.github.com/googleapis/python-bigquery/issues/751))
  * deps: allow 2.x versions of `google-api-core`, `google-cloud-core`, `google-resumable-media` ([#770](https://www.github.com/googleapis/python-bigquery/issues/770))
  * docs: add loading data from Firestore backup sample ([#737](https://www.github.com/googleapis/python-bigquery/issues/737))
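A sketch of the detailed DML stats, assuming they surface as a `dml_stats` property on the finished query job (the property name and fields are inferred from #758, not confirmed here); the table is hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()
# Any DML statement works; this one intentionally matches no rows.
job = client.query("UPDATE `my-project.my_dataset.users` SET active = FALSE WHERE FALSE")
job.result()

# Assumed surface for #758: per-statement row counts on the job.
stats = job.dml_stats
if stats is not None:
    print(stats.inserted_row_count, stats.updated_row_count, stats.deleted_row_count)
```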
* chore(deps): update dependency google-cloud-storage to v1.41.1 (#212)
* feat: add Samples section to CONTRIBUTING.rst (#210)
  Source-Link: https://github.com/googleapis/synthtool/commit/52e4e46eff2a0b70e3ff5506a02929d089d077d4
  Post-Processor: gcr.io/repo-automation-bots/owlbot-python:latest@sha256:6186535cbdbf6b9fe61f00294929221d060634dae4a0795c1cefdbc995b2d605
* chore(deps): update dependency google-cloud-datastore to v2.1.5 (#213)
  Release notes: googleapis/python-datastore [`v2.1.5`](https://togithub.com/googleapis/python-datastore/compare/v2.1.4...v2.1.5) (2021-07-20)
* chore(deps): update dependency google-cloud-dlp to v3.2.1 (#214)
  Release notes: googleapis/python-dlp [`v3.2.1`](https://togithub.com/googleapis/python-dlp/compare/v3.2.0...v3.2.1) (2021-07-21)
* chore(deps): update dependency google-cloud-bigquery to v2.22.1 (#216)
* chore(deps): update dependency google-cloud-bigquery to v2.23.0 (#221)
* chore(deps): update dependency google-cloud-datastore to v2.1.6 (#220)
  Release notes: googleapis/python-datastore [`v2.1.6`](https://togithub.com/googleapis/python-datastore/compare/v2.1.5...v2.1.6) (2021-07-26)
* chore(deps): update dependency google-cloud-dlp to v3.2.2 (#223)
* chore(deps): update dependency google-cloud-pubsub to v2.7.0 (#225)
* chore(deps): update dependency google-cloud-bigquery to v2.23.1 (#224)
* chore(deps): update dependency google-cloud-bigquery to v2.23.2 (#226)
* chore(deps): update dependency google-cloud-bigquery to v2.23.3 (#228)
* chore: fix INSTALL_LIBRARY_FROM_SOURCE in noxfile.py (#229)
  Source-Link: https://github.com/googleapis/synthtool/commit/6252f2cd074c38f37b44abe5e96d128733eb1b61
  Post-Processor: gcr.io/repo-automation-bots/owlbot-python:latest@sha256:50e35228649c47b6ca82aa0be3ff9eb2afce51c82b66c4a03fe4afeb5ff6c0fc
* chore(deps): update dependency google-cloud-bigquery to v2.24.0 (#232)
* chore(deps): update dependency google-cloud-storage to v1.42.0 (#231)
  Release notes: googleapis/python-storage [`v1.42.0`](https://togithub.com/googleapis/python-storage/compare/v1.41.1...v1.42.0) (2021-08-05)
  * feat: add `page_size` parameter to `Bucket.list_blobs` and `list_buckets` ([#520](https://www.github.com/googleapis/python-storage/issues/520))
  * fix(deps): add explicit ranges for `google-api-core` and `google-auth` ([#530](https://www.github.com/googleapis/python-storage/issues/530))
  * fix: downloading no longer marks metadata fields as 'changed' ([#523](https://www.github.com/googleapis/python-storage/issues/523))
  * fix: make `requests.exceptions.ChunkedEncodingError` retryable by default ([#526](https://www.github.com/googleapis/python-storage/issues/526))
  * docs: update supported / removed Python versions in README ([#519](https://www.github.com/googleapis/python-storage/issues/519))
  Also includes v1.41.1 (2021-07-20):
  * fix(deps): pin `{api,cloud}-core`, `auth` to allow 2.x versions on Python 3 ([#512](https://www.github.com/googleapis/python-storage/issues/512))
  * fix: remove trailing commas from error message constants ([#505](https://www.github.com/googleapis/python-storage/issues/505), closes [#501](https://www.github.com/googleapis/python-storage/issues/501))
  * docs: replace usage of deprecated function `download_as_string` in docs ([#508](https://www.github.com/googleapis/python-storage/issues/508))
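The new `page_size` knob controls how many results each underlying list request returns; iteration still yields every blob. A minimal sketch with a hypothetical bucket:

```python
from google.cloud import storage

client = storage.Client()

# page_size caps results per HTTP page; the iterator transparently
# fetches subsequent pages as needed.
for blob in client.list_blobs("my-bucket", page_size=50):
    print(blob.name)
```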
* chore: drop mention of Python 2.7 from templates (#233)
  Source-Link: https://github.com/googleapis/synthtool/commit/facee4cc1ea096cd8bcc008bb85929daa7c414c0
  Post-Processor: gcr.io/repo-automation-bots/owlbot-python:latest@sha256:9743664022bd63a8084be67f144898314c7ca12f0a03e422ac17c733c129d803
  Co-authored-by: Owl Bot
* chore(deps): update dependency google-cloud-pubsub to v2.7.1 (#235)
  Co-authored-by: Anthonios Partheniou
* chore(deps): update dependency google-cloud-bigquery to v2.25.1 (#234)
  Release notes: googleapis/python-bigquery [`v2.25.1`](https://togithub.com/googleapis/python-bigquery/compare/v2.25.0...v2.25.1) (2021-08-25) and [`v2.25.0`](https://togithub.com/googleapis/python-bigquery/compare/v2.24.1...v2.25.0) (2021-08-24)
  * feat: support using GeoPandas for GEOGRAPHY columns ([#848](https://www.github.com/googleapis/python-bigquery/issues/848))
  Also includes v2.24.1 (2021-08-13):
  * fix: remove pytz dependency and require pyarrow>=3.0.0 ([#875](https://www.github.com/googleapis/python-bigquery/issues/875))
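A sketch of the GeoPandas integration, assuming a `to_geodataframe()` method on the query job is the surface this feature added and that the optional geopandas extra is installed; the table and its GEOGRAPHY column are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Requires `pip install google-cloud-bigquery[geopandas]`.
sql = "SELECT name, location FROM `my-project.my_dataset.places`"
gdf = client.query(sql).to_geodataframe()  # GEOGRAPHY column becomes the geometry
print(len(gdf))
```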
* chore(deps): update dependency pytest to v6.2.5 (#242)
* chore(deps): update dependency google-cloud-bigquery to v2.25.2 (#243)
* chore(deps): update dependency google-cloud-bigquery to v2.26.0 (#246)
* chore(deps): update dependency google-cloud-pubsub to v2.8.0 (#248)
  Release notes: googleapis/python-pubsub [`v2.8.0`](https://togithub.com/googleapis/python-pubsub/compare/v2.7.1...v2.8.0) (2021-09-02)
  * feat: using a closed subscriber as a context manager now raises an error ([#488](https://www.github.com/googleapis/python-pubsub/issues/488))
  * docs: clarify the types of Message parameters ([#486](https://www.github.com/googleapis/python-pubsub/issues/486))
  Also includes v2.7.1 (2021-08-13):
  * fix: remove dependency on pytz ([#472](https://www.github.com/googleapis/python-pubsub/issues/472))
* chore(deps): update dependency google-cloud-storage to v1.42.1 (#250)
* chore: blacken samples noxfile template (#251)
* chore(deps): update dependency google-cloud-storage to v1.42.2 (#252)
  Release notes: googleapis/python-storage [`v1.42.2`](https://togithub.com/googleapis/python-storage/compare/v1.42.1...v1.42.2) (2021-09-16)
* chore(deps): update all dependencies (#260)
* chore: fail samples nox session if python version is missing (#263)
* chore(deps): update dependency google-cloud-bigquery to v2.28.0 (#264)
* chore(deps): update dependency google-cloud-storage to v1.42.3 (#265)
* samples: increase timeout, catch concurrent.futures.TimeoutError (#266)
  * samples: catch concurrent.futures.TimeoutError
  * fix: use default timeout in samples file
* chore(deps): update dependency google-cloud-dlp to v3.2.4 (#269)
* chore(python): Add kokoro configs for python 3.10 samples testing (#273)
* chore(deps): update all dependencies (#271)
  Co-authored-by: Anthonios Partheniou
* chore(deps): update all dependencies (#283)
* chore(deps): update all dependencies (#284)
  Co-authored-by: Owl Bot
* chore(deps): update dependency google-cloud-dlp to v3.3.1 (#287)
* chore(deps): update dependency google-cloud-datastore to v2.4.0 (#288)
* chore(deps): update dependency google-cloud-pubsub to v2.9.0 (#290)
* chore(python): run blacken session for all directories with a noxfile (#291)
  Source-Link: https://github.com/googleapis/synthtool/commit/bc0de6ee2489da6fb8eafd021a8c58b5cc30c947
  Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:39ad8c0570e4f5d2d3124a509de4fe975e799e2b97e0f58aed88f8880d5a8b60
  * lint
  Co-authored-by: Owl Bot
  Co-authored-by: Anthonios Partheniou
* chore(deps): update dependency google-cloud-storage to v1.43.0 (#294)
* chore(deps): update dependency google-cloud-bigquery to v2.31.0 (#295)
  Co-authored-by: Anthonios Partheniou
* chore(samples): Add check for tests in directory (#308)
  Source-Link: https://github.com/googleapis/synthtool/commit/52aef91f8d25223d9dbdb4aebd94ba8eea2101f3
  Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:36a95b8f494e4674dc9eee9af98961293b51b86b3649942aac800ae6c1f796d4
  Co-authored-by: Owl Bot
* chore(deps): update dependency google-cloud-dlp to v3.4.0 (#298)
  Co-authored-by: Anthonios Partheniou
* chore(deps): update all dependencies (#312)
  Co-authored-by: Anthonios Partheniou
* chore(deps): update dependency google-cloud-dlp to v3.5.0 (#314)
* chore(python): Noxfile recognizes that tests can live in a folder (#315)
  Source-Link: https://github.com/googleapis/synthtool/commit/4760d8dce1351d93658cb11d02a1b7ceb23ae5d7
  Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:f0e4b51deef56bed74d3e2359c583fc104a8d6367da3984fc5c66938db738828
  Co-authored-by: Owl Bot
* chore(deps): update dependency google-cloud-storage to v2.1.0 (#317)
  * add pin for google-cloud-storage for py36
  * revert pin for py36
  Co-authored-by: Anthonios Partheniou
* chore(deps): update dependency google-cloud-dlp to v3.6.0 (#323) * chore: use gapic-generator-python 0.63.2 (#329) * chore: use gapic-generator-python 0.63.2 docs: add generated snippets PiperOrigin-RevId: 427792504 Source-Link: https://github.com/googleapis/googleapis/commit/55b9e1e0b3106c850d13958352bc0751147b6b15 Source-Link: https://github.com/googleapis/googleapis-gen/commit/bf4e86b753f42cb0edb1fd51fbe840d7da0a1cde Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiYmY0ZTg2Yjc1M2Y0MmNiMGVkYjFmZDUxZmJlODQwZDdkYTBhMWNkZSJ9 * 🦉 Updates from OwlBot See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot * docs(dlp-samples): modified region tags and fixed comment (#330) * docs(dlp-samples): modified region tags and fixed comment * lint fix * chore(deps): update all dependencies (#328) * chore: use gapic-generator-python 0.63.4 (#331) * chore: use gapic-generator-python 0.63.4 chore: fix snippet region tag format chore: fix docstring code block formatting PiperOrigin-RevId: 430730865 Source-Link: https://github.com/googleapis/googleapis/commit/ea5800229f73f94fd7204915a86ed09dcddf429a Source-Link: https://github.com/googleapis/googleapis-gen/commit/ca893ff8af25fc7fe001de1405a517d80446ecca Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiY2E4OTNmZjhhZjI1ZmM3ZmUwMDFkZTE0MDVhNTE3ZDgwNDQ2ZWNjYSJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * chore: delete duplicates Co-authored-by: Owl Bot Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com> Co-authored-by: Anthonios Partheniou * chore: update copyright year to 2022 (#332) * chore: update copyright year to 2022 PiperOrigin-RevId: 431037888 Source-Link: https://github.com/googleapis/googleapis/commit/b3397f5febbf21dfc69b875ddabaf76bee765058 Source-Link: https://github.com/googleapis/googleapis-gen/commit/510b54e1cdefd53173984df16645081308fe897e Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNTEwYjU0ZTFjZGVmZDUzMTczOTg0ZGYxNjY0NTA4MTMwOGZlODk3ZSJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot * chore(deps): update all dependencies (#334) * chore(deps): update all dependencies * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot * chore(deps): update all dependencies (#340) * chore(deps): update all dependencies * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot * chore: Adding support for pytest-xdist and pytest-parallel (#343) Source-Link: https://github.com/googleapis/synthtool/commit/82f5cb283efffe96e1b6cd634738e0e7de2cd90a Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:5d8da01438ece4021d135433f2cf3227aa39ef0eaccc941d62aa35e6902832ae Co-authored-by: Owl Bot * chore(deps): update dependency google-cloud-pubsub to v2.10.0 (#346) * chore(deps): update all dependencies (#347) * chore(deps): update dependency google-cloud-storage to v2.2.0 (#350) * chore(deps): update 
all dependencies (#351) * chore(python): use black==22.3.0 (#359) Source-Link: https://github.com/googleapis/synthtool/commit/6fab84af09f2cf89a031fd8671d1def6b2931b11 Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:7cffbc10910c3ab1b852c05114a08d374c195a81cdec1d4a67a1d129331d0bfe * chore(deps): update dependency google-cloud-bigquery to v3 (#360) * chore(deps): update dependency google-cloud-pubsub to v2.12.0 (#369) * chore(deps): update dependency google-cloud-storage to v2.3.0 (#373) * chore: use gapic-generator-python 0.65.1 (#374) * chore: use gapic-generator-python 0.65.1 PiperOrigin-RevId: 441524537 Source-Link: https://github.com/googleapis/googleapis/commit/2a273915b3f70fe86c9d2a75470a0b83e48d0abf Source-Link: https://github.com/googleapis/googleapis-gen/commit/ab6756a48c89b5bcb9fb73443cb8e55d574f4643 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiYWI2NzU2YTQ4Yzg5YjViY2I5ZmI3MzQ0M2NiOGU1NWQ1NzRmNDY0MyJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot Co-authored-by: Anthonios Partheniou * chore(python): use ubuntu 22.04 in docs image (#377) Source-Link: https://github.com/googleapis/synthtool/commit/f15cc72fb401b4861cedebb10af74afe428fb1f8 Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:bc5eed3804aec2f05fad42aacf973821d9500c174015341f721a984a0825b6fd * chore(deps): update dependency pytest to v7.1.2 (#378) * chore(deps): update dependency backoff to v2 (#379) * chore(deps): update dependency backoff to v2.0.1 (#381) * chore(deps): update dependency google-cloud-bigquery to v3.1.0 (#385) * chore(deps): update dependency google-cloud-pubsub to v2.12.1 (#386) * chore(deps): update all dependencies (#388) * chore(deps): update dependency google-cloud-datastore to v2.6.1 (#394) * fix: require python 3.7+ (#411) * chore(python): drop python 3.6 Source-Link: https://github.com/googleapis/synthtool/commit/4f89b13af10d086458f9b379e56a614f9d6dab7b Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:e7bb19d47c13839fe8c147e50e02e8b6cf5da8edd1af8b82208cd6f66cc2829c * add api_description to .repo-metadata.json * require python 3.7+ in setup.py * remove python 3.6 sample configs * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * trigger CI Co-authored-by: Owl Bot Co-authored-by: Anthonios Partheniou * chore(deps): update all dependencies (#400) * chore(deps): update all dependencies * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * revert Co-authored-by: Owl Bot Co-authored-by: Anthonios Partheniou * chore(deps): update all dependencies (#416) * chore(deps): update all dependencies * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * revert Co-authored-by: Owl Bot Co-authored-by: Anthonios Partheniou * chore(deps): update all dependencies (#418) * chore(deps): update all dependencies * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * revert Co-authored-by: Owl Bot Co-authored-by: Anthonios Partheniou * chore(deps): update all dependencies (#419) This PR contains the following
updates: actions/setup-python (action, major): `v3` -> `v4`; google-cloud-bigquery (patch): `==3.3.0` -> `==3.3.1`; protobuf (major): `>= 3.19.0, <4.0.0dev` -> `>=4.21.4, <4.22.0`. --- ### Release Notes
actions/setup-python ### [`v4`](https://togithub.com/actions/setup-python/compare/v3...v4) [Compare Source](https://togithub.com/actions/setup-python/compare/v3...v4)
googleapis/python-bigquery ### [`v3.3.1`](https://togithub.com/googleapis/python-bigquery/blob/HEAD/CHANGELOG.md#331-httpsgithubcomgoogleapispython-bigquerycomparev330v331-2022-08-09) ##### Bug Fixes - **deps:** allow pyarrow < 10 ([#1304](https://togithub.com/googleapis/python-bigquery/issues/1304)) ([13616a9](https://togithub.com/googleapis/python-bigquery/commit/13616a910ba2e9b7bc3595847229b56e70c99f84))
* chore(deps): update dependency google-cloud-pubsub to v2.13.5 (#421) * chore(deps): update dependency google-cloud-pubsub to v2.13.6 (#424) * chore(deps): update all dependencies (#426) * chore(deps): update dependency google-cloud-bigquery to v3.3.2 (#427) * chore: Bump gapic-generator-python version to 1.3.0 (#440) PiperOrigin-RevId: 472561635 Source-Link: https://github.com/googleapis/googleapis/commit/332ecf599f8e747d8d1213b77ae7db26eff12814 Source-Link: https://github.com/googleapis/googleapis-gen/commit/4313d682880fd9d7247291164d4e9d3d5bd9f177 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNDMxM2Q2ODI4ODBmZDlkNzI0NzI5MTE2NGQ0ZTlkM2Q1YmQ5ZjE3NyJ9 * chore(deps): update all dependencies (#437) Co-authored-by: Anthonios Partheniou * chore: detect samples tests in nested directories (#443) Source-Link: https://github.com/googleapis/synthtool/commit/50db768f450a50d7c1fd62513c113c9bb96fd434 Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:e09366bdf0fd9c8976592988390b24d53583dd9f002d476934da43725adbb978 * chore(deps): update all dependencies (#444) Co-authored-by: Anthonios Partheniou * chore(deps): update all dependencies (#449) * chore(deps): update dependency backoff to v2.2.1 (#450) * chore(deps): update all dependencies (#453) * chore(deps): update dependency google-cloud-datastore to v2.9.0 (#454) * chore(deps): update dependency pytest to v7.2.0 (#455) * chore(deps): update all dependencies (#457) * chore(python): drop flake8-import-order in samples noxfile [autoapprove] (#461) Source-Link: https://togithub.com/googleapis/synthtool/commit/6ed3a831cb9ff69ef8a504c353e098ec0192ad93 Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:3abfa0f1886adaf0b83f07cb117b24a639ea1cb9cffe56d43280b977033563eb * chore: Update gapic-generator-python to v1.6.1 (#456) * chore: update to gapic-generator-python 1.5.0 feat: add support for `google.cloud..__version__` PiperOrigin-RevId: 484665853 Source-Link: https://github.com/googleapis/googleapis/commit/8eb249a19db926c2fbc4ecf1dc09c0e521a88b22 Source-Link: https://github.com/googleapis/googleapis-gen/commit/c8aa327b5f478865fc3fd91e3c2768e54e26ad44 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiYzhhYTMyN2I1ZjQ3ODg2NWZjM2ZkOTFlM2MyNzY4ZTU0ZTI2YWQ0NCJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * update version in gapic_version.py * add .release-please-manifest.json with correct version * add owlbot.py to exclude generated gapic_version.py * set manifest to true in .github/release-please.yml * add release-please-config.json * chore: Update to gapic-generator-python 1.6.0 feat(python): Add typing to proto.Message based class
attributes feat(python): Snippetgen handling of repeated enum field PiperOrigin-RevId: 487326846 Source-Link: https://github.com/googleapis/googleapis/commit/da380c77bb87ba0f752baf07605dd1db30e1f7e1 Source-Link: https://github.com/googleapis/googleapis-gen/commit/61ef5762ee6731a0cbbfea22fd0eecee51ab1c8e Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNjFlZjU3NjJlZTY3MzFhMGNiYmZlYTIyZmQwZWVjZWU1MWFiMWM4ZSJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * feat: new APIs added to reflect updates to the filestore service - Add ENTERPRISE Tier - Add snapshot APIs: RevertInstance, ListSnapshots, CreateSnapshot, DeleteSnapshot, UpdateSnapshot - Add multi-share APIs: ListShares, GetShare, CreateShare, DeleteShare, UpdateShare - Add ConnectMode to NetworkConfig (for Private Service Access support) - New status codes (SUSPENDED/SUSPENDING, REVERTING/RESUMING) - Add SuspensionReason (for KMS related suspension) - Add new fields to Instance information: max_capacity_gb, capacity_step_size_gb, max_share_count, capacity_gb, multi_share_enabled PiperOrigin-RevId: 487492758 Source-Link: https://github.com/googleapis/googleapis/commit/5be5981f50322cf0c7388595e0f31ac5d0693469 Source-Link: https://github.com/googleapis/googleapis-gen/commit/ab0e217f560cc2c1afc11441c2eab6b6950efd2b Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiYWIwZTIxN2Y1NjBjYzJjMWFmYzExNDQxYzJlYWI2YjY5NTBlZmQyYiJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * feat: ExcludeByHotword added as an ExclusionRule type, NEW_ZEALAND added as a LocationCategory value PiperOrigin-RevId: 487581128 Source-Link: https://github.com/googleapis/googleapis/commit/9140e5546470d945fc741f27707aa68f562088f0 Source-Link: https://github.com/googleapis/googleapis-gen/commit/502b50e61710bca3d774cb918314cb1ef39e6fe9 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNTAyYjUwZTYxNzEwYmNhM2Q3NzRjYjkxODMxNGNiMWVmMzllNmZlOSJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * update path to snippet metadata json * chore: Update gapic-generator-python to v1.6.1 PiperOrigin-RevId: 488036204 Source-Link: https://github.com/googleapis/googleapis/commit/08f275f5c1c0d99056e1cb68376323414459ee19 Source-Link: https://github.com/googleapis/googleapis-gen/commit/555c0945e60649e38739ae64bc45719cdf72178f Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNTU1YzA5NDVlNjA2NDllMzg3MzlhZTY0YmM0NTcxOWNkZjcyMTc4ZiJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot Co-authored-by: Anthonios Partheniou * chore(deps): update all dependencies (#463) Co-authored-by: Anthonios Partheniou * chore(deps): update dependency google-cloud-bigquery to v3.4.1 (#466) * chore(main): release 3.10.0 (#462) Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com> Co-authored-by: Anthonios Partheniou * chore(deps): update dependency google-cloud-dlp to v3.10.0 (#467) * docs(samples): Adding a missing line as suggested by feedback (#469) * chore(main): release 3.10.1 (#470) Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com> * chore(deps): update all dependencies (#468) Co-authored-by: Anthonios Partheniou * chore(deps): update dependency google-cloud-dlp to 
v3.10.1 (#471) * chore(python): add support for python 3.11 (#472) Source-Link: https://github.com/googleapis/synthtool/commit/7197a001ffb6d8ce7b0b9b11c280f0c536c1033a Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:c43f1d918bcf817d337aa29ff833439494a158a0831508fda4ec75dc4c0d0320 Co-authored-by: Owl Bot * chore(deps): update dependency mock to v5.0.1 (#473) * feat: Add support for python 3.11 (#474) * feat: Add support for python 3.11 chore: Update gapic-generator-python to v1.8.0 PiperOrigin-RevId: 500768693 Source-Link: https://github.com/googleapis/googleapis/commit/190b612e3d0ff8f025875a669e5d68a1446d43c1 Source-Link: https://github.com/googleapis/googleapis-gen/commit/7bf29a414b9ecac3170f0b65bdc2a95705c0ef1a Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiN2JmMjlhNDE0YjllY2FjMzE3MGYwYjY1YmRjMmE5NTcwNWMwZWYxYSJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md Co-authored-by: Owl Bot * chore(main): release 3.11.0 (#475) Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com> * chore(deps): update all dependencies (#476) * chore(deps): update dependency pytest to v7.2.1 (#477) * chore(deps): update dependency google-cloud-datastore to v2.13.0 (#478) * chore(deps): update dependency google-cloud-bigquery to v3.4.2 (#479) * docs: Add documentation for enums (#481) * docs: Add documentation for enums fix: Add context manager return types chore: Update gapic-generator-python to v1.8.1 PiperOrigin-RevId: 503210727 Source-Link: https://github.com/googleapis/googleapis/commit/a391fd1dac18dfdfa00c18c8404f2c3a6ff8e98e Source-Link: https://github.com/googleapis/googleapis-gen/commit/0080f830dec37c3384157082bce279e37079ea58 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiMDA4MGY4MzBkZWMzN2MzMzg0MTU3MDgyYmNlMjc5ZTM3MDc5ZWE1OCJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * work around docs issue Co-authored-by: Owl Bot Co-authored-by: Anthonios Partheniou * chore(main): release 3.11.1 (#482) Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com> * chore: Update gapic-generator-python to v1.8.2 (#483) * chore: Update gapic-generator-python to v1.8.2 PiperOrigin-RevId: 504289125 Source-Link: https://github.com/googleapis/googleapis/commit/38a48a44a44279e9cf9f2f864b588958a2d87491 Source-Link: https://github.com/googleapis/googleapis-gen/commit/b2dc22663dbe47a972c8d8c2f8a4df013dafdcbc Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiYjJkYzIyNjYzZGJlNDdhOTcyYzhkOGMyZjhhNGRmMDEzZGFmZGNiYyJ9 * 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md * revert Co-authored-by: Owl Bot Co-authored-by: Anthonios Partheniou * chore(deps): update all dependencies (#480) * chore(deps): update dependency google-cloud-bigquery to v3.5.0 (#485) * remove generated samples * adding noxconfig file * updated header files and noxconfig to include py2.7 * set enforce type hints to False * lint fix * delete dlp jobs created in snippets * replaced cancel operation with delete job * increase timeout * skip bigquery test * int fix * updated acc to review comments * add Py27 to ignored versions * Updating test project * updated to use a single project --------- Co-authored-by: arithmetic1728 <58957152+arithmetic1728@users.noreply.github.com> Co-authored-by: Andrew 
Gorcester Co-authored-by: DPE bot Co-authored-by: chenyumic Co-authored-by: Frank Natividad Co-authored-by: Mike DaCosta Co-authored-by: michaelawyu Co-authored-by: mwdaub Co-authored-by: realjordanna <32629229+realjordanna@users.noreply.github.com> Co-authored-by: Ace Co-authored-by: djmailhot Co-authored-by: Charles Engelke Co-authored-by: Maximus Co-authored-by: Averi Kitsch Co-authored-by: Gus Class Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com> Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com> Co-authored-by: WhiteSource Renovate Co-authored-by: Leah Cole Co-authored-by: Takashi Matsuo Co-authored-by: gcf-merge-on-green[bot] <60162190+gcf-merge-on-green[bot]@users.noreply.github.com> Co-authored-by: Bu Sun Kim <8822365+busunkim96@users.noreply.github.com> Co-authored-by: Seth Moore Co-authored-by: Ace Co-authored-by: Seth Moore Co-authored-by: jlmwise <66651702+jlmwise@users.noreply.github.com> Co-authored-by: Xiaohua (Victor) Liang Co-authored-by: Xiaohua (Victor) Liang Co-authored-by: Charles Engelke Co-authored-by: Chris Wilson <46912004+sushicw@users.noreply.github.com> Co-authored-by: Yoshi Automation Bot Co-authored-by: Hil Liao Co-authored-by: gcf-owl-bot[bot] <78513119+gcf-owl-bot[bot]@users.noreply.github.com> Co-authored-by: Owl Bot Co-authored-by: Anthonios Partheniou Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com> Co-authored-by: Maciej Strzelczyk Co-authored-by: Remigiusz Samborski --- .github/header-checker-lint.yml | 3 + dlp/AUTHORING_GUIDE.md | 1 + dlp/CONTRIBUTING.md | 1 + dlp/README.md | 3 - dlp/snippets/custom_infotype.py | 873 ++++++++++++++++ dlp/snippets/custom_infotype_test.py | 162 +++ dlp/snippets/deid.py | 1228 ++++++++++++++++++++++ dlp/snippets/deid_test.py | 291 ++++++ dlp/snippets/inspect_content.py | 1438 ++++++++++++++++++++++++++ dlp/snippets/inspect_content_test.py | 504 +++++++++ dlp/snippets/jobs.py | 160 +++ dlp/snippets/jobs_test.py | 91 ++ dlp/snippets/metadata.py | 72 ++ dlp/snippets/metadata_test.py | 22 + dlp/snippets/noxfile_config.py | 42 + dlp/snippets/quickstart.py | 92 ++ dlp/snippets/quickstart_test.py | 27 + dlp/snippets/redact.py | 259 +++++ dlp/snippets/redact_test.py | 60 ++ dlp/snippets/requirements-test.txt | 4 + dlp/snippets/requirements.txt | 5 + dlp/snippets/resources/accounts.txt | 1 + dlp/snippets/resources/dates.csv | 5 + dlp/snippets/resources/harmless.txt | 1 + dlp/snippets/resources/test.png | Bin 0 -> 21438 bytes dlp/snippets/resources/test.txt | 1 + dlp/snippets/risk.py | 939 +++++++++++++++++ dlp/snippets/risk_test.py | 398 +++++++ dlp/snippets/templates.py | 255 +++++ dlp/snippets/templates_test.py | 60 ++ dlp/snippets/triggers.py | 286 +++++ dlp/snippets/triggers_test.py | 102 ++ 32 files changed, 7383 insertions(+), 3 deletions(-) create mode 100644 dlp/AUTHORING_GUIDE.md create mode 100644 dlp/CONTRIBUTING.md delete mode 100644 dlp/README.md create mode 100644 dlp/snippets/custom_infotype.py create mode 100644 dlp/snippets/custom_infotype_test.py create mode 100644 dlp/snippets/deid.py create mode 100644 dlp/snippets/deid_test.py create mode 100644 dlp/snippets/inspect_content.py create mode 100644 dlp/snippets/inspect_content_test.py create mode 100644 dlp/snippets/jobs.py create mode 100644 dlp/snippets/jobs_test.py create mode 100644 dlp/snippets/metadata.py create mode 100644 dlp/snippets/metadata_test.py create mode 100644 dlp/snippets/noxfile_config.py create mode 100644 dlp/snippets/quickstart.py create mode 
100644 dlp/snippets/quickstart_test.py create mode 100644 dlp/snippets/redact.py create mode 100644 dlp/snippets/redact_test.py create mode 100644 dlp/snippets/requirements-test.txt create mode 100644 dlp/snippets/requirements.txt create mode 100644 dlp/snippets/resources/accounts.txt create mode 100644 dlp/snippets/resources/dates.csv create mode 100644 dlp/snippets/resources/harmless.txt create mode 100644 dlp/snippets/resources/test.png create mode 100644 dlp/snippets/resources/test.txt create mode 100644 dlp/snippets/risk.py create mode 100644 dlp/snippets/risk_test.py create mode 100644 dlp/snippets/templates.py create mode 100644 dlp/snippets/templates_test.py create mode 100644 dlp/snippets/triggers.py create mode 100644 dlp/snippets/triggers_test.py diff --git a/.github/header-checker-lint.yml b/.github/header-checker-lint.yml index 0dcaff7d0df6..8e74f16e8cef 100644 --- a/.github/header-checker-lint.yml +++ b/.github/header-checker-lint.yml @@ -20,6 +20,9 @@ ignoreFiles: - "texttospeech/snippets/resources/hello.txt" - "language/**/resources/*.txt" - "language/snippets/classify_text/resources/texts/*.txt" + - "dlp/snippets/resources/accounts.txt" + - "dlp/snippets/resources/harmless.txt" + - "dlp/snippets/resources/test.txt" ignoreLicenseYear: true diff --git a/dlp/AUTHORING_GUIDE.md b/dlp/AUTHORING_GUIDE.md new file mode 100644 index 000000000000..55c97b32f4c1 --- /dev/null +++ b/dlp/AUTHORING_GUIDE.md @@ -0,0 +1 @@ +See https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/AUTHORING_GUIDE.md \ No newline at end of file diff --git a/dlp/CONTRIBUTING.md b/dlp/CONTRIBUTING.md new file mode 100644 index 000000000000..34c882b6f1a3 --- /dev/null +++ b/dlp/CONTRIBUTING.md @@ -0,0 +1 @@ +See https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/CONTRIBUTING.md \ No newline at end of file diff --git a/dlp/README.md b/dlp/README.md deleted file mode 100644 index df1718bc765b..000000000000 --- a/dlp/README.md +++ /dev/null @@ -1,3 +0,0 @@ -These samples have been moved. - -https://github.com/googleapis/python-dlp/tree/main/samples diff --git a/dlp/snippets/custom_infotype.py b/dlp/snippets/custom_infotype.py new file mode 100644 index 000000000000..4152b233ebe4 --- /dev/null +++ b/dlp/snippets/custom_infotype.py @@ -0,0 +1,873 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Custom infoType snippets. + +This file contains sample code that uses the Data Loss Prevention API to create +custom infoType detectors to refine scan results. +""" + + +# [START dlp_inspect_string_with_exclusion_dict] +def inspect_string_with_exclusion_dict( + project, content_string, exclusion_list=["example@example.com"] +): + """Inspects the provided text, avoiding matches specified in the exclusion list + + Uses the Data Loss Prevention API to omit matches on EMAIL_ADDRESS if they are + in the specified exclusion list. + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. 
+ exclusion_list: The list of strings to ignore matches on + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a list of infoTypes for DLP to locate in `content_string`. See + # https://cloud.google.com/dlp/docs/concepts-infotypes for more information + # about supported infoTypes. + info_types_to_locate = [{"name": "EMAIL_ADDRESS"}] + + # Construct a rule set that will only match on EMAIL_ADDRESS + # if the match text is not in the exclusion list. + rule_set = [ + { + "info_types": info_types_to_locate, + "rules": [ + { + "exclusion_rule": { + "dictionary": {"word_list": {"words": exclusion_list}}, + "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_FULL_MATCH, + } + } + ], + } + ] + + # Construct the configuration dictionary + inspect_config = { + "info_types": info_types_to_locate, + "rule_set": rule_set, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_string_with_exclusion_dict] + + +# [START dlp_inspect_string_with_exclusion_regex] +def inspect_string_with_exclusion_regex( + project, content_string, exclusion_regex=".+@example.com" +): + """Inspects the provided text, avoiding matches specified in the exclusion regex + + Uses the Data Loss Prevention API to omit matches on EMAIL_ADDRESS if they match + the specified exclusion regex. + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + exclusion_regex: The regular expression to exclude matches on + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a list of infoTypes for DLP to locate in `content_string`. See + # https://cloud.google.com/dlp/docs/concepts-infotypes for more information + # about supported infoTypes. + info_types_to_locate = [{"name": "EMAIL_ADDRESS"}] + + # Construct a rule set that will only match on EMAIL_ADDRESS + # if the specified regex doesn't also match. + rule_set = [ + { + "info_types": info_types_to_locate, + "rules": [ + { + "exclusion_rule": { + "regex": {"pattern": exclusion_regex}, + "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_FULL_MATCH, + } + } + ], + } + ] + + # Construct the configuration dictionary + inspect_config = { + "info_types": info_types_to_locate, + "rule_set": rule_set, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. 
+ if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_string_with_exclusion_regex] + + +# [START dlp_inspect_string_with_exclusion_dict_substring] +def inspect_string_with_exclusion_dict_substring( + project, content_string, exclusion_list=["TEST"] +): + """Inspects the provided text, avoiding matches that contain excluded tokens + + Uses the Data Loss Prevention API to omit matches if they include tokens + in the specified exclusion list. + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + exclusion_list: The list of strings to ignore partial matches on + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a list of infoTypes for DLP to locate in `content_string`. See + # https://cloud.google.com/dlp/docs/concepts-infotypes for more information + # about supported infoTypes. + info_types_to_locate = [{"name": "EMAIL_ADDRESS"}, {"name": "DOMAIN_NAME"}] + + # Construct a rule set that will only match if the match text does not + # contain tokens from the exclusion list. + rule_set = [ + { + "info_types": info_types_to_locate, + "rules": [ + { + "exclusion_rule": { + "dictionary": {"word_list": {"words": exclusion_list}}, + "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH, + } + } + ], + } + ] + + # Construct the configuration dictionary + inspect_config = { + "info_types": info_types_to_locate, + "rule_set": rule_set, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_string_with_exclusion_dict_substring] + + +# [START dlp_inspect_string_custom_excluding_substring] +def inspect_string_custom_excluding_substring( + project, content_string, exclusion_list=["jimmy"] +): + """Inspects the provided text with a custom detector, avoiding matches on specific tokens + + Uses the Data Loss Prevention API to omit matches on a custom detector + if they include tokens in the specified exclusion list. + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + exclusion_list: The list of strings to ignore matches on + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client.
+ dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a custom regex detector for names + custom_info_types = [ + { + "info_type": {"name": "CUSTOM_NAME_DETECTOR"}, + "regex": {"pattern": "[A-Z][a-z]{1,15}, [A-Z][a-z]{1,15}"}, + } + ] + + # Construct a rule set that will only match if the match text does not + # contain tokens from the exclusion list. + rule_set = [ + { + "info_types": [{"name": "CUSTOM_NAME_DETECTOR"}], + "rules": [ + { + "exclusion_rule": { + "dictionary": {"word_list": {"words": exclusion_list}}, + "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH, + } + } + ], + } + ] + + # Construct the configuration dictionary + inspect_config = { + "custom_info_types": custom_info_types, + "rule_set": rule_set, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_string_custom_excluding_substring] + + +# [START dlp_inspect_string_custom_omit_overlap] +def inspect_string_custom_omit_overlap(project, content_string): + """Matches PERSON_NAME and a custom detector, + but if they overlap, only matches the custom detector + + Uses the Data Loss Prevention API to omit matches on a built-in detector + if they overlap with matches from a custom detector + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a custom regex detector for names + custom_info_types = [ + { + "info_type": {"name": "VIP_DETECTOR"}, + "regex": {"pattern": "Larry Page|Sergey Brin"}, + "exclusion_type": google.cloud.dlp_v2.CustomInfoType.ExclusionType.EXCLUSION_TYPE_EXCLUDE, + } + ] + + # Construct a rule set that will exclude PERSON_NAME matches + # that overlap with VIP_DETECTOR matches + rule_set = [ + { + "info_types": [{"name": "PERSON_NAME"}], + "rules": [ + { + "exclusion_rule": { + "exclude_info_types": { + "info_types": [{"name": "VIP_DETECTOR"}] + }, + "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_FULL_MATCH, + } + } + ], + } + ] + + # Construct the configuration dictionary + inspect_config = { + "info_types": [{"name": "PERSON_NAME"}], + "custom_info_types": custom_info_types, + "rule_set": rule_set, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results.
+ if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_string_custom_omit_overlap] + + +# [START dlp_omit_name_if_also_email] +def omit_name_if_also_email( + project, + content_string, +): + """Matches PERSON_NAME and EMAIL_ADDRESS, but not both. + + Uses the Data Loss Prevention API to omit matches on PERSON_NAME if the + EMAIL_ADDRESS detector also matches. + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a list of infoTypes for DLP to locate in `content_string`. See + # https://cloud.google.com/dlp/docs/concepts-infotypes for more information + # about supported infoTypes. + info_types_to_locate = [{"name": "PERSON_NAME"}, {"name": "EMAIL_ADDRESS"}] + + # Construct the configuration dictionary that will only match on PERSON_NAME + # if the EMAIL_ADDRESS doesn't also match. This configuration helps reduce + # the total number of findings when there is a large overlap between different + # infoTypes. + inspect_config = { + "info_types": info_types_to_locate, + "rule_set": [ + { + "info_types": [{"name": "PERSON_NAME"}], + "rules": [ + { + "exclusion_rule": { + "exclude_info_types": { + "info_types": [{"name": "EMAIL_ADDRESS"}] + }, + "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH, + } + } + ], + } + ], + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_omit_name_if_also_email] + + +# [START dlp_inspect_string_without_overlap] +def inspect_string_without_overlap(project, content_string): + """Matches EMAIL_ADDRESS and DOMAIN_NAME, but DOMAIN_NAME is omitted + if it overlaps with EMAIL_ADDRESS + + Uses the Data Loss Prevention API to omit matches of one infotype + that overlap with another. + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a list of infoTypes for DLP to locate in `content_string`. See + # https://cloud.google.com/dlp/docs/concepts-infotypes for more information + # about supported infoTypes.
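+ # For example, for "james@example.org" the DOMAIN_NAME finding "example.org" overlaps the EMAIL_ADDRESS finding, so only the email address is reported.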
+ info_types_to_locate = [{"name": "DOMAIN_NAME"}, {"name": "EMAIL_ADDRESS"}] + + # Define a custom info type to exclude email addresses + custom_info_types = [ + { + "info_type": {"name": "EMAIL_ADDRESS"}, + "exclusion_type": google.cloud.dlp_v2.CustomInfoType.ExclusionType.EXCLUSION_TYPE_EXCLUDE, + } + ] + + # Construct a rule set that will exclude DOMAIN_NAME matches + # that overlap with EMAIL_ADDRESS matches + rule_set = [ + { + "info_types": [{"name": "DOMAIN_NAME"}], + "rules": [ + { + "exclusion_rule": { + "exclude_info_types": { + "info_types": [{"name": "EMAIL_ADDRESS"}] + }, + "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH, + } + } + ], + } + ] + + # Construct the configuration dictionary + inspect_config = { + "info_types": info_types_to_locate, + "custom_info_types": custom_info_types, + "rule_set": rule_set, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_string_without_overlap] + + +# [START dlp_inspect_with_person_name_w_custom_hotword] +def inspect_with_person_name_w_custom_hotword( + project, content_string, custom_hotword="patient" +): + """Uses the Data Loss Prevention API to increase likelihood for matches on + PERSON_NAME if the user-specified custom hotword is present. Only + includes findings with the increased likelihood by setting a minimum + likelihood threshold of VERY_LIKELY. + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + custom_hotword: The custom hotword used for likelihood boosting. + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a rule set with the caller-provided hotword, with a likelihood + # boost to VERY_LIKELY when the hotword is present within the 50-character + # window preceding the PII finding. + hotword_rule = { + "hotword_regex": {"pattern": custom_hotword}, + "likelihood_adjustment": { + "fixed_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY + }, + "proximity": {"window_before": 50}, + } + + rule_set = [ + { + "info_types": [{"name": "PERSON_NAME"}], + "rules": [{"hotword_rule": hotword_rule}], + } + ] + + # Construct the configuration dictionary with the custom regex info type. + inspect_config = { + "rule_set": rule_set, + "min_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results.
+ if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_with_person_name_w_custom_hotword] + + +# [START dlp_inspect_string_multiple_rules] +def inspect_string_multiple_rules(project, content_string): + """Uses the Data Loss Prevention API to modify likelihood for matches on + PERSON_NAME by combining multiple hotword and exclusion rules. + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct hotword rules + patient_rule = { + "hotword_regex": {"pattern": "patient"}, + "proximity": {"window_before": 10}, + "likelihood_adjustment": { + "fixed_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY + }, + } + doctor_rule = { + "hotword_regex": {"pattern": "doctor"}, + "proximity": {"window_before": 10}, + "likelihood_adjustment": { + "fixed_likelihood": google.cloud.dlp_v2.Likelihood.UNLIKELY + }, + } + + # Construct exclusion rules + quasimodo_rule = { + "dictionary": {"word_list": {"words": ["quasimodo"]}}, + "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH, + } + redacted_rule = { + "regex": {"pattern": "REDACTED"}, + "matching_type": google.cloud.dlp_v2.MatchingType.MATCHING_TYPE_PARTIAL_MATCH, + } + + # Construct the rule set, combining the above rules + rule_set = [ + { + "info_types": [{"name": "PERSON_NAME"}], + "rules": [ + {"hotword_rule": patient_rule}, + {"hotword_rule": doctor_rule}, + {"exclusion_rule": quasimodo_rule}, + {"exclusion_rule": redacted_rule}, + ], + } + ] + + # Construct the configuration dictionary + inspect_config = { + "info_types": [{"name": "PERSON_NAME"}], + "rule_set": rule_set, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_string_multiple_rules] + + +# [START dlp_inspect_with_medical_record_number_custom_regex_detector] +def inspect_with_medical_record_number_custom_regex_detector( + project, + content_string, +): + """Uses the Data Loss Prevention API to analyze a string with a medical + record number custom regex detector + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a custom regex detector info type called "C_MRN", + # with ###-#-##### pattern, where each # represents a digit from 1 to 9. + # The detector has a detection likelihood of POSSIBLE.
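+ # For example, "444-5-22222" (the value used in the tests) matches this pattern, while "440-5-22222" would not, because 0 falls outside the [1-9] character classes.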
+ custom_info_types = [ + { + "info_type": {"name": "C_MRN"}, + "regex": {"pattern": "[1-9]{3}-[1-9]{1}-[1-9]{5}"}, + "likelihood": google.cloud.dlp_v2.Likelihood.POSSIBLE, + } + ] + + # Construct the configuration dictionary with the custom regex info type. + inspect_config = { + "custom_info_types": custom_info_types, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_with_medical_record_number_custom_regex_detector] + + +# [START dlp_inspect_with_medical_record_number_w_custom_hotwords] +def inspect_with_medical_record_number_w_custom_hotwords( + project, + content_string, +): + """Uses the Data Loss Prevention API to analyze a string with a medical + record number custom regex detector, with custom hotword rules to boost + finding certainty under some circumstances. + + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct a custom regex detector info type called "C_MRN", + # with ###-#-##### pattern, where each # represents a digit from 1 to 9. + # The detector has a detection likelihood of POSSIBLE. + custom_info_types = [ + { + "info_type": {"name": "C_MRN"}, + "regex": {"pattern": "[1-9]{3}-[1-9]{1}-[1-9]{5}"}, + "likelihood": google.cloud.dlp_v2.Likelihood.POSSIBLE, + } + ] + + # Construct a rule set with hotwords "mrn" and "medical", with a likelihood + # boost to VERY_LIKELY when hotwords are present within the 10-character + # window preceding the PII finding. + hotword_rule = { + "hotword_regex": {"pattern": "(?i)(mrn|medical)(?-i)"}, + "likelihood_adjustment": { + "fixed_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY + }, + "proximity": {"window_before": 10}, + } + + rule_set = [ + {"info_types": [{"name": "C_MRN"}], "rules": [{"hotword_rule": hotword_rule}]} + ] + + # Construct the configuration dictionary with the custom regex info type. + inspect_config = { + "custom_info_types": custom_info_types, + "rule_set": rule_set, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results.
+ if response.result.findings: + for finding in response.result.findings: + print(f"Quote: {finding.quote}") + print(f"Info type: {finding.info_type.name}") + print(f"Likelihood: {finding.likelihood}") + else: + print("No findings.") + + +# [END dlp_inspect_with_medical_record_number_w_custom_hotwords] diff --git a/dlp/snippets/custom_infotype_test.py b/dlp/snippets/custom_infotype_test.py new file mode 100644 index 000000000000..13c5e3275427 --- /dev/null +++ b/dlp/snippets/custom_infotype_test.py @@ -0,0 +1,162 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the 'License'); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an 'AS IS' BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os + +import custom_infotype + +GCLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT") + + +def test_inspect_string_with_exclusion_dict(capsys): + custom_infotype.inspect_string_with_exclusion_dict( + GCLOUD_PROJECT, "gary@example.com, example@example.com", ["example@example.com"] + ) + + out, _ = capsys.readouterr() + assert "example@example.com" not in out + assert "gary@example.com" in out + + +def test_inspect_string_with_exclusion_regex(capsys): + custom_infotype.inspect_string_with_exclusion_regex( + GCLOUD_PROJECT, "alice@example.com, ironman@avengers.net", ".+@example.com" + ) + + out, _ = capsys.readouterr() + assert "alice" not in out + assert "ironman" in out + + +def test_inspect_string_with_exclusion_dict_substring(capsys): + custom_infotype.inspect_string_with_exclusion_dict_substring( + GCLOUD_PROJECT, "bob@example.com TEST@example.com TEST.com", ["TEST"] + ) + + out, _ = capsys.readouterr() + assert "TEST@example.com" not in out + assert "TEST.com" not in out + assert "bob@example.com" in out + + +def test_inspect_string_custom_excluding_substring(capsys): + custom_infotype.inspect_string_custom_excluding_substring( + GCLOUD_PROJECT, "Danger, Jimmy | Wayne, Bruce", ["Jimmy"] + ) + + out, _ = capsys.readouterr() + assert "Wayne, Bruce" in out + assert "Danger, Jimmy" not in out + + +def test_inspect_string_custom_omit_overlap(capsys): + custom_infotype.inspect_string_custom_omit_overlap( + GCLOUD_PROJECT, "Larry Page and John Doe" + ) + + out, _ = capsys.readouterr() + assert "Larry Page" not in out + assert "John Doe" in out + + +def test_omit_name_if_also_email(capsys): + custom_infotype.omit_name_if_also_email(GCLOUD_PROJECT, "alice@example.com") + + # Ensure we found only EMAIL_ADDRESS, and not PERSON_NAME. + out, _ = capsys.readouterr() + assert "Info type: EMAIL_ADDRESS" in out + assert "Info type: PERSON_NAME" not in out + + +def test_inspect_string_without_overlap(capsys): + custom_infotype.inspect_string_without_overlap( + GCLOUD_PROJECT, "example.com is a domain, james@example.org is an email." 
+ ) + + out, _ = capsys.readouterr() + assert "example.com" in out + assert "example.org" not in out + + +def test_inspect_with_person_name_w_custom_hotword(capsys): + custom_infotype.inspect_with_person_name_w_custom_hotword( + GCLOUD_PROJECT, "patient's name is John Doe.", "patient" + ) + + out, _ = capsys.readouterr() + assert "Info type: PERSON_NAME" in out + assert "Likelihood: 5" in out + + +def test_inspect_string_multiple_rules_patient(capsys): + custom_infotype.inspect_string_multiple_rules( + GCLOUD_PROJECT, "patient name: Jane Doe" + ) + + out, _ = capsys.readouterr() + assert "Likelihood: 4" in out + + +def test_inspect_string_multiple_rules_doctor(capsys): + custom_infotype.inspect_string_multiple_rules(GCLOUD_PROJECT, "doctor: Jane Doe") + + out, _ = capsys.readouterr() + assert "No findings" in out + + +def test_inspect_string_multiple_rules_quasimodo(capsys): + custom_infotype.inspect_string_multiple_rules( + GCLOUD_PROJECT, "patient name: quasimodo" + ) + + out, _ = capsys.readouterr() + assert "No findings" in out + + +def test_inspect_string_multiple_rules_redacted(capsys): + custom_infotype.inspect_string_multiple_rules( + GCLOUD_PROJECT, "name of patient: REDACTED" + ) + + out, _ = capsys.readouterr() + assert "No findings" in out + + +def test_inspect_with_medical_record_number_custom_regex_detector(capsys): + custom_infotype.inspect_with_medical_record_number_custom_regex_detector( + GCLOUD_PROJECT, "Patients MRN 444-5-22222" + ) + + out, _ = capsys.readouterr() + assert "Info type: C_MRN" in out + + +def test_inspect_with_medical_record_number_w_custom_hotwords_no_hotwords(capsys): + custom_infotype.inspect_with_medical_record_number_w_custom_hotwords( + GCLOUD_PROJECT, "just a number 444-5-22222" + ) + + out, _ = capsys.readouterr() + assert "Info type: C_MRN" in out + assert "Likelihood: 3" in out + + +def test_inspect_with_medical_record_number_w_custom_hotwords_has_hotwords(capsys): + custom_infotype.inspect_with_medical_record_number_w_custom_hotwords( + GCLOUD_PROJECT, "Patients MRN 444-5-22222" + ) + + out, _ = capsys.readouterr() + assert "Info type: C_MRN" in out + assert "Likelihood: 5" in out diff --git a/dlp/snippets/deid.py b/dlp/snippets/deid.py new file mode 100644 index 000000000000..3e6968ff786b --- /dev/null +++ b/dlp/snippets/deid.py @@ -0,0 +1,1228 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Uses of the Data Loss Prevention API for deidentifying sensitive data.""" + +from __future__ import print_function + +import argparse + + +# [START dlp_deidentify_masking] +def deidentify_with_mask( + project, input_str, info_types, masking_character=None, number_to_mask=0 +): + """Uses the Data Loss Prevention API to deidentify sensitive data in a + string by masking it with a character. + Args: + project: The Google Cloud project id to use as a parent resource. + input_str: The string to deidentify (will be treated as text). + masking_character: The character to mask matching sensitive data with. 
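+ info_types: A list of strings representing info types to look for.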
+ number_to_mask: The maximum number of sensitive characters to mask in + a match. If omitted or set to zero, the API will default to no + maximum. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library + import google.cloud.dlp + + # Instantiate a client + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Construct inspect configuration dictionary + inspect_config = {"info_types": [{"name": info_type} for info_type in info_types]} + + # Construct deidentify configuration dictionary + deidentify_config = { + "info_type_transformations": { + "transformations": [ + { + "primitive_transformation": { + "character_mask_config": { + "masking_character": masking_character, + "number_to_mask": number_to_mask, + } + } + } + ] + } + } + + # Construct item + item = {"value": input_str} + + # Call the API + response = dlp.deidentify_content( + request={ + "parent": parent, + "deidentify_config": deidentify_config, + "inspect_config": inspect_config, + "item": item, + } + ) + + # Print out the results. + print(response.item.value) + + +# [END dlp_deidentify_masking] + +# [START dlp_deidentify_redact] +def deidentify_with_redact( + project, + input_str, + info_types, +): + """Uses the Data Loss Prevention API to deidentify sensitive data in a + string by redacting matched input values. + Args: + project: The Google Cloud project id to use as a parent resource. + input_str: The string to deidentify (will be treated as text). + info_types: A list of strings representing info types to look for. + Returns: + None; the response from the API is printed to the terminal. + """ + import google.cloud.dlp + + # Instantiate a client + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Construct inspect configuration dictionary + inspect_config = {"info_types": [{"name": info_type} for info_type in info_types]} + + # Construct deidentify configuration dictionary + deidentify_config = { + "info_type_transformations": { + "transformations": [{"primitive_transformation": {"redact_config": {}}}] + } + } + + # Construct item + item = {"value": input_str} + + # Call the API + response = dlp.deidentify_content( + request={ + "parent": parent, + "deidentify_config": deidentify_config, + "inspect_config": inspect_config, + "item": item, + } + ) + + # Print out the results. + print(response.item.value) + + +# [END dlp_deidentify_redact] + +# [START dlp_deidentify_replace] +def deidentify_with_replace( + project, + input_str, + info_types, + replacement_str="REPLACEMENT_STR", +): + """Uses the Data Loss Prevention API to deidentify sensitive data in a + string by replacing matched input values with a value you specify. + Args: + project: The Google Cloud project id to use as a parent resource. + input_str: The string to deidentify (will be treated as text). + info_types: A list of strings representing info types to look for. + replacement_str: The string to replace all values that match given + info types. + Returns: + None; the response from the API is printed to the terminal. + """ + import google.cloud.dlp + + # Instantiate a client + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. 
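+    # For example, the project id "my-project" becomes "projects/my-project".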
+ parent = f"projects/{project}" + + # Construct inspect configuration dictionary + inspect_config = {"info_types": [{"name": info_type} for info_type in info_types]} + + # Construct deidentify configuration dictionary + deidentify_config = { + "info_type_transformations": { + "transformations": [ + { + "primitive_transformation": { + "replace_config": { + "new_value": {"string_value": replacement_str} + } + } + } + ] + } + } + + # Construct item + item = {"value": input_str} + + # Call the API + response = dlp.deidentify_content( + request={ + "parent": parent, + "deidentify_config": deidentify_config, + "inspect_config": inspect_config, + "item": item, + } + ) + + # Print out the results. + print(response.item.value) + + +# [END dlp_deidentify_replace] + +# [START dlp_deidentify_fpe] + + +def deidentify_with_fpe( + project, + input_str, + info_types, + alphabet=None, + surrogate_type=None, + key_name=None, + wrapped_key=None, +): + """Uses the Data Loss Prevention API to deidentify sensitive data in a + string using Format Preserving Encryption (FPE). + Args: + project: The Google Cloud project id to use as a parent resource. + input_str: The string to deidentify (will be treated as text). + alphabet: The set of characters to replace sensitive ones with. For + more information, see https://cloud.google.com/dlp/docs/reference/ + rest/v2beta2/organizations.deidentifyTemplates#ffxcommonnativealphabet + surrogate_type: The name of the surrogate custom info type to use. Only + necessary if you want to reverse the deidentification process. Can + be essentially any arbitrary string, as long as it doesn't appear + in your dataset otherwise. + key_name: The name of the Cloud KMS key used to encrypt ('wrap') the + AES-256 key. Example: + key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/ + keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME' + wrapped_key: The encrypted ('wrapped') AES-256 key to use. This key + should be encrypted using the Cloud KMS key specified by key_name. + Returns: + None; the response from the API is printed to the terminal. + """ + # Import the client library + import google.cloud.dlp + + # Instantiate a client + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # The wrapped key is base64-encoded, but the library expects a binary + # string, so decode it here. 
+ import base64 + + wrapped_key = base64.b64decode(wrapped_key) + + # Construct FPE configuration dictionary + crypto_replace_ffx_fpe_config = { + "crypto_key": { + "kms_wrapped": {"wrapped_key": wrapped_key, "crypto_key_name": key_name} + }, + "common_alphabet": alphabet, + } + + # Add surrogate type + if surrogate_type: + crypto_replace_ffx_fpe_config["surrogate_info_type"] = {"name": surrogate_type} + + # Construct inspect configuration dictionary + inspect_config = {"info_types": [{"name": info_type} for info_type in info_types]} + + # Construct deidentify configuration dictionary + deidentify_config = { + "info_type_transformations": { + "transformations": [ + { + "primitive_transformation": { + "crypto_replace_ffx_fpe_config": crypto_replace_ffx_fpe_config + } + } + ] + } + } + + # Convert string to item + item = {"value": input_str} + + # Call the API + response = dlp.deidentify_content( + request={ + "parent": parent, + "deidentify_config": deidentify_config, + "inspect_config": inspect_config, + "item": item, + } + ) + + # Print results + print(response.item.value) + + +# [END dlp_deidentify_fpe] + +# [START dlp_deidentify_deterministic] +def deidentify_with_deterministic( + project, + input_str, + info_types, + surrogate_type=None, + key_name=None, + wrapped_key=None, +): + """Deidentifies sensitive data in a string using deterministic encryption. + Args: + project: The Google Cloud project id to use as a parent resource. + input_str: The string to deidentify (will be treated as text). + surrogate_type: The name of the surrogate custom info type to use. Only + necessary if you want to reverse the deidentification process. Can + be essentially any arbitrary string, as long as it doesn't appear + in your dataset otherwise. + key_name: The name of the Cloud KMS key used to encrypt ('wrap') the + AES-256 key. Example: + key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/ + keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME' + wrapped_key: The encrypted ('wrapped') AES-256 key to use. This key + should be encrypted using the Cloud KMS key specified by key_name. + Returns: + None; the response from the API is printed to the terminal. + """ + import base64 + + # Import the client library + import google.cloud.dlp + + # Instantiate a client + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # The wrapped key is base64-encoded, but the library expects a binary + # string, so decode it here. 
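+    # When a surrogate_info_type is set below, each match is replaced with a
+    # token of the form SURROGATE_TYPE(length):ciphertext, which
+    # reidentify_with_deterministic can later detect and reverse.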
+    wrapped_key = base64.b64decode(wrapped_key)
+
+    # Construct Deterministic encryption configuration dictionary
+    crypto_replace_deterministic_config = {
+        "crypto_key": {
+            "kms_wrapped": {"wrapped_key": wrapped_key, "crypto_key_name": key_name}
+        },
+    }
+
+    # Add surrogate type
+    if surrogate_type:
+        crypto_replace_deterministic_config["surrogate_info_type"] = {
+            "name": surrogate_type
+        }
+
+    # Construct inspect configuration dictionary
+    inspect_config = {"info_types": [{"name": info_type} for info_type in info_types]}
+
+    # Construct deidentify configuration dictionary
+    deidentify_config = {
+        "info_type_transformations": {
+            "transformations": [
+                {
+                    "primitive_transformation": {
+                        "crypto_deterministic_config": crypto_replace_deterministic_config
+                    }
+                }
+            ]
+        }
+    }
+
+    # Convert string to item
+    item = {"value": input_str}
+
+    # Call the API
+    response = dlp.deidentify_content(
+        request={
+            "parent": parent,
+            "deidentify_config": deidentify_config,
+            "inspect_config": inspect_config,
+            "item": item,
+        }
+    )
+
+    # Print results
+    print(response.item.value)
+
+
+# [END dlp_deidentify_deterministic]
+
+
+# [START dlp_reidentify_fpe]
+def reidentify_with_fpe(
+    project,
+    input_str,
+    alphabet=None,
+    surrogate_type=None,
+    key_name=None,
+    wrapped_key=None,
+):
+    """Uses the Data Loss Prevention API to reidentify sensitive data in a
+    string that was encrypted by Format Preserving Encryption (FPE).
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        input_str: The string to reidentify (will be treated as text).
+        alphabet: The set of characters to replace sensitive ones with. For
+            more information, see https://cloud.google.com/dlp/docs/reference/
+            rest/v2beta2/organizations.deidentifyTemplates#ffxcommonnativealphabet
+        surrogate_type: The name of the surrogate custom info type used
+            during the encryption process.
+        key_name: The name of the Cloud KMS key used to encrypt ('wrap') the
+            AES-256 key. Example:
+            keyName = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/
+            keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME'
+        wrapped_key: The encrypted ('wrapped') AES-256 key to use. This key
+            should be encrypted using the Cloud KMS key specified by key_name.
+    Returns:
+        None; the response from the API is printed to the terminal.
+    """
+    # Import the client library
+    import google.cloud.dlp
+
+    # Instantiate a client
+    dlp = google.cloud.dlp_v2.DlpServiceClient()
+
+    # Convert the project id into a full resource id.
+    parent = f"projects/{project}"
+
+    # The wrapped key is base64-encoded, but the library expects a binary
+    # string, so decode it here.
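+    # Reidentification only succeeds with the same key_name/wrapped_key pair
+    # that was used when the string was deidentified.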
+ import base64 + + wrapped_key = base64.b64decode(wrapped_key) + + # Construct Deidentify Config + reidentify_config = { + "info_type_transformations": { + "transformations": [ + { + "primitive_transformation": { + "crypto_replace_ffx_fpe_config": { + "crypto_key": { + "kms_wrapped": { + "wrapped_key": wrapped_key, + "crypto_key_name": key_name, + } + }, + "common_alphabet": alphabet, + "surrogate_info_type": {"name": surrogate_type}, + } + } + } + ] + } + } + + inspect_config = { + "custom_info_types": [ + {"info_type": {"name": surrogate_type}, "surrogate_type": {}} + ] + } + + # Convert string to item + item = {"value": input_str} + + # Call the API + response = dlp.reidentify_content( + request={ + "parent": parent, + "reidentify_config": reidentify_config, + "inspect_config": inspect_config, + "item": item, + } + ) + + # Print results + print(response.item.value) + + +# [END dlp_reidentify_fpe] + + +# [START dlp_reidentify_deterministic] +def reidentify_with_deterministic( + project, + input_str, + surrogate_type=None, + key_name=None, + wrapped_key=None, +): + """Re-identifies content that was previously de-identified through deterministic encryption. + Args: + project: The Google Cloud project ID to use as a parent resource. + input_str: The string to be re-identified. Provide the entire token. Example: + EMAIL_ADDRESS_TOKEN(52):AVAx2eIEnIQP5jbNEr2j9wLOAd5m4kpSBR/0jjjGdAOmryzZbE/q + surrogate_type: The name of the surrogate custom infoType used + during the encryption process. + key_name: The name of the Cloud KMS key used to encrypt ("wrap") the + AES-256 key. Example: + keyName = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/ + keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME' + wrapped_key: The encrypted ("wrapped") AES-256 key previously used to encrypt the content. + This key must have been encrypted using the Cloud KMS key specified by key_name. + Returns: + None; the response from the API is printed to the terminal. + """ + import base64 + + # Import the client library + import google.cloud.dlp + + # Instantiate a client + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # The wrapped key is base64-encoded, but the library expects a binary + # string, so decode it here. 
+ wrapped_key = base64.b64decode(wrapped_key) + + # Construct reidentify Configuration + reidentify_config = { + "info_type_transformations": { + "transformations": [ + { + "primitive_transformation": { + "crypto_deterministic_config": { + "crypto_key": { + "kms_wrapped": { + "wrapped_key": wrapped_key, + "crypto_key_name": key_name, + } + }, + "surrogate_info_type": {"name": surrogate_type}, + } + } + } + ] + } + } + + inspect_config = { + "custom_info_types": [ + {"info_type": {"name": surrogate_type}, "surrogate_type": {}} + ] + } + + # Convert string to item + item = {"value": input_str} + + # Call the API + response = dlp.reidentify_content( + request={ + "parent": parent, + "reidentify_config": reidentify_config, + "inspect_config": inspect_config, + "item": item, + } + ) + + # Print results + print(response.item.value) + + +# [END dlp_reidentify_deterministic] + + +# [START dlp_deidentify_free_text_with_fpe_using_surrogate] +def deidentify_free_text_with_fpe_using_surrogate( + project, + input_str, + alphabet="NUMERIC", + info_type="PHONE_NUMBER", + surrogate_type="PHONE_TOKEN", + unwrapped_key="YWJjZGVmZ2hpamtsbW5vcA==", +): + """Uses the Data Loss Prevention API to deidentify sensitive data in a + string using Format Preserving Encryption (FPE). + The encryption is performed with an unwrapped key. + Args: + project: The Google Cloud project id to use as a parent resource. + input_str: The string to deidentify (will be treated as text). + alphabet: The set of characters to replace sensitive ones with. For + more information, see https://cloud.google.com/dlp/docs/reference/ + rest/v2beta2/organizations.deidentifyTemplates#ffxcommonnativealphabet + info_type: The name of the info type to de-identify + surrogate_type: The name of the surrogate custom info type to use. Can + be essentially any arbitrary string, as long as it doesn't appear + in your dataset otherwise. + unwrapped_key: The base64-encoded AES-256 key to use. + Returns: + None; the response from the API is printed to the terminal. + """ + # Import the client library + import google.cloud.dlp + + # Instantiate a client + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # The unwrapped key is base64-encoded, but the library expects a binary + # string, so decode it here. 
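+    # Because the unwrapped key travels with the request itself, this variant
+    # is only appropriate for demos and tests, not for production data.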
+    import base64
+
+    unwrapped_key = base64.b64decode(unwrapped_key)
+
+    # Construct de-identify config
+    transformation = {
+        "info_types": [{"name": info_type}],
+        "primitive_transformation": {
+            "crypto_replace_ffx_fpe_config": {
+                "crypto_key": {"unwrapped": {"key": unwrapped_key}},
+                "common_alphabet": alphabet,
+                "surrogate_info_type": {"name": surrogate_type},
+            }
+        },
+    }
+
+    deidentify_config = {
+        "info_type_transformations": {"transformations": [transformation]}
+    }
+
+    # Construct the inspect config, trying to find all PII with a likelihood
+    # of UNLIKELY or higher
+    inspect_config = {
+        "info_types": [{"name": info_type}],
+        "min_likelihood": google.cloud.dlp_v2.Likelihood.UNLIKELY,
+    }
+
+    # Convert string to item
+    item = {"value": input_str}
+
+    # Call the API
+    response = dlp.deidentify_content(
+        request={
+            "parent": parent,
+            "deidentify_config": deidentify_config,
+            "inspect_config": inspect_config,
+            "item": item,
+        }
+    )
+
+    # Print results
+    print(response.item.value)
+
+
+# [END dlp_deidentify_free_text_with_fpe_using_surrogate]
+
+
+# [START dlp_reidentify_free_text_with_fpe_using_surrogate]
+def reidentify_free_text_with_fpe_using_surrogate(
+    project,
+    input_str,
+    alphabet="NUMERIC",
+    surrogate_type="PHONE_TOKEN",
+    unwrapped_key="YWJjZGVmZ2hpamtsbW5vcA==",
+):
+    """Uses the Data Loss Prevention API to reidentify sensitive data in a
+    string that was encrypted by Format Preserving Encryption (FPE) with
+    surrogate type. The encryption is performed with an unwrapped key.
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        input_str: The string to reidentify (will be treated as text).
+        alphabet: The set of characters to replace sensitive ones with. For
+            more information, see https://cloud.google.com/dlp/docs/reference/
+            rest/v2beta2/organizations.deidentifyTemplates#ffxcommonnativealphabet
+        surrogate_type: The name of the surrogate custom info type used
+            during the encryption process.
+        unwrapped_key: The base64-encoded AES-256 key to use.
+    Returns:
+        None; the response from the API is printed to the terminal.
+    """
+    # Import the client library
+    import google.cloud.dlp
+
+    # Instantiate a client
+    dlp = google.cloud.dlp_v2.DlpServiceClient()
+
+    # Convert the project id into a full resource id.
+    parent = f"projects/{project}"
+
+    # The unwrapped key is base64-encoded, but the library expects a binary
+    # string, so decode it here.
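+    # The alphabet, surrogate type, and key must match the values used at
+    # deidentification time for the round trip to succeed.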
+    import base64
+
+    unwrapped_key = base64.b64decode(unwrapped_key)
+
+    # Construct Deidentify Config
+    transformation = {
+        "primitive_transformation": {
+            "crypto_replace_ffx_fpe_config": {
+                "crypto_key": {"unwrapped": {"key": unwrapped_key}},
+                "common_alphabet": alphabet,
+                "surrogate_info_type": {"name": surrogate_type},
+            }
+        }
+    }
+
+    reidentify_config = {
+        "info_type_transformations": {"transformations": [transformation]}
+    }
+
+    inspect_config = {
+        "custom_info_types": [
+            {"info_type": {"name": surrogate_type}, "surrogate_type": {}}
+        ]
+    }
+
+    # Convert string to item
+    item = {"value": input_str}
+
+    # Call the API
+    response = dlp.reidentify_content(
+        request={
+            "parent": parent,
+            "reidentify_config": reidentify_config,
+            "inspect_config": inspect_config,
+            "item": item,
+        }
+    )
+
+    # Print results
+    print(response.item.value)
+
+
+# [END dlp_reidentify_free_text_with_fpe_using_surrogate]
+
+
+# [START dlp_deidentify_date_shift]
+def deidentify_with_date_shift(
+    project,
+    input_csv_file=None,
+    output_csv_file=None,
+    date_fields=None,
+    lower_bound_days=None,
+    upper_bound_days=None,
+    context_field_id=None,
+    wrapped_key=None,
+    key_name=None,
+):
+    """Uses the Data Loss Prevention API to deidentify dates in a CSV file by
+    pseudorandomly shifting them.
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        input_csv_file: The path to the CSV file to deidentify. The first row
+            of the file must specify column names, and all other rows must
+            contain valid values.
+        output_csv_file: The path to save the date-shifted CSV file.
+        date_fields: The list of (date) fields in the CSV file to date shift.
+            Example: ['birth_date', 'register_date']
+        lower_bound_days: The maximum number of days to shift a date backward.
+        upper_bound_days: The maximum number of days to shift a date forward.
+        context_field_id: (Optional) The column to determine date shift amount
+            based on. If this is not specified, a random shift amount will be
+            used for every row. If this is specified, then 'wrapped_key' and
+            'key_name' must also be set. Example:
+            context_field_id = 'user_id'
+        key_name: (Optional) The name of the Cloud KMS key used to encrypt
+            ('wrap') the AES-256 key. Example:
+            key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/
+            keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME'
+        wrapped_key: (Optional) The encrypted ('wrapped') AES-256 key to use.
+            This key should be encrypted using the Cloud KMS key specified by
+            key_name.
+    Returns:
+        None; the response from the API is printed to the terminal.
+    """
+    # Import the client library
+    import google.cloud.dlp
+
+    # Instantiate a client
+    dlp = google.cloud.dlp_v2.DlpServiceClient()
+
+    # Convert the project id into a full resource id.
+ parent = f"projects/{project}" + + # Convert date field list to Protobuf type + def map_fields(field): + return {"name": field} + + if date_fields: + date_fields = map(map_fields, date_fields) + else: + date_fields = [] + + # Read and parse the CSV file + import csv + from datetime import datetime + + f = [] + with open(input_csv_file, "r") as csvfile: + reader = csv.reader(csvfile) + for row in reader: + f.append(row) + + # Helper function for converting CSV rows to Protobuf types + def map_headers(header): + return {"name": header} + + def map_data(value): + try: + date = datetime.strptime(value, "%m/%d/%Y") + return { + "date_value": {"year": date.year, "month": date.month, "day": date.day} + } + except ValueError: + return {"string_value": value} + + def map_rows(row): + return {"values": map(map_data, row)} + + # Using the helper functions, convert CSV rows to protobuf-compatible + # dictionaries. + csv_headers = map(map_headers, f[0]) + csv_rows = map(map_rows, f[1:]) + + # Construct the table dict + table_item = {"table": {"headers": csv_headers, "rows": csv_rows}} + + # Construct date shift config + date_shift_config = { + "lower_bound_days": lower_bound_days, + "upper_bound_days": upper_bound_days, + } + + # If using a Cloud KMS key, add it to the date_shift_config. + # The wrapped key is base64-encoded, but the library expects a binary + # string, so decode it here. + if context_field_id and key_name and wrapped_key: + import base64 + + date_shift_config["context"] = {"name": context_field_id} + date_shift_config["crypto_key"] = { + "kms_wrapped": { + "wrapped_key": base64.b64decode(wrapped_key), + "crypto_key_name": key_name, + } + } + elif context_field_id or key_name or wrapped_key: + raise ValueError( + """You must set either ALL or NONE of + [context_field_id, key_name, wrapped_key]!""" + ) + + # Construct Deidentify Config + deidentify_config = { + "record_transformations": { + "field_transformations": [ + { + "fields": date_fields, + "primitive_transformation": { + "date_shift_config": date_shift_config + }, + } + ] + } + } + + # Write to CSV helper methods + def write_header(header): + return header.name + + def write_data(data): + return data.string_value or "%s/%s/%s" % ( + data.date_value.month, + data.date_value.day, + data.date_value.year, + ) + + # Call the API + response = dlp.deidentify_content( + request={ + "parent": parent, + "deidentify_config": deidentify_config, + "item": table_item, + } + ) + + # Write results to CSV file + with open(output_csv_file, "w") as csvfile: + write_file = csv.writer(csvfile, delimiter=",") + write_file.writerow(map(write_header, response.item.table.headers)) + for row in response.item.table.rows: + write_file.writerow(map(write_data, row.values)) + # Print status + print("Successfully saved date-shift output to {}".format(output_csv_file)) + + +# [END dlp_deidentify_date_shift] + + +# [START dlp_deidentify_replace_infotype] +def deidentify_with_replace_infotype(project, item, info_types): + """Uses the Data Loss Prevention API to deidentify sensitive data in a + string by replacing it with the info type. + Args: + project: The Google Cloud project id to use as a parent resource. + item: The string to deidentify (will be treated as text). + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + Returns: + None; the response from the API is printed to the terminal. 
+ """ + + # Import the client library + import google.cloud.dlp + + # Instantiate a client + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Construct inspect configuration dictionary + inspect_config = {"info_types": [{"name": info_type} for info_type in info_types]} + + # Construct deidentify configuration dictionary + deidentify_config = { + "info_type_transformations": { + "transformations": [ + {"primitive_transformation": {"replace_with_info_type_config": {}}} + ] + } + } + + # Call the API + response = dlp.deidentify_content( + request={ + "parent": parent, + "deidentify_config": deidentify_config, + "inspect_config": inspect_config, + "item": {"value": item}, + } + ) + + # Print out the results. + print(response.item.value) + + +# [END dlp_deidentify_replace_infotype] + + +if __name__ == "__main__": + parser = argparse.ArgumentParser(description=__doc__) + subparsers = parser.add_subparsers( + dest="content", help="Select how to submit content to the API." + ) + subparsers.required = True + + mask_parser = subparsers.add_parser( + "deid_mask", + help="Deidentify sensitive data in a string by masking it with a " "character.", + ) + mask_parser.add_argument( + "--info_types", + nargs="+", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + mask_parser.add_argument( + "project", + help="The Google Cloud project id to use as a parent resource.", + ) + mask_parser.add_argument("item", help="The string to deidentify.") + mask_parser.add_argument( + "-n", + "--number_to_mask", + type=int, + default=0, + help="The maximum number of sensitive characters to mask in a match. " + "If omitted the request or set to 0, the API will mask any mathcing " + "characters.", + ) + mask_parser.add_argument( + "-m", + "--masking_character", + help="The character to mask matching sensitive data with.", + ) + + replace_parser = subparsers.add_parser( + "deid_replace", + help="Deidentify sensitive data in a string by replacing it with " + "another string.", + ) + replace_parser.add_argument( + "--info_types", + nargs="+", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + replace_parser.add_argument( + "project", + help="The Google Cloud project id to use as a parent resource.", + ) + replace_parser.add_argument("item", help="The string to deidentify.") + replace_parser.add_argument( + "replacement_str", help="The string to " "replace all matched values with." + ) + + fpe_parser = subparsers.add_parser( + "deid_fpe", + help="Deidentify sensitive data in a string using Format Preserving " + "Encryption (FPE).", + ) + fpe_parser.add_argument( + "--info_types", + action="append", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". 
' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + fpe_parser.add_argument( + "project", + help="The Google Cloud project id to use as a parent resource.", + ) + fpe_parser.add_argument( + "item", + help="The string to deidentify. " "Example: string = 'My SSN is 372819127'", + ) + fpe_parser.add_argument( + "key_name", + help="The name of the Cloud KMS key used to encrypt ('wrap') the " + "AES-256 key. Example: " + "key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/" + "keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME'", + ) + fpe_parser.add_argument( + "wrapped_key", + help="The encrypted ('wrapped') AES-256 key to use. This key should " + "be encrypted using the Cloud KMS key specified by key_name.", + ) + fpe_parser.add_argument( + "-a", + "--alphabet", + default="ALPHA_NUMERIC", + help="The set of characters to replace sensitive ones with. Commonly " + 'used subsets of the alphabet include "NUMERIC", "HEXADECIMAL", ' + '"UPPER_CASE_ALPHA_NUMERIC", "ALPHA_NUMERIC", ' + '"FFX_COMMON_NATIVE_ALPHABET_UNSPECIFIED"', + ) + fpe_parser.add_argument( + "-s", + "--surrogate_type", + help="The name of the surrogate custom info type to use. Only " + "necessary if you want to reverse the deidentification process. Can " + "be essentially any arbitrary string, as long as it doesn't appear " + "in your dataset otherwise.", + ) + + reid_parser = subparsers.add_parser( + "reid_fpe", + help="Reidentify sensitive data in a string using Format Preserving " + "Encryption (FPE).", + ) + reid_parser.add_argument( + "project", + help="The Google Cloud project id to use as a parent resource.", + ) + reid_parser.add_argument( + "item", + help="The string to deidentify. " "Example: string = 'My SSN is 372819127'", + ) + reid_parser.add_argument( + "surrogate_type", + help="The name of the surrogate custom info type to use. Only " + "necessary if you want to reverse the deidentification process. Can " + "be essentially any arbitrary string, as long as it doesn't appear " + "in your dataset otherwise.", + ) + reid_parser.add_argument( + "key_name", + help="The name of the Cloud KMS key used to encrypt ('wrap') the " + "AES-256 key. Example: " + "key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/" + "keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME'", + ) + reid_parser.add_argument( + "wrapped_key", + help="The encrypted ('wrapped') AES-256 key to use. This key should " + "be encrypted using the Cloud KMS key specified by key_name.", + ) + reid_parser.add_argument( + "-a", + "--alphabet", + default="ALPHA_NUMERIC", + help="The set of characters to replace sensitive ones with. Commonly " + 'used subsets of the alphabet include "NUMERIC", "HEXADECIMAL", ' + '"UPPER_CASE_ALPHA_NUMERIC", "ALPHA_NUMERIC", ' + '"FFX_COMMON_NATIVE_ALPHABET_UNSPECIFIED"', + ) + + date_shift_parser = subparsers.add_parser( + "deid_date_shift", + help="Deidentify dates in a CSV file by pseudorandomly shifting them.", + ) + date_shift_parser.add_argument( + "project", + help="The Google Cloud project id to use as a parent resource.", + ) + date_shift_parser.add_argument( + "input_csv_file", + help="The path to the CSV file to deidentify. The first row of the " + "file must specify column names, and all other rows must contain " + "valid values.", + ) + date_shift_parser.add_argument( + "output_csv_file", help="The path to save the date-shifted CSV file." 
+ ) + date_shift_parser.add_argument( + "lower_bound_days", + type=int, + help="The maximum number of days to shift a date backward", + ) + date_shift_parser.add_argument( + "upper_bound_days", + type=int, + help="The maximum number of days to shift a date forward", + ) + date_shift_parser.add_argument( + "date_fields", + nargs="+", + help="The list of date fields in the CSV file to date shift. Example: " + "['birth_date', 'register_date']", + ) + date_shift_parser.add_argument( + "--context_field_id", + help="(Optional) The column to determine date shift amount based on. " + "If this is not specified, a random shift amount will be used for " + "every row. If this is specified, then 'wrappedKey' and 'keyName' " + "must also be set.", + ) + date_shift_parser.add_argument( + "--key_name", + help="(Optional) The name of the Cloud KMS key used to encrypt " + "('wrap') the AES-256 key. Example: " + "key_name = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/" + "keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME'", + ) + date_shift_parser.add_argument( + "--wrapped_key", + help="(Optional) The encrypted ('wrapped') AES-256 key to use. This " + "key should be encrypted using the Cloud KMS key specified by" + "key_name.", + ) + + replace_with_infotype_parser = subparsers.add_parser( + "replace_with_infotype", + help="Deidentify sensitive data in a string by replacing it with the " + "info type of the data.", + ) + replace_with_infotype_parser.add_argument( + "--info_types", + action="append", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + replace_with_infotype_parser.add_argument( + "project", + help="The Google Cloud project id to use as a parent resource.", + ) + replace_with_infotype_parser.add_argument( + "item", + help="The string to deidentify." 
+ "Example: 'My credit card is 4242 4242 4242 4242'", + ) + + args = parser.parse_args() + + if args.content == "deid_mask": + deidentify_with_mask( + args.project, + args.item, + args.info_types, + masking_character=args.masking_character, + number_to_mask=args.number_to_mask, + ) + elif args.content == "deid_replace": + deidentify_with_replace( + args.project, + args.item, + args.info_types, + replacement_str=args.replacement_str, + ) + elif args.content == "deid_fpe": + deidentify_with_fpe( + args.project, + args.item, + args.info_types, + alphabet=args.alphabet, + wrapped_key=args.wrapped_key, + key_name=args.key_name, + surrogate_type=args.surrogate_type, + ) + elif args.content == "reid_fpe": + reidentify_with_fpe( + args.project, + args.item, + surrogate_type=args.surrogate_type, + wrapped_key=args.wrapped_key, + key_name=args.key_name, + alphabet=args.alphabet, + ) + elif args.content == "deid_date_shift": + deidentify_with_date_shift( + args.project, + input_csv_file=args.input_csv_file, + output_csv_file=args.output_csv_file, + lower_bound_days=args.lower_bound_days, + upper_bound_days=args.upper_bound_days, + date_fields=args.date_fields, + context_field_id=args.context_field_id, + wrapped_key=args.wrapped_key, + key_name=args.key_name, + ) + elif args.content == "replace_with_infotype": + deidentify_with_replace_infotype( + args.project, + item=args.item, + info_types=args.info_types, + ) diff --git a/dlp/snippets/deid_test.py b/dlp/snippets/deid_test.py new file mode 100644 index 000000000000..d6df2e6bae4a --- /dev/null +++ b/dlp/snippets/deid_test.py @@ -0,0 +1,291 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the 'License'); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an 'AS IS' BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +import shutil +import tempfile + +import google.cloud.dlp_v2 +import pytest + +import deid + +HARMFUL_STRING = "My SSN is 372819127" +HARMLESS_STRING = "My favorite color is blue" +GCLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT") +UNWRAPPED_KEY = "YWJjZGVmZ2hpamtsbW5vcA==" +WRAPPED_KEY = ( + "CiQAz0hX4+go8fJwn80Fr8pVImwx+tmZdqU7JL+7TN/S5JxBU9gSSQDhFHpFVy" + "uzJps0YH9ls480mU+JLG7jI/0lL04i6XJRWqmI6gUSZRUtECYcLH5gXK4SXHlL" + "rotx7Chxz/4z7SIpXFOBY61z0/U=" +) +KEY_NAME = ( + f"projects/{GCLOUD_PROJECT}/locations/global/keyRings/" + "dlp-test/cryptoKeys/dlp-test" +) +SURROGATE_TYPE = "SSN_TOKEN" +CSV_FILE = os.path.join(os.path.dirname(__file__), "resources/dates.csv") +DATE_SHIFTED_AMOUNT = 30 +DATE_FIELDS = ["birth_date", "register_date"] +CSV_CONTEXT_FIELD = "name" + + +@pytest.fixture(scope="module") +def tempdir(): + tempdir = tempfile.mkdtemp() + yield tempdir + shutil.rmtree(tempdir) + + +def test_deidentify_with_mask(capsys): + deid.deidentify_with_mask( + GCLOUD_PROJECT, HARMFUL_STRING, ["US_SOCIAL_SECURITY_NUMBER"] + ) + + out, _ = capsys.readouterr() + assert "My SSN is *********" in out + + +def test_deidentify_with_mask_ignore_insensitive_data(capsys): + deid.deidentify_with_mask( + GCLOUD_PROJECT, HARMLESS_STRING, ["US_SOCIAL_SECURITY_NUMBER"] + ) + + out, _ = capsys.readouterr() + assert HARMLESS_STRING in out + + +def test_deidentify_with_mask_masking_character_specified(capsys): + deid.deidentify_with_mask( + GCLOUD_PROJECT, + HARMFUL_STRING, + ["US_SOCIAL_SECURITY_NUMBER"], + masking_character="#", + ) + + out, _ = capsys.readouterr() + assert "My SSN is #########" in out + + +def test_deidentify_with_mask_masking_number_specified(capsys): + deid.deidentify_with_mask( + GCLOUD_PROJECT, + HARMFUL_STRING, + ["US_SOCIAL_SECURITY_NUMBER"], + number_to_mask=7, + ) + + out, _ = capsys.readouterr() + assert "My SSN is *******27" in out + + +def test_deidentify_with_redact(capsys): + deid.deidentify_with_redact( + GCLOUD_PROJECT, HARMFUL_STRING + "!", ["US_SOCIAL_SECURITY_NUMBER"] + ) + out, _ = capsys.readouterr() + assert "My SSN is !" 
in out + + +def test_deidentify_with_replace(capsys): + deid.deidentify_with_replace( + GCLOUD_PROJECT, + HARMFUL_STRING, + ["US_SOCIAL_SECURITY_NUMBER"], + replacement_str="REPLACEMENT_STR", + ) + + out, _ = capsys.readouterr() + assert "My SSN is REPLACEMENT_STR" in out + + +def test_deidentify_with_fpe(capsys): + deid.deidentify_with_fpe( + GCLOUD_PROJECT, + HARMFUL_STRING, + ["US_SOCIAL_SECURITY_NUMBER"], + alphabet=google.cloud.dlp_v2.CharsToIgnore.CommonCharsToIgnore.NUMERIC, + wrapped_key=WRAPPED_KEY, + key_name=KEY_NAME, + ) + + out, _ = capsys.readouterr() + assert "My SSN is" in out + assert "372819127" not in out + + +def test_deidentify_with_deterministic(capsys): + deid.deidentify_with_deterministic( + GCLOUD_PROJECT, + HARMFUL_STRING, + ["US_SOCIAL_SECURITY_NUMBER"], + surrogate_type=SURROGATE_TYPE, + key_name=KEY_NAME, + wrapped_key=WRAPPED_KEY, + ) + + out, _ = capsys.readouterr() + assert "My SSN is" in out + assert "372819127" not in out + + +def test_deidentify_with_fpe_uses_surrogate_info_types(capsys): + deid.deidentify_with_fpe( + GCLOUD_PROJECT, + HARMFUL_STRING, + ["US_SOCIAL_SECURITY_NUMBER"], + alphabet=google.cloud.dlp_v2.CharsToIgnore.CommonCharsToIgnore.NUMERIC, + wrapped_key=WRAPPED_KEY, + key_name=KEY_NAME, + surrogate_type=SURROGATE_TYPE, + ) + + out, _ = capsys.readouterr() + assert "My SSN is SSN_TOKEN" in out + assert "372819127" not in out + + +def test_deidentify_with_fpe_ignores_insensitive_data(capsys): + deid.deidentify_with_fpe( + GCLOUD_PROJECT, + HARMLESS_STRING, + ["US_SOCIAL_SECURITY_NUMBER"], + alphabet=google.cloud.dlp_v2.CharsToIgnore.CommonCharsToIgnore.NUMERIC, + wrapped_key=WRAPPED_KEY, + key_name=KEY_NAME, + ) + + out, _ = capsys.readouterr() + assert HARMLESS_STRING in out + + +def test_deidentify_with_date_shift(tempdir, capsys): + output_filepath = os.path.join(tempdir, "dates-shifted.csv") + + deid.deidentify_with_date_shift( + GCLOUD_PROJECT, + input_csv_file=CSV_FILE, + output_csv_file=output_filepath, + lower_bound_days=DATE_SHIFTED_AMOUNT, + upper_bound_days=DATE_SHIFTED_AMOUNT, + date_fields=DATE_FIELDS, + ) + + out, _ = capsys.readouterr() + + assert "Successful" in out + + +def test_deidentify_with_date_shift_using_context_field(tempdir, capsys): + output_filepath = os.path.join(tempdir, "dates-shifted.csv") + + deid.deidentify_with_date_shift( + GCLOUD_PROJECT, + input_csv_file=CSV_FILE, + output_csv_file=output_filepath, + lower_bound_days=DATE_SHIFTED_AMOUNT, + upper_bound_days=DATE_SHIFTED_AMOUNT, + date_fields=DATE_FIELDS, + context_field_id=CSV_CONTEXT_FIELD, + wrapped_key=WRAPPED_KEY, + key_name=KEY_NAME, + ) + + out, _ = capsys.readouterr() + + assert "Successful" in out + + +def test_reidentify_with_fpe(capsys): + labeled_fpe_string = "My SSN is SSN_TOKEN(9):731997681" + + deid.reidentify_with_fpe( + GCLOUD_PROJECT, + labeled_fpe_string, + surrogate_type=SURROGATE_TYPE, + wrapped_key=WRAPPED_KEY, + key_name=KEY_NAME, + alphabet=google.cloud.dlp_v2.CharsToIgnore.CommonCharsToIgnore.NUMERIC, + ) + + out, _ = capsys.readouterr() + + assert "731997681" not in out + + +def test_reidentify_with_deterministic(capsys): + labeled_fpe_string = "My SSN is SSN_TOKEN(36):ATeRUd3WWnAHHFtjtl1bv+CT09FZ7hyqNas=" + + deid.reidentify_with_deterministic( + GCLOUD_PROJECT, + labeled_fpe_string, + surrogate_type=SURROGATE_TYPE, + key_name=KEY_NAME, + wrapped_key=WRAPPED_KEY, + ) + + out, _ = capsys.readouterr() + + assert "SSN_TOKEN(" not in out + + +def test_deidentify_free_text_with_fpe_using_surrogate(capsys): + labeled_fpe_string = "My 
phone number is 4359916732" + + deid.deidentify_free_text_with_fpe_using_surrogate( + GCLOUD_PROJECT, + labeled_fpe_string, + info_type="PHONE_NUMBER", + surrogate_type="PHONE_TOKEN", + unwrapped_key=UNWRAPPED_KEY, + alphabet=google.cloud.dlp_v2.CharsToIgnore.CommonCharsToIgnore.NUMERIC, + ) + + out, _ = capsys.readouterr() + + assert "PHONE_TOKEN" in out + assert "My phone number is" in out + assert "4359916732" not in out + + +def test_reidentify_free_text_with_fpe_using_surrogate(capsys): + labeled_fpe_string = "My phone number is PHONE_TOKEN(10):9617256398" + + deid.reidentify_free_text_with_fpe_using_surrogate( + GCLOUD_PROJECT, + labeled_fpe_string, + surrogate_type="PHONE_TOKEN", + unwrapped_key=UNWRAPPED_KEY, + alphabet=google.cloud.dlp_v2.CharsToIgnore.CommonCharsToIgnore.NUMERIC, + ) + + out, _ = capsys.readouterr() + + assert "PHONE_TOKEN" not in out + assert "9617256398" not in out + assert "My phone number is" in out + + +def test_deidentify_with_replace_infotype(capsys): + url_to_redact = "https://cloud.google.com" + deid.deidentify_with_replace_infotype( + GCLOUD_PROJECT, + "My favorite site is " + url_to_redact, + ["URL"], + ) + + out, _ = capsys.readouterr() + + assert url_to_redact not in out + assert "My favorite site is [URL]" in out diff --git a/dlp/snippets/inspect_content.py b/dlp/snippets/inspect_content.py new file mode 100644 index 000000000000..b8a7d5599fe9 --- /dev/null +++ b/dlp/snippets/inspect_content.py @@ -0,0 +1,1438 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Sample app that uses the Data Loss Prevention API to inspect a string, a +local file or a file on Google Cloud Storage.""" + +from __future__ import print_function + +import argparse +import json +import os + + +# [START dlp_inspect_string_basic] +def inspect_string_basic( + project, + content_string, + info_types=["PHONE_NUMBER"], +): + """Uses the Data Loss Prevention API to analyze strings for protected data. + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). + info_types = [{"name": info_type} for info_type in info_types] + + # Construct the configuration dictionary. + inspect_config = { + "info_types": info_types, + "include_quote": True, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. 
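+    # (inspect_content examines only the single item in the request; scanning
+    # Cloud Storage, BigQuery, or Datastore requires a DLP job, as in the
+    # samples further below.)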
+ response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + print("Quote: {}".format(finding.quote)) + print("Info type: {}".format(finding.info_type.name)) + print("Likelihood: {}".format(finding.likelihood)) + else: + print("No findings.") + + +# [END dlp_inspect_string_basic] + + +# [START dlp_inspect_string] +def inspect_string( + project, + content_string, + info_types, + custom_dictionaries=None, + custom_regexes=None, + min_likelihood=None, + max_findings=None, + include_quote=True, +): + """Uses the Data Loss Prevention API to analyze strings for protected data. + Args: + project: The Google Cloud project id to use as a parent resource. + content_string: The string to inspect. + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + min_likelihood: A string representing the minimum likelihood threshold + that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED', + 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'. + max_findings: The maximum number of findings to report; 0 = no maximum. + include_quote: Boolean for whether to display a quote of the detected + information in the results. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). + info_types = [{"name": info_type} for info_type in info_types] + + # Prepare custom_info_types by parsing the dictionary word lists and + # regex patterns. + if custom_dictionaries is None: + custom_dictionaries = [] + dictionaries = [ + { + "info_type": {"name": "CUSTOM_DICTIONARY_{}".format(i)}, + "dictionary": {"word_list": {"words": custom_dict.split(",")}}, + } + for i, custom_dict in enumerate(custom_dictionaries) + ] + if custom_regexes is None: + custom_regexes = [] + regexes = [ + { + "info_type": {"name": "CUSTOM_REGEX_{}".format(i)}, + "regex": {"pattern": custom_regex}, + } + for i, custom_regex in enumerate(custom_regexes) + ] + custom_info_types = dictionaries + regexes + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. + inspect_config = { + "info_types": info_types, + "custom_info_types": custom_info_types, + "min_likelihood": min_likelihood, + "include_quote": include_quote, + "limits": {"max_findings_per_request": max_findings}, + } + + # Construct the `item`. + item = {"value": content_string} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. 
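+    # Each finding reports the info type and a likelihood bucket; the matched
+    # text itself is only present when include_quote is True.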
+ if response.result.findings: + for finding in response.result.findings: + try: + if finding.quote: + print("Quote: {}".format(finding.quote)) + except AttributeError: + pass + print("Info type: {}".format(finding.info_type.name)) + print("Likelihood: {}".format(finding.likelihood)) + else: + print("No findings.") + + +# [END dlp_inspect_string] + +# [START dlp_inspect_table] + + +def inspect_table( + project, + data, + info_types, + custom_dictionaries=None, + custom_regexes=None, + min_likelihood=None, + max_findings=None, + include_quote=True, +): + """Uses the Data Loss Prevention API to analyze strings for protected data. + Args: + project: The Google Cloud project id to use as a parent resource. + data: Json string representing table data. + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + min_likelihood: A string representing the minimum likelihood threshold + that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED', + 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'. + max_findings: The maximum number of findings to report; 0 = no maximum. + include_quote: Boolean for whether to display a quote of the detected + information in the results. + Returns: + None; the response from the API is printed to the terminal. + Example: + data = { + "header":[ + "email", + "phone number" + ], + "rows":[ + [ + "robertfrost@xyz.com", + "4232342345" + ], + [ + "johndoe@pqr.com", + "4253458383" + ] + ] + } + + >> $ python inspect_content.py table \ + '{"header": ["email", "phone number"], + "rows": [["robertfrost@xyz.com", "4232342345"], + ["johndoe@pqr.com", "4253458383"]]}' + >> Quote: robertfrost@xyz.com + Info type: EMAIL_ADDRESS + Likelihood: 4 + Quote: johndoe@pqr.com + Info type: EMAIL_ADDRESS + Likelihood: 4 + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). + info_types = [{"name": info_type} for info_type in info_types] + + # Prepare custom_info_types by parsing the dictionary word lists and + # regex patterns. + if custom_dictionaries is None: + custom_dictionaries = [] + dictionaries = [ + { + "info_type": {"name": "CUSTOM_DICTIONARY_{}".format(i)}, + "dictionary": {"word_list": {"words": custom_dict.split(",")}}, + } + for i, custom_dict in enumerate(custom_dictionaries) + ] + if custom_regexes is None: + custom_regexes = [] + regexes = [ + { + "info_type": {"name": "CUSTOM_REGEX_{}".format(i)}, + "regex": {"pattern": custom_regex}, + } + for i, custom_regex in enumerate(custom_regexes) + ] + custom_info_types = dictionaries + regexes + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. + inspect_config = { + "info_types": info_types, + "custom_info_types": custom_info_types, + "min_likelihood": min_likelihood, + "include_quote": include_quote, + "limits": {"max_findings_per_request": max_findings}, + } + + # Construct the `table`. 
For more details on the table schema, please see + # https://cloud.google.com/dlp/docs/reference/rest/v2/ContentItem#Table + headers = [{"name": val} for val in data["header"]] + rows = [] + for row in data["rows"]: + rows.append({"values": [{"string_value": cell_val} for cell_val in row]}) + + table = {} + table["headers"] = headers + table["rows"] = rows + item = {"table": table} + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + try: + if finding.quote: + print("Quote: {}".format(finding.quote)) + except AttributeError: + pass + print("Info type: {}".format(finding.info_type.name)) + print("Likelihood: {}".format(finding.likelihood)) + else: + print("No findings.") + + +# [END dlp_inspect_table] + +# [START dlp_inspect_file] + + +def inspect_file( + project, + filename, + info_types, + min_likelihood=None, + custom_dictionaries=None, + custom_regexes=None, + max_findings=None, + include_quote=True, + mime_type=None, +): + """Uses the Data Loss Prevention API to analyze a file for protected data. + Args: + project: The Google Cloud project id to use as a parent resource. + filename: The path to the file to inspect. + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + min_likelihood: A string representing the minimum likelihood threshold + that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED', + 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'. + max_findings: The maximum number of findings to report; 0 = no maximum. + include_quote: Boolean for whether to display a quote of the detected + information in the results. + mime_type: The MIME type of the file. If not specified, the type is + inferred via the Python standard library's mimetypes module. + Returns: + None; the response from the API is printed to the terminal. + """ + + import mimetypes + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). + if not info_types: + info_types = ["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"] + info_types = [{"name": info_type} for info_type in info_types] + + # Prepare custom_info_types by parsing the dictionary word lists and + # regex patterns. + if custom_dictionaries is None: + custom_dictionaries = [] + dictionaries = [ + { + "info_type": {"name": "CUSTOM_DICTIONARY_{}".format(i)}, + "dictionary": {"word_list": {"words": custom_dict.split(",")}}, + } + for i, custom_dict in enumerate(custom_dictionaries) + ] + if custom_regexes is None: + custom_regexes = [] + regexes = [ + { + "info_type": {"name": "CUSTOM_REGEX_{}".format(i)}, + "regex": {"pattern": custom_regex}, + } + for i, custom_regex in enumerate(custom_regexes) + ] + custom_info_types = dictionaries + regexes + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. 
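+    # (Content inspection requests are size-limited, so very large files are
+    # better scanned in place with inspect_gcs_file below.)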
+ inspect_config = { + "info_types": info_types, + "custom_info_types": custom_info_types, + "min_likelihood": min_likelihood, + "include_quote": include_quote, + "limits": {"max_findings_per_request": max_findings}, + } + + # If mime_type is not specified, guess it from the filename. + if mime_type is None: + mime_guess = mimetypes.MimeTypes().guess_type(filename) + mime_type = mime_guess[0] + + # Select the content type index from the list of supported types. + supported_content_types = { + None: 0, # "Unspecified" + "image/jpeg": 1, + "image/bmp": 2, + "image/png": 3, + "image/svg": 4, + "text/plain": 5, + } + content_type_index = supported_content_types.get(mime_type, 0) + + # Construct the item, containing the file's byte data. + with open(filename, mode="rb") as f: + item = {"byte_item": {"type_": content_type_index, "data": f.read()}} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.inspect_content( + request={"parent": parent, "inspect_config": inspect_config, "item": item} + ) + + # Print out the results. + if response.result.findings: + for finding in response.result.findings: + try: + print("Quote: {}".format(finding.quote)) + except AttributeError: + pass + print("Info type: {}".format(finding.info_type.name)) + print("Likelihood: {}".format(finding.likelihood)) + else: + print("No findings.") + + +# [END dlp_inspect_file] + + +# [START dlp_inspect_gcs] +def inspect_gcs_file( + project, + bucket, + filename, + topic_id, + subscription_id, + info_types, + custom_dictionaries=None, + custom_regexes=None, + min_likelihood=None, + max_findings=None, + timeout=300, +): + """Uses the Data Loss Prevention API to analyze a file on GCS. + Args: + project: The Google Cloud project id to use as a parent resource. + bucket: The name of the GCS bucket containing the file, as a string. + filename: The name of the file in the bucket, including the path, as a + string; e.g. 'images/myfile.png'. + topic_id: The id of the Cloud Pub/Sub topic to which the API will + broadcast job completion. The topic must already exist. + subscription_id: The id of the Cloud Pub/Sub subscription to listen on + while waiting for job completion. The subscription must already + exist and be subscribed to the topic. + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + min_likelihood: A string representing the minimum likelihood threshold + that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED', + 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'. + max_findings: The maximum number of findings to report; 0 = no maximum. + timeout: The number of seconds to wait for a response from the API. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + # This sample also uses threading.Event() to wait for the job to finish. + import threading + + import google.cloud.dlp + + # This sample additionally uses Cloud Pub/Sub to receive results from + # potentially long-running operations. + import google.cloud.pubsub + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). 
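+    # Default to a small set of common info types if none were specified.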
+ if not info_types: + info_types = ["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"] + info_types = [{"name": info_type} for info_type in info_types] + + # Prepare custom_info_types by parsing the dictionary word lists and + # regex patterns. + if custom_dictionaries is None: + custom_dictionaries = [] + dictionaries = [ + { + "info_type": {"name": "CUSTOM_DICTIONARY_{}".format(i)}, + "dictionary": {"word_list": {"words": custom_dict.split(",")}}, + } + for i, custom_dict in enumerate(custom_dictionaries) + ] + if custom_regexes is None: + custom_regexes = [] + regexes = [ + { + "info_type": {"name": "CUSTOM_REGEX_{}".format(i)}, + "regex": {"pattern": custom_regex}, + } + for i, custom_regex in enumerate(custom_regexes) + ] + custom_info_types = dictionaries + regexes + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. + inspect_config = { + "info_types": info_types, + "custom_info_types": custom_info_types, + "min_likelihood": min_likelihood, + "limits": {"max_findings_per_request": max_findings}, + } + + # Construct a storage_config containing the file's URL. + url = "gs://{}/{}".format(bucket, filename) + storage_config = {"cloud_storage_options": {"file_set": {"url": url}}} + + # Convert the project id into full resource ids. + topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id) + parent = f"projects/{project}/locations/global" + + # Tell the API where to send a notification when the job is complete. + actions = [{"pub_sub": {"topic": topic}}] + + # Construct the inspect_job, which defines the entire inspect content task. + inspect_job = { + "inspect_config": inspect_config, + "storage_config": storage_config, + "actions": actions, + } + + operation = dlp.create_dlp_job( + request={"parent": parent, "inspect_job": inspect_job} + ) + print("Inspection operation started: {}".format(operation.name)) + + # Create a Pub/Sub client and find the subscription. The subscription is + # expected to already be listening to the topic. + subscriber = google.cloud.pubsub.SubscriberClient() + subscription_path = subscriber.subscription_path(project, subscription_id) + + # Set up a callback to acknowledge a message. This closes around an event + # so that it can signal that it is done and the main thread can continue. + job_done = threading.Event() + + def callback(message): + try: + if message.attributes["DlpJobName"] == operation.name: + # This is the message we're looking for, so acknowledge it. + message.ack() + + # Now that the job is done, fetch the results and print them. + job = dlp.get_dlp_job(request={"name": operation.name}) + print(f"Job name: {job.name}") + if job.inspect_details.result.info_type_stats: + for finding in job.inspect_details.result.info_type_stats: + print( + "Info type: {}; Count: {}".format( + finding.info_type.name, finding.count + ) + ) + else: + print("No findings.") + + # Signal to the main thread that we can exit. + job_done.set() + else: + # This is not the message we're looking for. + message.drop() + except Exception as e: + # Because this is executing in a thread, an exception won't be + # noted unless we print it manually. + print(e) + raise + + subscriber.subscribe(subscription_path, callback=callback) + finished = job_done.wait(timeout=timeout) + if not finished: + print( + "No event received before the timeout. Please verify that the " + "subscription provided is subscribed to the topic provided." 
+ ) + + +# [END dlp_inspect_gcs] + + +# [START dlp_inspect_datastore] +def inspect_datastore( + project, + datastore_project, + kind, + topic_id, + subscription_id, + info_types, + custom_dictionaries=None, + custom_regexes=None, + namespace_id=None, + min_likelihood=None, + max_findings=None, + timeout=300, +): + """Uses the Data Loss Prevention API to analyze Datastore data. + Args: + project: The Google Cloud project id to use as a parent resource. + datastore_project: The Google Cloud project id of the target Datastore. + kind: The kind of the Datastore entity to inspect, e.g. 'Person'. + topic_id: The id of the Cloud Pub/Sub topic to which the API will + broadcast job completion. The topic must already exist. + subscription_id: The id of the Cloud Pub/Sub subscription to listen on + while waiting for job completion. The subscription must already + exist and be subscribed to the topic. + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + namespace_id: The namespace of the Datastore document, if applicable. + min_likelihood: A string representing the minimum likelihood threshold + that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED', + 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'. + max_findings: The maximum number of findings to report; 0 = no maximum. + timeout: The number of seconds to wait for a response from the API. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + # This sample also uses threading.Event() to wait for the job to finish. + import threading + + import google.cloud.dlp + + # This sample additionally uses Cloud Pub/Sub to receive results from + # potentially long-running operations. + import google.cloud.pubsub + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). + if not info_types: + info_types = ["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"] + info_types = [{"name": info_type} for info_type in info_types] + + # Prepare custom_info_types by parsing the dictionary word lists and + # regex patterns. + if custom_dictionaries is None: + custom_dictionaries = [] + dictionaries = [ + { + "info_type": {"name": "CUSTOM_DICTIONARY_{}".format(i)}, + "dictionary": {"word_list": {"words": custom_dict.split(",")}}, + } + for i, custom_dict in enumerate(custom_dictionaries) + ] + if custom_regexes is None: + custom_regexes = [] + regexes = [ + { + "info_type": {"name": "CUSTOM_REGEX_{}".format(i)}, + "regex": {"pattern": custom_regex}, + } + for i, custom_regex in enumerate(custom_regexes) + ] + custom_info_types = dictionaries + regexes + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. + inspect_config = { + "info_types": info_types, + "custom_info_types": custom_info_types, + "min_likelihood": min_likelihood, + "limits": {"max_findings_per_request": max_findings}, + } + + # Construct a storage_config containing the target Datastore info. + storage_config = { + "datastore_options": { + "partition_id": { + "project_id": datastore_project, + "namespace_id": namespace_id, + }, + "kind": {"name": kind}, + } + } + + # Convert the project id into full resource ids. 
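+    # With illustrative values project="my-project" and topic_id="dlp-topic",
+    # these are "projects/my-project/topics/dlp-topic" and
+    # "projects/my-project/locations/global" respectively.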
+    topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id)
+    parent = f"projects/{project}/locations/global"
+
+    # Tell the API where to send a notification when the job is complete.
+    actions = [{"pub_sub": {"topic": topic}}]
+
+    # Construct the inspect_job, which defines the entire inspect content task.
+    inspect_job = {
+        "inspect_config": inspect_config,
+        "storage_config": storage_config,
+        "actions": actions,
+    }
+
+    operation = dlp.create_dlp_job(
+        request={"parent": parent, "inspect_job": inspect_job}
+    )
+    print("Inspection operation started: {}".format(operation.name))
+
+    # Create a Pub/Sub client and find the subscription. The subscription is
+    # expected to already be listening to the topic.
+    subscriber = google.cloud.pubsub.SubscriberClient()
+    subscription_path = subscriber.subscription_path(project, subscription_id)
+
+    # Set up a callback to acknowledge a message. This closes around an event
+    # so that it can signal that it is done and the main thread can continue.
+    job_done = threading.Event()
+
+    def callback(message):
+        try:
+            if message.attributes["DlpJobName"] == operation.name:
+                # This is the message we're looking for, so acknowledge it.
+                message.ack()
+
+                # Now that the job is done, fetch the results and print them.
+                job = dlp.get_dlp_job(request={"name": operation.name})
+                print(f"Job name: {job.name}")
+                if job.inspect_details.result.info_type_stats:
+                    for finding in job.inspect_details.result.info_type_stats:
+                        print(
+                            "Info type: {}; Count: {}".format(
+                                finding.info_type.name, finding.count
+                            )
+                        )
+                else:
+                    print("No findings.")
+
+                # Signal to the main thread that we can exit.
+                job_done.set()
+            else:
+                # This is not the message we're looking for.
+                message.drop()
+        except Exception as e:
+            # Because this is executing in a thread, an exception won't be
+            # noted unless we print it manually.
+            print(e)
+            raise
+
+    # Register the callback and wait on the event.
+    subscriber.subscribe(subscription_path, callback=callback)
+
+    finished = job_done.wait(timeout=timeout)
+    if not finished:
+        print(
+            "No event received before the timeout. Please verify that the "
+            "subscription provided is subscribed to the topic provided."
+        )
+
+
+# [END dlp_inspect_datastore]
+
+
+# [START dlp_inspect_bigquery]
+def inspect_bigquery(
+    project,
+    bigquery_project,
+    dataset_id,
+    table_id,
+    topic_id,
+    subscription_id,
+    info_types,
+    custom_dictionaries=None,
+    custom_regexes=None,
+    min_likelihood=None,
+    max_findings=None,
+    timeout=500,
+):
+    """Uses the Data Loss Prevention API to analyze BigQuery data.
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        bigquery_project: The Google Cloud project id of the target table.
+        dataset_id: The id of the target BigQuery dataset.
+        table_id: The id of the target BigQuery table.
+        topic_id: The id of the Cloud Pub/Sub topic to which the API will
+            broadcast job completion. The topic must already exist.
+        subscription_id: The id of the Cloud Pub/Sub subscription to listen on
+            while waiting for job completion. The subscription must already
+            exist and be subscribed to the topic.
+        info_types: A list of strings representing info types to look for.
+            A full list of info type categories can be fetched from the API.
+        min_likelihood: A string representing the minimum likelihood threshold
+            that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED',
+            'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'.
+ max_findings: The maximum number of findings to report; 0 = no maximum. + timeout: The number of seconds to wait for a response from the API. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + # This sample also uses threading.Event() to wait for the job to finish. + import threading + + import google.cloud.dlp + + # This sample additionally uses Cloud Pub/Sub to receive results from + # potentially long-running operations. + import google.cloud.pubsub + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). + if not info_types: + info_types = ["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"] + info_types = [{"name": info_type} for info_type in info_types] + + # Prepare custom_info_types by parsing the dictionary word lists and + # regex patterns. + if custom_dictionaries is None: + custom_dictionaries = [] + dictionaries = [ + { + "info_type": {"name": "CUSTOM_DICTIONARY_{}".format(i)}, + "dictionary": {"word_list": {"words": custom_dict.split(",")}}, + } + for i, custom_dict in enumerate(custom_dictionaries) + ] + if custom_regexes is None: + custom_regexes = [] + regexes = [ + { + "info_type": {"name": "CUSTOM_REGEX_{}".format(i)}, + "regex": {"pattern": custom_regex}, + } + for i, custom_regex in enumerate(custom_regexes) + ] + custom_info_types = dictionaries + regexes + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. + inspect_config = { + "info_types": info_types, + "custom_info_types": custom_info_types, + "min_likelihood": min_likelihood, + "limits": {"max_findings_per_request": max_findings}, + } + + # Construct a storage_config containing the target Bigquery info. + storage_config = { + "big_query_options": { + "table_reference": { + "project_id": bigquery_project, + "dataset_id": dataset_id, + "table_id": table_id, + } + } + } + + # Convert the project id into full resource ids. + topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id) + parent = f"projects/{project}/locations/global" + + # Tell the API where to send a notification when the job is complete. + actions = [{"pub_sub": {"topic": topic}}] + + # Construct the inspect_job, which defines the entire inspect content task. + inspect_job = { + "inspect_config": inspect_config, + "storage_config": storage_config, + "actions": actions, + } + + operation = dlp.create_dlp_job( + request={"parent": parent, "inspect_job": inspect_job} + ) + print("Inspection operation started: {}".format(operation.name)) + + # Create a Pub/Sub client and find the subscription. The subscription is + # expected to already be listening to the topic. + subscriber = google.cloud.pubsub.SubscriberClient() + subscription_path = subscriber.subscription_path(project, subscription_id) + + # Set up a callback to acknowledge a message. This closes around an event + # so that it can signal that it is done and the main thread can continue. + job_done = threading.Event() + + def callback(message): + try: + if message.attributes["DlpJobName"] == operation.name: + # This is the message we're looking for, so acknowledge it. + message.ack() + + # Now that the job is done, fetch the results and print them. 
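+                # get_dlp_job returns the full DlpJob resource; its
+                # inspect_details.result.info_type_stats holds one aggregate
+                # entry per detected info type (name and match count) rather
+                # than the individual findings.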
+ job = dlp.get_dlp_job(request={"name": operation.name}) + print(f"Job name: {job.name}") + if job.inspect_details.result.info_type_stats: + for finding in job.inspect_details.result.info_type_stats: + print( + "Info type: {}; Count: {}".format( + finding.info_type.name, finding.count + ) + ) + else: + print("No findings.") + + # Signal to the main thread that we can exit. + job_done.set() + else: + # This is not the message we're looking for. + message.drop() + except Exception as e: + # Because this is executing in a thread, an exception won't be + # noted unless we print it manually. + print(e) + raise + + # Register the callback and wait on the event. + subscriber.subscribe(subscription_path, callback=callback) + finished = job_done.wait(timeout=timeout) + if not finished: + print( + "No event received before the timeout. Please verify that the " + "subscription provided is subscribed to the topic provided." + ) + + +# [END dlp_inspect_bigquery] + + +if __name__ == "__main__": + default_project = os.environ.get("GOOGLE_CLOUD_PROJECT") + + parser = argparse.ArgumentParser(description=__doc__) + subparsers = parser.add_subparsers( + dest="content", help="Select how to submit content to the API." + ) + subparsers.required = True + + parser_string = subparsers.add_parser("string", help="Inspect a string.") + parser_string.add_argument("item", help="The string to inspect.") + parser_string.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + parser_string.add_argument( + "--info_types", + nargs="+", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + parser_string.add_argument( + "--custom_dictionaries", + action="append", + help="Strings representing comma-delimited lists of dictionary words" + " to search for as custom info types. Each string is a comma " + "delimited list of words representing a distinct dictionary.", + default=None, + ) + parser_string.add_argument( + "--custom_regexes", + action="append", + help="Strings representing regex patterns to search for as custom " + " info types.", + default=None, + ) + parser_string.add_argument( + "--min_likelihood", + choices=[ + "LIKELIHOOD_UNSPECIFIED", + "VERY_UNLIKELY", + "UNLIKELY", + "POSSIBLE", + "LIKELY", + "VERY_LIKELY", + ], + help="A string representing the minimum likelihood threshold that " + "constitutes a match.", + ) + parser_string.add_argument( + "--max_findings", + type=int, + help="The maximum number of findings to report; 0 = no maximum.", + ) + parser_string.add_argument( + "--include_quote", + type=bool, + help="A boolean for whether to display a quote of the detected " + "information in the results.", + default=True, + ) + + parser_table = subparsers.add_parser("table", help="Inspect a table.") + parser_table.add_argument( + "data", help="Json string representing a table.", type=json.loads + ) + parser_table.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + parser_table.add_argument( + "--info_types", + action="append", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. 
Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + parser_table.add_argument( + "--custom_dictionaries", + action="append", + help="Strings representing comma-delimited lists of dictionary words" + " to search for as custom info types. Each string is a comma " + "delimited list of words representing a distinct dictionary.", + default=None, + ) + parser_table.add_argument( + "--custom_regexes", + action="append", + help="Strings representing regex patterns to search for as custom " + " info types.", + default=None, + ) + parser_table.add_argument( + "--min_likelihood", + choices=[ + "LIKELIHOOD_UNSPECIFIED", + "VERY_UNLIKELY", + "UNLIKELY", + "POSSIBLE", + "LIKELY", + "VERY_LIKELY", + ], + help="A string representing the minimum likelihood threshold that " + "constitutes a match.", + ) + parser_table.add_argument( + "--max_findings", + type=int, + help="The maximum number of findings to report; 0 = no maximum.", + ) + parser_table.add_argument( + "--include_quote", + type=bool, + help="A boolean for whether to display a quote of the detected " + "information in the results.", + default=True, + ) + + parser_file = subparsers.add_parser("file", help="Inspect a local file.") + parser_file.add_argument("filename", help="The path to the file to inspect.") + parser_file.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + parser_file.add_argument( + "--info_types", + action="append", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + parser_file.add_argument( + "--custom_dictionaries", + action="append", + help="Strings representing comma-delimited lists of dictionary words" + " to search for as custom info types. Each string is a comma " + "delimited list of words representing a distinct dictionary.", + default=None, + ) + parser_file.add_argument( + "--custom_regexes", + action="append", + help="Strings representing regex patterns to search for as custom " + " info types.", + default=None, + ) + parser_file.add_argument( + "--min_likelihood", + choices=[ + "LIKELIHOOD_UNSPECIFIED", + "VERY_UNLIKELY", + "UNLIKELY", + "POSSIBLE", + "LIKELY", + "VERY_LIKELY", + ], + help="A string representing the minimum likelihood threshold that " + "constitutes a match.", + ) + parser_file.add_argument( + "--max_findings", + type=int, + help="The maximum number of findings to report; 0 = no maximum.", + ) + parser_file.add_argument( + "--include_quote", + type=bool, + help="A boolean for whether to display a quote of the detected " + "information in the results.", + default=True, + ) + parser_file.add_argument( + "--mime_type", + help="The MIME type of the file. If not specified, the type is " + "inferred via the Python standard library's mimetypes module.", + ) + + parser_gcs = subparsers.add_parser( + "gcs", help="Inspect files on Google Cloud Storage." + ) + parser_gcs.add_argument( + "bucket", help="The name of the GCS bucket containing the file." + ) + parser_gcs.add_argument( + "filename", + help="The name of the file in the bucket, including the path, e.g. " + '"images/myfile.png". 
Wildcards are permitted.', + ) + parser_gcs.add_argument( + "topic_id", + help="The id of the Cloud Pub/Sub topic to use to report that the job " + 'is complete, e.g. "dlp-sample-topic".', + ) + parser_gcs.add_argument( + "subscription_id", + help="The id of the Cloud Pub/Sub subscription to monitor for job " + 'completion, e.g. "dlp-sample-subscription". The subscription must ' + "already be subscribed to the topic. See the test files or the Cloud " + "Pub/Sub sample files for examples on how to create the subscription.", + ) + parser_gcs.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + parser_gcs.add_argument( + "--info_types", + action="append", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + parser_gcs.add_argument( + "--custom_dictionaries", + action="append", + help="Strings representing comma-delimited lists of dictionary words" + " to search for as custom info types. Each string is a comma " + "delimited list of words representing a distinct dictionary.", + default=None, + ) + parser_gcs.add_argument( + "--custom_regexes", + action="append", + help="Strings representing regex patterns to search for as custom " + " info types.", + default=None, + ) + parser_gcs.add_argument( + "--min_likelihood", + choices=[ + "LIKELIHOOD_UNSPECIFIED", + "VERY_UNLIKELY", + "UNLIKELY", + "POSSIBLE", + "LIKELY", + "VERY_LIKELY", + ], + help="A string representing the minimum likelihood threshold that " + "constitutes a match.", + ) + parser_gcs.add_argument( + "--max_findings", + type=int, + help="The maximum number of findings to report; 0 = no maximum.", + ) + parser_gcs.add_argument( + "--timeout", + type=int, + help="The maximum number of seconds to wait for a response from the " + "API. The default is 300 seconds.", + default=300, + ) + + parser_datastore = subparsers.add_parser( + "datastore", help="Inspect files on Google Datastore." + ) + parser_datastore.add_argument( + "datastore_project", + help="The Google Cloud project id of the target Datastore.", + ) + parser_datastore.add_argument( + "kind", + help='The kind of the Datastore entity to inspect, e.g. "Person".', + ) + parser_datastore.add_argument( + "topic_id", + help="The id of the Cloud Pub/Sub topic to use to report that the job " + 'is complete, e.g. "dlp-sample-topic".', + ) + parser_datastore.add_argument( + "subscription_id", + help="The id of the Cloud Pub/Sub subscription to monitor for job " + 'completion, e.g. "dlp-sample-subscription". The subscription must ' + "already be subscribed to the topic. See the test files or the Cloud " + "Pub/Sub sample files for examples on how to create the subscription.", + ) + parser_datastore.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + parser_datastore.add_argument( + "--info_types", + action="append", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". 
' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + parser_datastore.add_argument( + "--custom_dictionaries", + action="append", + help="Strings representing comma-delimited lists of dictionary words" + " to search for as custom info types. Each string is a comma " + "delimited list of words representing a distinct dictionary.", + default=None, + ) + parser_datastore.add_argument( + "--custom_regexes", + action="append", + help="Strings representing regex patterns to search for as custom " + " info types.", + default=None, + ) + parser_datastore.add_argument( + "--namespace_id", help="The Datastore namespace to use, if applicable." + ) + parser_datastore.add_argument( + "--min_likelihood", + choices=[ + "LIKELIHOOD_UNSPECIFIED", + "VERY_UNLIKELY", + "UNLIKELY", + "POSSIBLE", + "LIKELY", + "VERY_LIKELY", + ], + help="A string representing the minimum likelihood threshold that " + "constitutes a match.", + ) + parser_datastore.add_argument( + "--max_findings", + type=int, + help="The maximum number of findings to report; 0 = no maximum.", + ) + parser_datastore.add_argument( + "--timeout", + type=int, + help="The maximum number of seconds to wait for a response from the " + "API. The default is 300 seconds.", + default=300, + ) + + parser_bigquery = subparsers.add_parser( + "bigquery", help="Inspect files on Google BigQuery." + ) + parser_bigquery.add_argument( + "bigquery_project", + help="The Google Cloud project id of the target table.", + ) + parser_bigquery.add_argument( + "dataset_id", help="The ID of the target BigQuery dataset." + ) + parser_bigquery.add_argument( + "table_id", help="The ID of the target BigQuery table." + ) + parser_bigquery.add_argument( + "topic_id", + help="The id of the Cloud Pub/Sub topic to use to report that the job " + 'is complete, e.g. "dlp-sample-topic".', + ) + parser_bigquery.add_argument( + "subscription_id", + help="The id of the Cloud Pub/Sub subscription to monitor for job " + 'completion, e.g. "dlp-sample-subscription". The subscription must ' + "already be subscribed to the topic. See the test files or the Cloud " + "Pub/Sub sample files for examples on how to create the subscription.", + ) + parser_bigquery.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + parser_bigquery.add_argument( + "--info_types", + nargs="+", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + parser_bigquery.add_argument( + "--custom_dictionaries", + action="append", + help="Strings representing comma-delimited lists of dictionary words" + " to search for as custom info types. 
Each string is a comma " + "delimited list of words representing a distinct dictionary.", + default=None, + ) + parser_bigquery.add_argument( + "--custom_regexes", + action="append", + help="Strings representing regex patterns to search for as custom " + " info types.", + default=None, + ) + parser_bigquery.add_argument( + "--min_likelihood", + choices=[ + "LIKELIHOOD_UNSPECIFIED", + "VERY_UNLIKELY", + "UNLIKELY", + "POSSIBLE", + "LIKELY", + "VERY_LIKELY", + ], + help="A string representing the minimum likelihood threshold that " + "constitutes a match.", + ) + parser_bigquery.add_argument( + "--max_findings", + type=int, + help="The maximum number of findings to report; 0 = no maximum.", + ) + parser_bigquery.add_argument( + "--timeout", + type=int, + help="The maximum number of seconds to wait for a response from the " + "API. The default is 300 seconds.", + default=300, + ) + + args = parser.parse_args() + + if args.content == "string": + inspect_string( + args.project, + args.item, + args.info_types, + custom_dictionaries=args.custom_dictionaries, + custom_regexes=args.custom_regexes, + min_likelihood=args.min_likelihood, + max_findings=args.max_findings, + include_quote=args.include_quote, + ) + elif args.content == "table": + inspect_table( + args.project, + args.data, + args.info_types, + custom_dictionaries=args.custom_dictionaries, + custom_regexes=args.custom_regexes, + min_likelihood=args.min_likelihood, + max_findings=args.max_findings, + include_quote=args.include_quote, + ) + elif args.content == "file": + inspect_file( + args.project, + args.filename, + args.info_types, + custom_dictionaries=args.custom_dictionaries, + custom_regexes=args.custom_regexes, + min_likelihood=args.min_likelihood, + max_findings=args.max_findings, + include_quote=args.include_quote, + mime_type=args.mime_type, + ) + elif args.content == "gcs": + inspect_gcs_file( + args.project, + args.bucket, + args.filename, + args.topic_id, + args.subscription_id, + args.info_types, + custom_dictionaries=args.custom_dictionaries, + custom_regexes=args.custom_regexes, + min_likelihood=args.min_likelihood, + max_findings=args.max_findings, + timeout=args.timeout, + ) + elif args.content == "datastore": + inspect_datastore( + args.project, + args.datastore_project, + args.kind, + args.topic_id, + args.subscription_id, + args.info_types, + custom_dictionaries=args.custom_dictionaries, + custom_regexes=args.custom_regexes, + namespace_id=args.namespace_id, + min_likelihood=args.min_likelihood, + max_findings=args.max_findings, + timeout=args.timeout, + ) + elif args.content == "bigquery": + inspect_bigquery( + args.project, + args.bigquery_project, + args.dataset_id, + args.table_id, + args.topic_id, + args.subscription_id, + args.info_types, + custom_dictionaries=args.custom_dictionaries, + custom_regexes=args.custom_regexes, + min_likelihood=args.min_likelihood, + max_findings=args.max_findings, + timeout=args.timeout, + ) diff --git a/dlp/snippets/inspect_content_test.py b/dlp/snippets/inspect_content_test.py new file mode 100644 index 000000000000..5697439e8b1a --- /dev/null +++ b/dlp/snippets/inspect_content_test.py @@ -0,0 +1,504 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the 'License'); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an 'AS IS' BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import time
+import uuid
+
+import backoff
+import google.api_core.exceptions
+from google.api_core.exceptions import ServiceUnavailable
+import google.cloud.bigquery
+import google.cloud.datastore
+import google.cloud.dlp_v2
+import google.cloud.exceptions
+import google.cloud.pubsub
+import google.cloud.storage
+import pytest
+
+import inspect_content
+
+UNIQUE_STRING = str(uuid.uuid4()).split("-")[0]
+
+GCLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT")
+TEST_BUCKET_NAME = GCLOUD_PROJECT + "-dlp-python-client-test" + UNIQUE_STRING
+RESOURCE_DIRECTORY = os.path.join(os.path.dirname(__file__), "resources")
+RESOURCE_FILE_NAMES = ["test.txt", "test.png", "harmless.txt", "accounts.txt"]
+TOPIC_ID = "dlp-test" + UNIQUE_STRING
+SUBSCRIPTION_ID = "dlp-test-subscription" + UNIQUE_STRING
+DATASTORE_KIND = "DLP test kind"
+DATASTORE_NAME = "DLP test object" + UNIQUE_STRING
+BIGQUERY_DATASET_ID = "dlp_test_dataset" + UNIQUE_STRING
+BIGQUERY_TABLE_ID = "dlp_test_table" + UNIQUE_STRING
+
+TIMEOUT = 900  # 15 minutes
+
+DLP_CLIENT = google.cloud.dlp_v2.DlpServiceClient()
+
+
+@pytest.fixture(scope="module")
+def bucket():
+    # Creates a GCS bucket, uploads files required for the test, and tears down
+    # the entire bucket afterwards.
+
+    client = google.cloud.storage.Client()
+    try:
+        bucket = client.get_bucket(TEST_BUCKET_NAME)
+    except google.cloud.exceptions.NotFound:
+        bucket = client.create_bucket(TEST_BUCKET_NAME)
+
+    # Upload the blobs and keep track of them in a list.
+    blobs = []
+    for name in RESOURCE_FILE_NAMES:
+        path = os.path.join(RESOURCE_DIRECTORY, name)
+        blob = bucket.blob(name)
+        blob.upload_from_filename(path)
+        blobs.append(blob)
+
+    # Yield the object to the test; lines after this execute as a teardown.
+    yield bucket
+
+    # Delete the files.
+    for blob in blobs:
+        try:
+            blob.delete()
+        except google.cloud.exceptions.NotFound:
+            print("Issue during teardown, missing blob")
+
+    # Attempt to delete the bucket; this will only work if it is empty.
+    bucket.delete()
+
+
+@pytest.fixture(scope="module")
+def topic_id():
+    # Creates a pubsub topic, and tears it down.
+    publisher = google.cloud.pubsub.PublisherClient()
+    topic_path = publisher.topic_path(GCLOUD_PROJECT, TOPIC_ID)
+    try:
+        publisher.create_topic(request={"name": topic_path})
+    except google.api_core.exceptions.AlreadyExists:
+        pass
+
+    yield TOPIC_ID
+
+    publisher.delete_topic(request={"topic": topic_path})
+
+
+@pytest.fixture(scope="module")
+def subscription_id(topic_id):
+    # Subscribes to a topic.
+    subscriber = google.cloud.pubsub.SubscriberClient()
+    topic_path = subscriber.topic_path(GCLOUD_PROJECT, topic_id)
+    subscription_path = subscriber.subscription_path(GCLOUD_PROJECT, SUBSCRIPTION_ID)
+    try:
+        subscriber.create_subscription(
+            request={"name": subscription_path, "topic": topic_path}
+        )
+    except google.api_core.exceptions.AlreadyExists:
+        pass
+
+    yield SUBSCRIPTION_ID
+
+    subscriber.delete_subscription(request={"subscription": subscription_path})
+
+
+@pytest.fixture(scope="module")
+def datastore_project():
+    # Adds test Datastore data, yields the project ID and then tears down.
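+    # Note: google.cloud.datastore.Client() infers its project from the
+    # environment (e.g. GOOGLE_CLOUD_PROJECT) or the active credentials,
+    # which is expected to match the GCLOUD_PROJECT constant above.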
+    datastore_client = google.cloud.datastore.Client()
+
+    kind = DATASTORE_KIND
+    name = DATASTORE_NAME
+    key = datastore_client.key(kind, name)
+    item = google.cloud.datastore.Entity(key=key)
+    item["payload"] = "My name is Gary Smith and my email is gary@example.com"
+
+    datastore_client.put(item)
+
+    yield GCLOUD_PROJECT
+
+    @backoff.on_exception(backoff.expo, ServiceUnavailable, max_time=120)
+    def cleanup():
+        datastore_client.delete(key)
+
+    cleanup()
+
+
+@pytest.fixture(scope="module")
+def bigquery_project():
+    # Adds test BigQuery data, yields the project ID and then tears down.
+    bigquery_client = google.cloud.bigquery.Client()
+
+    dataset_ref = bigquery_client.dataset(BIGQUERY_DATASET_ID)
+    dataset = google.cloud.bigquery.Dataset(dataset_ref)
+    try:
+        dataset = bigquery_client.create_dataset(dataset)
+    except google.api_core.exceptions.Conflict:
+        dataset = bigquery_client.get_dataset(dataset)
+
+    table_ref = dataset_ref.table(BIGQUERY_TABLE_ID)
+    table = google.cloud.bigquery.Table(table_ref)
+
+    table.schema = (
+        google.cloud.bigquery.SchemaField("Name", "STRING"),
+        google.cloud.bigquery.SchemaField("Comment", "STRING"),
+    )
+
+    try:
+        table = bigquery_client.create_table(table)
+        time.sleep(30)
+    except google.api_core.exceptions.Conflict:
+        table = bigquery_client.get_table(table)
+
+    rows_to_insert = [("Gary Smith", "My email is gary@example.com")]
+
+    bigquery_client.insert_rows(table, rows_to_insert)
+
+    yield GCLOUD_PROJECT
+
+    @backoff.on_exception(backoff.expo, ServiceUnavailable, max_time=120)
+    def cleanup():
+        bigquery_client.delete_dataset(dataset_ref, delete_contents=True)
+
+    cleanup()
+
+
+def test_inspect_string_basic(capsys):
+    test_string = "String with a phone number: 234-555-6789"
+
+    inspect_content.inspect_string_basic(GCLOUD_PROJECT, test_string)
+
+    out, _ = capsys.readouterr()
+    assert "Info type: PHONE_NUMBER" in out
+    assert "Quote: 234-555-6789" in out
+
+
+def test_inspect_string(capsys):
+    test_string = "My name is Gary Smith and my email is gary@example.com"
+
+    inspect_content.inspect_string(
+        GCLOUD_PROJECT,
+        test_string,
+        ["FIRST_NAME", "EMAIL_ADDRESS"],
+        include_quote=True,
+    )
+
+    out, _ = capsys.readouterr()
+    assert "Info type: FIRST_NAME" in out
+    assert "Info type: EMAIL_ADDRESS" in out
+
+
+def test_inspect_table(capsys):
+    test_tabular_data = {
+        "header": ["email", "phone number"],
+        "rows": [
+            ["robertfrost@xyz.com", "4232342345"],
+            ["johndoe@pqr.com", "4253458383"],
+        ],
+    }
+
+    inspect_content.inspect_table(
+        GCLOUD_PROJECT,
+        test_tabular_data,
+        ["PHONE_NUMBER", "EMAIL_ADDRESS"],
+        include_quote=True,
+    )
+
+    out, _ = capsys.readouterr()
+    assert "Info type: PHONE_NUMBER" in out
+    assert "Info type: EMAIL_ADDRESS" in out
+
+
+def test_inspect_string_with_custom_info_types(capsys):
+    test_string = "My name is Gary Smith and my email is gary@example.com"
+    dictionaries = ["Gary Smith"]
+    regexes = ["\\w+@\\w+.com"]
+
+    inspect_content.inspect_string(
+        GCLOUD_PROJECT,
+        test_string,
+        [],
+        custom_dictionaries=dictionaries,
+        custom_regexes=regexes,
+        include_quote=True,
+    )
+
+    out, _ = capsys.readouterr()
+    assert "Info type: CUSTOM_DICTIONARY_0" in out
+    assert "Info type: CUSTOM_REGEX_0" in out
+
+
+def test_inspect_string_no_results(capsys):
+    test_string = "Nothing to see here"
+
+    inspect_content.inspect_string(
+        GCLOUD_PROJECT,
+        test_string,
+        ["FIRST_NAME", "EMAIL_ADDRESS"],
+        include_quote=True,
+    )
+
+    out, _ = capsys.readouterr()
+    assert "No 
findings" in out + + +def test_inspect_file(capsys): + test_filepath = os.path.join(RESOURCE_DIRECTORY, "test.txt") + + inspect_content.inspect_file( + GCLOUD_PROJECT, + test_filepath, + ["FIRST_NAME", "EMAIL_ADDRESS"], + include_quote=True, + ) + + out, _ = capsys.readouterr() + assert "Info type: EMAIL_ADDRESS" in out + + +def test_inspect_file_with_custom_info_types(capsys): + test_filepath = os.path.join(RESOURCE_DIRECTORY, "test.txt") + dictionaries = ["gary@somedomain.com"] + regexes = ["\\(\\d{3}\\) \\d{3}-\\d{4}"] + + inspect_content.inspect_file( + GCLOUD_PROJECT, + test_filepath, + [], + custom_dictionaries=dictionaries, + custom_regexes=regexes, + include_quote=True, + ) + + out, _ = capsys.readouterr() + assert "Info type: CUSTOM_DICTIONARY_0" in out + assert "Info type: CUSTOM_REGEX_0" in out + + +def test_inspect_file_no_results(capsys): + test_filepath = os.path.join(RESOURCE_DIRECTORY, "harmless.txt") + + inspect_content.inspect_file( + GCLOUD_PROJECT, + test_filepath, + ["FIRST_NAME", "EMAIL_ADDRESS"], + include_quote=True, + ) + + out, _ = capsys.readouterr() + assert "No findings" in out + + +def test_inspect_image_file(capsys): + test_filepath = os.path.join(RESOURCE_DIRECTORY, "test.png") + + inspect_content.inspect_file( + GCLOUD_PROJECT, + test_filepath, + ["FIRST_NAME", "EMAIL_ADDRESS", "PHONE_NUMBER"], + include_quote=True, + ) + + out, _ = capsys.readouterr() + assert "Info type: PHONE_NUMBER" in out + + +def delete_dlp_job(out): + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=2, min_passes=1) +def test_inspect_gcs_file(bucket, topic_id, subscription_id, capsys): + out = "" + try: + inspect_content.inspect_gcs_file( + GCLOUD_PROJECT, + bucket.name, + "test.txt", + topic_id, + subscription_id, + ["EMAIL_ADDRESS", "PHONE_NUMBER"], + timeout=TIMEOUT, + ) + + out, _ = capsys.readouterr() + assert "Info type: EMAIL_ADDRESS" in out + assert "Job name:" in out + finally: + delete_dlp_job(out) + + +@pytest.mark.flaky(max_runs=2, min_passes=1) +def test_inspect_gcs_file_with_custom_info_types( + bucket, topic_id, subscription_id, capsys +): + out = "" + try: + dictionaries = ["gary@somedomain.com"] + regexes = ["\\(\\d{3}\\) \\d{3}-\\d{4}"] + + inspect_content.inspect_gcs_file( + GCLOUD_PROJECT, + bucket.name, + "test.txt", + topic_id, + subscription_id, + [], + custom_dictionaries=dictionaries, + custom_regexes=regexes, + timeout=TIMEOUT, + ) + + out, _ = capsys.readouterr() + + assert "Info type: EMAIL_ADDRESS" in out + assert "Job name:" in out + finally: + delete_dlp_job(out) + + +@pytest.mark.flaky(max_runs=2, min_passes=1) +def test_inspect_gcs_file_no_results(bucket, topic_id, subscription_id, capsys): + out = "" + try: + inspect_content.inspect_gcs_file( + GCLOUD_PROJECT, + bucket.name, + "harmless.txt", + topic_id, + subscription_id, + ["EMAIL_ADDRESS", "PHONE_NUMBER"], + timeout=TIMEOUT, + ) + + out, _ = capsys.readouterr() + + assert "No findings" in out + assert "Job name:" in out + finally: + delete_dlp_job(out) + + +@pytest.mark.flaky(max_runs=2, min_passes=1) +def test_inspect_gcs_image_file(bucket, topic_id, subscription_id, capsys): + out = "" + try: + inspect_content.inspect_gcs_file( + GCLOUD_PROJECT, + bucket.name, + "test.png", + topic_id, + subscription_id, + ["EMAIL_ADDRESS", "PHONE_NUMBER"], + timeout=TIMEOUT, + ) + + out, _ = capsys.readouterr() + assert "Info type: EMAIL_ADDRESS" in out + assert "Job name:" in out + 
finally:
+        delete_dlp_job(out)
+
+
+@pytest.mark.flaky(max_runs=2, min_passes=1)
+def test_inspect_gcs_multiple_files(bucket, topic_id, subscription_id, capsys):
+    out = ""
+    try:
+        inspect_content.inspect_gcs_file(
+            GCLOUD_PROJECT,
+            bucket.name,
+            "*",
+            topic_id,
+            subscription_id,
+            ["EMAIL_ADDRESS", "PHONE_NUMBER"],
+            timeout=TIMEOUT,
+        )
+
+        out, _ = capsys.readouterr()
+
+        assert "Info type: EMAIL_ADDRESS" in out
+        assert "Job name:" in out
+    finally:
+        delete_dlp_job(out)
+
+
+@pytest.mark.flaky(max_runs=2, min_passes=1)
+def test_inspect_datastore(datastore_project, topic_id, subscription_id, capsys):
+    out = ""
+    try:
+        inspect_content.inspect_datastore(
+            GCLOUD_PROJECT,
+            datastore_project,
+            DATASTORE_KIND,
+            topic_id,
+            subscription_id,
+            ["FIRST_NAME", "EMAIL_ADDRESS", "PHONE_NUMBER"],
+            timeout=TIMEOUT,
+        )
+
+        out, _ = capsys.readouterr()
+        assert "Info type: EMAIL_ADDRESS" in out
+        assert "Job name:" in out
+    finally:
+        delete_dlp_job(out)
+
+
+@pytest.mark.flaky(max_runs=2, min_passes=1)
+def test_inspect_datastore_no_results(
+    datastore_project, topic_id, subscription_id, capsys
+):
+    out = ""
+    try:
+        inspect_content.inspect_datastore(
+            GCLOUD_PROJECT,
+            datastore_project,
+            DATASTORE_KIND,
+            topic_id,
+            subscription_id,
+            ["PHONE_NUMBER"],
+            timeout=TIMEOUT,
+        )
+
+        out, _ = capsys.readouterr()
+        assert "No findings" in out
+        assert "Job name:" in out
+    finally:
+        delete_dlp_job(out)
+
+
+@pytest.mark.skip(reason="Table not found error. Should be inspected.")
+@pytest.mark.flaky(max_runs=2, min_passes=1)
+def test_inspect_bigquery(bigquery_project, topic_id, subscription_id, capsys):
+    out = ""
+    try:
+        inspect_content.inspect_bigquery(
+            GCLOUD_PROJECT,
+            bigquery_project,
+            BIGQUERY_DATASET_ID,
+            BIGQUERY_TABLE_ID,
+            topic_id,
+            subscription_id,
+            ["FIRST_NAME", "EMAIL_ADDRESS", "PHONE_NUMBER"],
+            timeout=1,
+        )
+
+        out, _ = capsys.readouterr()
+        assert "Inspection operation started" in out
+        assert "Job name:" in out
+    finally:
+        delete_dlp_job(out)
diff --git a/dlp/snippets/jobs.py b/dlp/snippets/jobs.py
new file mode 100644
index 000000000000..4fcb2d13b4be
--- /dev/null
+++ b/dlp/snippets/jobs.py
@@ -0,0 +1,160 @@
+# Copyright 2023 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Sample app to list and delete DLP jobs using the Data Loss Prevention API."""
+
+from __future__ import print_function
+
+import argparse
+
+
+# [START dlp_list_jobs]
+def list_dlp_jobs(project, filter_string=None, job_type=None):
+    """Uses the Data Loss Prevention API to list DLP jobs that match the
+    specified filter in the request.
+    Args:
+        project: The project id to use as a parent resource.
+        filter_string: (Optional) Allows filtering.
+            Supported syntax:
+            * Filter expressions are made up of one or more restrictions.
+            * Restrictions can be combined by 'AND' or 'OR' logical operators.
+              A sequence of restrictions implicitly uses 'AND'.
+            * A restriction has the form of '<field> <operator> <value>'.
+ * Supported fields/values for inspect jobs: + - `state` - PENDING|RUNNING|CANCELED|FINISHED|FAILED + - `inspected_storage` - DATASTORE|CLOUD_STORAGE|BIGQUERY + - `trigger_name` - The resource name of the trigger that + created job. + * Supported fields for risk analysis jobs: + - `state` - RUNNING|CANCELED|FINISHED|FAILED + * The operator must be '=' or '!='. + Examples: + * inspected_storage = cloud_storage AND state = done + * inspected_storage = cloud_storage OR inspected_storage = bigquery + * inspected_storage = cloud_storage AND + (state = done OR state = canceled) + type: (Optional) The type of job. Defaults to 'INSPECT'. + Choices: + DLP_JOB_TYPE_UNSPECIFIED + INSPECT_JOB: The job inspected content for sensitive data. + RISK_ANALYSIS_JOB: The job executed a Risk Analysis computation. + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Job type dictionary + job_type_to_int = { + "DLP_JOB_TYPE_UNSPECIFIED": google.cloud.dlp.DlpJobType.DLP_JOB_TYPE_UNSPECIFIED, + "INSPECT_JOB": google.cloud.dlp.DlpJobType.INSPECT_JOB, + "RISK_ANALYSIS_JOB": google.cloud.dlp.DlpJobType.RISK_ANALYSIS_JOB, + } + # If job type is specified, convert job type to number through enums. + if job_type: + job_type = job_type_to_int[job_type] + + # Call the API to get a list of jobs. + response = dlp.list_dlp_jobs( + request={"parent": parent, "filter": filter_string, "type_": job_type} + ) + + # Iterate over results. + for job in response: + print("Job: %s; status: %s" % (job.name, job.state.name)) + + +# [END dlp_list_jobs] + + +# [START dlp_delete_job] +def delete_dlp_job(project, job_name): + """Uses the Data Loss Prevention API to delete a long-running DLP job. + Args: + project: The project id to use as a parent resource. + job_name: The name of the DlpJob resource to be deleted. + + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library. + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id and job name into a full resource id. + name = f"projects/{project}/dlpJobs/{job_name}" + + # Call the API to delete job. + dlp.delete_dlp_job(request={"name": name}) + + print("Successfully deleted %s" % job_name) + + +# [END dlp_delete_job] + + +if __name__ == "__main__": + parser = argparse.ArgumentParser(description=__doc__) + subparsers = parser.add_subparsers( + dest="content", help="Select how to submit content to the API." + ) + subparsers.required = True + + list_parser = subparsers.add_parser( + "list", + help="List Data Loss Prevention API jobs corresponding to a given " "filter.", + ) + list_parser.add_argument( + "project", help="The project id to use as a parent resource." + ) + list_parser.add_argument( + "-f", + "--filter", + help="Filter expressions are made up of one or more restrictions.", + ) + list_parser.add_argument( + "-t", + "--type", + choices=["DLP_JOB_TYPE_UNSPECIFIED", "INSPECT_JOB", "RISK_ANALYSIS_JOB"], + help='The type of job. API defaults to "INSPECT"', + ) + + delete_parser = subparsers.add_parser( + "delete", help="Delete results of a Data Loss Prevention API job." + ) + delete_parser.add_argument( + "project", help="The project id to use as a parent resource." 
+ ) + delete_parser.add_argument( + "job_name", + help="The name of the DlpJob resource to be deleted. " "Example: X-#####", + ) + + args = parser.parse_args() + + if args.content == "list": + list_dlp_jobs(args.project, filter_string=args.filter, job_type=args.type) + elif args.content == "delete": + delete_dlp_job(args.project, args.job_name) diff --git a/dlp/snippets/jobs_test.py b/dlp/snippets/jobs_test.py new file mode 100644 index 000000000000..22ec36460fce --- /dev/null +++ b/dlp/snippets/jobs_test.py @@ -0,0 +1,91 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the 'License'); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an 'AS IS' BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import uuid + +import pytest + +import jobs + +GCLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT") +TEST_COLUMN_NAME = "zip_code" +TEST_TABLE_PROJECT_ID = "bigquery-public-data" +TEST_DATASET_ID = "san_francisco" +TEST_TABLE_ID = "bikeshare_trips" +test_job_id = "test-job-{}".format(uuid.uuid4()) + + +@pytest.fixture(scope="module") +def test_job_name(): + import google.cloud.dlp + + dlp = google.cloud.dlp_v2.DlpServiceClient() + + parent = f"projects/{GCLOUD_PROJECT}" + + # Construct job request + risk_job = { + "privacy_metric": { + "categorical_stats_config": {"field": {"name": TEST_COLUMN_NAME}} + }, + "source_table": { + "project_id": TEST_TABLE_PROJECT_ID, + "dataset_id": TEST_DATASET_ID, + "table_id": TEST_TABLE_ID, + }, + } + + response = dlp.create_dlp_job( + request={"parent": parent, "risk_job": risk_job, "job_id": test_job_id} + ) + full_path = response.name + # API expects only job name, not full project path + job_name = full_path[full_path.rfind("/") + 1 :] + yield job_name + + # clean up job if not deleted + try: + dlp.delete_dlp_job(request={"name": full_path}) + except google.api_core.exceptions.NotFound: + print("Issue during teardown, missing job") + + +def test_list_dlp_jobs(test_job_name, capsys): + jobs.list_dlp_jobs(GCLOUD_PROJECT) + + out, _ = capsys.readouterr() + assert test_job_name not in out + + +def test_list_dlp_jobs_with_filter(test_job_name, capsys): + jobs.list_dlp_jobs( + GCLOUD_PROJECT, + filter_string="state=RUNNING OR state=DONE", + job_type="RISK_ANALYSIS_JOB", + ) + + out, _ = capsys.readouterr() + assert test_job_name in out + + +def test_list_dlp_jobs_with_job_type(test_job_name, capsys): + jobs.list_dlp_jobs(GCLOUD_PROJECT, job_type="INSPECT_JOB") + + out, _ = capsys.readouterr() + assert test_job_name not in out # job created is a risk analysis job + + +def test_delete_dlp_job(test_job_name, capsys): + jobs.delete_dlp_job(GCLOUD_PROJECT, test_job_name) diff --git a/dlp/snippets/metadata.py b/dlp/snippets/metadata.py new file mode 100644 index 000000000000..d5709eeb8156 --- /dev/null +++ b/dlp/snippets/metadata.py @@ -0,0 +1,72 @@ +# -*- coding: utf-8 -*- +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Sample app that queries the Data Loss Prevention API for supported +categories and info types.""" + +from __future__ import print_function + +import argparse + + +# [START dlp_list_info_types] +def list_info_types(language_code=None, result_filter=None): + """List types of sensitive information within a category. + Args: + language_code: The BCP-47 language code to use, e.g. 'en-US'. + filter: An optional filter to only return info types supported by + certain parts of the API. Defaults to "supported_by=INSPECT". + Returns: + None; the response from the API is printed to the terminal. + """ + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Make the API call. + response = dlp.list_info_types( + request={"parent": language_code, "filter": result_filter} + ) + + # Print the results to the console. + print("Info types:") + for info_type in response.info_types: + print( + "{name}: {display_name}".format( + name=info_type.name, display_name=info_type.display_name + ) + ) + + +# [END dlp_list_info_types] + + +if __name__ == "__main__": + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument( + "--language_code", + help="The BCP-47 language code to use, e.g. 'en-US'.", + ) + parser.add_argument( + "--filter", + help="An optional filter to only return info types supported by " + 'certain parts of the API. Defaults to "supported_by=INSPECT".', + ) + + args = parser.parse_args() + + list_info_types(language_code=args.language_code, result_filter=args.filter) diff --git a/dlp/snippets/metadata_test.py b/dlp/snippets/metadata_test.py new file mode 100644 index 000000000000..c06440cd3cb0 --- /dev/null +++ b/dlp/snippets/metadata_test.py @@ -0,0 +1,22 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the 'License'); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an 'AS IS' BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import metadata + + +def test_fetch_info_types(capsys): + metadata.list_info_types() + + out, _ = capsys.readouterr() + assert "EMAIL_ADDRESS" in out diff --git a/dlp/snippets/noxfile_config.py b/dlp/snippets/noxfile_config.py new file mode 100644 index 000000000000..1c2d85d16597 --- /dev/null +++ b/dlp/snippets/noxfile_config.py @@ -0,0 +1,42 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Default TEST_CONFIG_OVERRIDE for python repos. + +# You can copy this file into your directory, then it will be imported from +# the noxfile.py. + +# The source of truth: +# https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/noxfile_config.py + +TEST_CONFIG_OVERRIDE = { + # You can opt out from the test for specific Python versions. + "ignored_versions": ["2.7", "3.6"], + # Old samples are opted out of enforcing Python type hints + # All new samples should feature them + "enforce_type_hints": False, + # An envvar key for determining the project id to use. Change it + # to 'BUILD_SPECIFIC_GCLOUD_PROJECT' if you want to opt in using a + # build specific Cloud project. You can also use your own string + # to use your own Cloud project. + "gcloud_project_env": "GOOGLE_CLOUD_PROJECT", + # "gcloud_project_env": "BUILD_SPECIFIC_GCLOUD_PROJECT", + # If you need to use a specific version of pip, + # change pip_version_override to the string representation + # of the version number, for example, "20.2.4" + "pip_version_override": None, + # A dictionary you want to inject into your test. Don't put any + # secrets here. These values will override predefined values. + "envs": {}, +} diff --git a/dlp/snippets/quickstart.py b/dlp/snippets/quickstart.py new file mode 100644 index 000000000000..090b5bcc6324 --- /dev/null +++ b/dlp/snippets/quickstart.py @@ -0,0 +1,92 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Sample app that queries the Data Loss Prevention API for supported +categories and info types.""" + +from __future__ import print_function + +import argparse +import sys + + +def quickstart(project_id): + """Demonstrates use of the Data Loss Prevention API client library.""" + + # [START dlp_quickstart] + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp_client = google.cloud.dlp_v2.DlpServiceClient() + + # The string to inspect + content = "Robert Frost" + + # Construct the item to inspect. + item = {"value": content} + + # The info types to search for in the content. Required. + info_types = [{"name": "FIRST_NAME"}, {"name": "LAST_NAME"}] + + # The minimum likelihood to constitute a match. Optional. + min_likelihood = google.cloud.dlp_v2.Likelihood.LIKELIHOOD_UNSPECIFIED + + # The maximum number of findings to report (0 = server maximum). Optional. + max_findings = 0 + + # Whether to include the matching string in the results. Optional. + include_quote = True + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. 
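+    # For the values above, this is equivalent to:
+    #   {"info_types": [{"name": "FIRST_NAME"}, {"name": "LAST_NAME"}],
+    #    "min_likelihood": Likelihood.LIKELIHOOD_UNSPECIFIED,
+    #    "include_quote": True, "limits": {"max_findings_per_request": 0}}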
+    inspect_config = {
+        "info_types": info_types,
+        "min_likelihood": min_likelihood,
+        "include_quote": include_quote,
+        "limits": {"max_findings_per_request": max_findings},
+    }
+
+    # Convert the project id into a full resource id.
+    parent = f"projects/{project_id}"
+
+    # Call the API.
+    response = dlp_client.inspect_content(
+        request={"parent": parent, "inspect_config": inspect_config, "item": item}
+    )
+
+    # Print out the results.
+    if response.result.findings:
+        for finding in response.result.findings:
+            try:
+                print("Quote: {}".format(finding.quote))
+            except AttributeError:
+                pass
+            print("Info type: {}".format(finding.info_type.name))
+            # Convert likelihood value to string representation.
+            likelihood = finding.likelihood.name
+            print("Likelihood: {}".format(likelihood))
+    else:
+        print("No findings.")
+    # [END dlp_quickstart]
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument("project_id", help="Enter your GCP project id.", type=str)
+    if len(sys.argv) == 1:
+        parser.print_usage()
+        sys.exit(1)
+    args = parser.parse_args()
+    quickstart(args.project_id)
diff --git a/dlp/snippets/quickstart_test.py b/dlp/snippets/quickstart_test.py
new file mode 100644
index 000000000000..dc9f91a583d4
--- /dev/null
+++ b/dlp/snippets/quickstart_test.py
@@ -0,0 +1,27 @@
+# Copyright 2023 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the 'License');
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an 'AS IS' BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+
+import quickstart
+
+GCLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT")
+
+
+def test_quickstart(capsys):
+    quickstart.quickstart(GCLOUD_PROJECT)
+
+    out, _ = capsys.readouterr()
+    assert "FIRST_NAME" in out
+    assert "LAST_NAME" in out
diff --git a/dlp/snippets/redact.py b/dlp/snippets/redact.py
new file mode 100644
index 000000000000..09713c7217c6
--- /dev/null
+++ b/dlp/snippets/redact.py
@@ -0,0 +1,259 @@
+# Copyright 2023 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Sample app that uses the Data Loss Prevention API to redact the contents
+of an image file."""
+
+from __future__ import print_function
+
+import argparse
+
+# [START dlp_redact_image]
+import mimetypes
+
+# [END dlp_redact_image]
+import os
+
+# [START dlp_redact_image]
+
+
+def redact_image(
+    project,
+    filename,
+    output_filename,
+    info_types,
+    min_likelihood=None,
+    mime_type=None,
+):
+    """Uses the Data Loss Prevention API to redact protected data in an image.
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        filename: The path to the file to inspect.
+ output_filename: The path to which the redacted image will be written. + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + min_likelihood: A string representing the minimum likelihood threshold + that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED', + 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'. + mime_type: The MIME type of the file. If not specified, the type is + inferred via the Python standard library's mimetypes module. + Returns: + None; the response from the API is printed to the terminal. + """ + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). + info_types = [{"name": info_type} for info_type in info_types] + + # Prepare image_redaction_configs, a list of dictionaries. Each dictionary + # contains an info_type and optionally the color used for the replacement. + # The color is omitted in this sample, so the default (black) will be used. + image_redaction_configs = [] + + if info_types is not None: + for info_type in info_types: + image_redaction_configs.append({"info_type": info_type}) + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. + inspect_config = { + "min_likelihood": min_likelihood, + "info_types": info_types, + } + + # If mime_type is not specified, guess it from the filename. + if mime_type is None: + mime_guess = mimetypes.MimeTypes().guess_type(filename) + mime_type = mime_guess[0] or "application/octet-stream" + + # Select the content type index from the list of supported types. + supported_content_types = { + None: 0, # "Unspecified" + "image/jpeg": 1, + "image/bmp": 2, + "image/png": 3, + "image/svg": 4, + "text/plain": 5, + } + content_type_index = supported_content_types.get(mime_type, 0) + + # Construct the byte_item, containing the file's byte data. + with open(filename, mode="rb") as f: + byte_item = {"type_": content_type_index, "data": f.read()} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.redact_image( + request={ + "parent": parent, + "inspect_config": inspect_config, + "image_redaction_configs": image_redaction_configs, + "byte_item": byte_item, + } + ) + + # Write out the results. + with open(output_filename, mode="wb") as f: + f.write(response.redacted_image) + print( + "Wrote {byte_count} to {filename}".format( + byte_count=len(response.redacted_image), filename=output_filename + ) + ) + + +# [END dlp_redact_image] + +# [START dlp_redact_image_all_text] + + +def redact_image_all_text( + project, + filename, + output_filename, +): + """Uses the Data Loss Prevention API to redact all text in an image. + + Args: + project: The Google Cloud project id to use as a parent resource. + filename: The path to the file to inspect. + output_filename: The path to which the redacted image will be written. + + Returns: + None; the response from the API is printed to the terminal. + """ + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Construct the image_redaction_configs, indicating to DLP that all text in + # the input image should be redacted. 
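+    # Note: an ImageRedactionConfig may also carry a "redaction_color"
+    # (an RGB color message); it is left unset here, so the API falls back
+    # to its default (black) redaction boxes.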
+ image_redaction_configs = [{"redact_all_text": True}] + + # Construct the byte_item, containing the file's byte data. + with open(filename, mode="rb") as f: + byte_item = {"type_": google.cloud.dlp_v2.FileType.IMAGE, "data": f.read()} + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.redact_image( + request={ + "parent": parent, + "image_redaction_configs": image_redaction_configs, + "byte_item": byte_item, + } + ) + + # Write out the results. + with open(output_filename, mode="wb") as f: + f.write(response.redacted_image) + + print( + "Wrote {byte_count} to {filename}".format( + byte_count=len(response.redacted_image), filename=output_filename + ) + ) + + +# [END dlp_redact_image_all_text] + +if __name__ == "__main__": + default_project = os.environ.get("GOOGLE_CLOUD_PROJECT") + + common_args_parser = argparse.ArgumentParser(add_help=False) + common_args_parser.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + common_args_parser.add_argument("filename", help="The path to the file to inspect.") + common_args_parser.add_argument( + "output_filename", + help="The path to which the redacted image will be written.", + ) + + parser = argparse.ArgumentParser(description=__doc__) + subparsers = parser.add_subparsers( + dest="content", help="Select which content should be redacted." + ) + subparsers.required = True + + info_types_parser = subparsers.add_parser( + "info_types", + help="Redact specific infoTypes from an image.", + parents=[common_args_parser], + ) + info_types_parser.add_argument( + "--info_types", + nargs="+", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + info_types_parser.add_argument( + "--min_likelihood", + choices=[ + "LIKELIHOOD_UNSPECIFIED", + "VERY_UNLIKELY", + "UNLIKELY", + "POSSIBLE", + "LIKELY", + "VERY_LIKELY", + ], + help="A string representing the minimum likelihood threshold that " + "constitutes a match.", + ) + info_types_parser.add_argument( + "--mime_type", + help="The MIME type of the file. If not specified, the type is " + "inferred via the Python standard library's mimetypes module.", + ) + + all_text_parser = subparsers.add_parser( + "all_text", + help="Redact all text from an image. The MIME type of the file is " + "inferred via the Python standard library's mimetypes module.", + parents=[common_args_parser], + ) + + args = parser.parse_args() + + if args.content == "info_types": + redact_image( + args.project, + args.filename, + args.output_filename, + args.info_types, + min_likelihood=args.min_likelihood, + mime_type=args.mime_type, + ) + elif args.content == "all_text": + redact_image_all_text( + args.project, + args.filename, + args.output_filename, + ) diff --git a/dlp/snippets/redact_test.py b/dlp/snippets/redact_test.py new file mode 100644 index 000000000000..24ade2125456 --- /dev/null +++ b/dlp/snippets/redact_test.py @@ -0,0 +1,60 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the 'License'); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an 'AS IS' BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import shutil +import tempfile + +import pytest + +import redact + +GCLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT") +RESOURCE_DIRECTORY = os.path.join(os.path.dirname(__file__), "resources") + + +@pytest.fixture(scope="module") +def tempdir(): + tempdir = tempfile.mkdtemp() + yield tempdir + shutil.rmtree(tempdir) + + +def test_redact_image_file(tempdir, capsys): + test_filepath = os.path.join(RESOURCE_DIRECTORY, "test.png") + output_filepath = os.path.join(tempdir, "redacted.png") + + redact.redact_image( + GCLOUD_PROJECT, + test_filepath, + output_filepath, + ["FIRST_NAME", "EMAIL_ADDRESS"], + ) + + out, _ = capsys.readouterr() + assert output_filepath in out + + +def test_redact_image_all_text(tempdir, capsys): + test_filepath = os.path.join(RESOURCE_DIRECTORY, "test.png") + output_filepath = os.path.join(tempdir, "redacted.png") + + redact.redact_image_all_text( + GCLOUD_PROJECT, + test_filepath, + output_filepath, + ) + + out, _ = capsys.readouterr() + assert output_filepath in out diff --git a/dlp/snippets/requirements-test.txt b/dlp/snippets/requirements-test.txt new file mode 100644 index 000000000000..3275b420e033 --- /dev/null +++ b/dlp/snippets/requirements-test.txt @@ -0,0 +1,4 @@ +backoff==2.2.1 +pytest==7.2.1 +flaky==3.7.0 +mock==5.0.1 diff --git a/dlp/snippets/requirements.txt b/dlp/snippets/requirements.txt new file mode 100644 index 000000000000..a8368832f620 --- /dev/null +++ b/dlp/snippets/requirements.txt @@ -0,0 +1,5 @@ +google-cloud-dlp==3.11.1 +google-cloud-storage==2.7.0 +google-cloud-pubsub==2.14.1 +google-cloud-datastore==2.13.2 +google-cloud-bigquery==3.6.0 diff --git a/dlp/snippets/resources/accounts.txt b/dlp/snippets/resources/accounts.txt new file mode 100644 index 000000000000..2763cd0ab820 --- /dev/null +++ b/dlp/snippets/resources/accounts.txt @@ -0,0 +1 @@ +My credit card number is 1234 5678 9012 3456, and my CVV is 789. \ No newline at end of file diff --git a/dlp/snippets/resources/dates.csv b/dlp/snippets/resources/dates.csv new file mode 100644 index 000000000000..056fccb328ea --- /dev/null +++ b/dlp/snippets/resources/dates.csv @@ -0,0 +1,5 @@ +name,birth_date,register_date,credit_card +Ann,01/01/1970,07/21/1996,4532908762519852 +James,03/06/1988,04/09/2001,4301261899725540 +Dan,08/14/1945,11/15/2011,4620761856015295 +Laura,11/03/1992,01/04/2017,4564981067258901 \ No newline at end of file diff --git a/dlp/snippets/resources/harmless.txt b/dlp/snippets/resources/harmless.txt new file mode 100644 index 000000000000..5666de37ab23 --- /dev/null +++ b/dlp/snippets/resources/harmless.txt @@ -0,0 +1 @@ +This file is mostly harmless. 
diff --git a/dlp/snippets/resources/test.png b/dlp/snippets/resources/test.png
new file mode 100644
index 0000000000000000000000000000000000000000..8f32c825884261083b7d731676375303d49ca6f6
Binary files /dev/null and b/dlp/snippets/resources/test.png differ
diff --git a/dlp/snippets/resources/test.txt b/dlp/snippets/resources/test.txt
new file mode 100644
index 000000000000..c2ee3815bc9b
--- /dev/null
+++ b/dlp/snippets/resources/test.txt
@@ -0,0 +1 @@
+My phone number is (223) 456-7890 and my email address is gary@somedomain.com.
\ No newline at end of file
diff --git a/dlp/snippets/risk.py b/dlp/snippets/risk.py
new file mode 100644
index 000000000000..065cdc6bf236
--- /dev/null
+++ b/dlp/snippets/risk.py
@@ -0,0 +1,939 @@
+# Copyright 2023 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Sample app that uses the Data Loss Prevention API to perform risk
+analysis."""
+
+from __future__ import print_function
+
+import argparse
+
+
+# [START dlp_numerical_stats]
+def numerical_risk_analysis(
+    project,
+    table_project_id,
+    dataset_id,
+    table_id,
+    column_name,
+    topic_id,
+    subscription_id,
+    timeout=300,
+):
+    """Uses the Data Loss Prevention API to compute risk metrics of a column
+    of numerical data in a Google BigQuery table.
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        table_project_id: The Google Cloud project id where the BigQuery table
+            is stored.
+        dataset_id: The id of the dataset to inspect.
+        table_id: The id of the table to inspect.
+        column_name: The name of the column to compute risk metrics for.
+        topic_id: The name of the Pub/Sub topic to notify once the job
+            completes.
+        subscription_id: The name of the Pub/Sub subscription to use when
+            listening for job completion notifications.
+        timeout: The number of seconds to wait for a response from the API.
+
+    Returns:
+        None; the response from the API is printed to the terminal.
+    """
+    import concurrent.futures
+
+    # Import the client library.
+    import google.cloud.dlp
+
+    # This sample additionally uses Cloud Pub/Sub to receive results from
+    # potentially long-running operations.
+    import google.cloud.pubsub
+
+    # Instantiate a client.
+    dlp = google.cloud.dlp_v2.DlpServiceClient()
+
+    # Convert the project id into full resource ids.
+    topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id)
+    parent = f"projects/{project}/locations/global"
+
+    # Location info of the BigQuery table.
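+    # These three ids map onto the BigQueryTable message and together
+    # identify the table "<table_project_id>.<dataset_id>.<table_id>".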
+    source_table = {
+        "project_id": table_project_id,
+        "dataset_id": dataset_id,
+        "table_id": table_id,
+    }
+
+    # Tell the API where to send a notification when the job is complete.
+    actions = [{"pub_sub": {"topic": topic}}]
+
+    # Configure risk analysis job
+    # Give the name of the numeric column to compute risk metrics for
+    risk_job = {
+        "privacy_metric": {"numerical_stats_config": {"field": {"name": column_name}}},
+        "source_table": source_table,
+        "actions": actions,
+    }
+
+    # Call API to start risk analysis job
+    operation = dlp.create_dlp_job(request={"parent": parent, "risk_job": risk_job})
+
+    def callback(message):
+        if message.attributes["DlpJobName"] == operation.name:
+            # This is the message we're looking for, so acknowledge it.
+            message.ack()
+
+            # Now that the job is done, fetch the results and print them.
+            job = dlp.get_dlp_job(request={"name": operation.name})
+            print(f"Job name: {job.name}")
+            results = job.risk_details.numerical_stats_result
+            print(
+                "Value Range: [{}, {}]".format(
+                    results.min_value.integer_value,
+                    results.max_value.integer_value,
+                )
+            )
+            prev_value = None
+            # quantile_values holds the 1st through 99th percentile values,
+            # so start counting at 1.
+            for percent, result in enumerate(results.quantile_values, start=1):
+                value = result.integer_value
+                if prev_value != value:
+                    print("Value at {}% quantile: {}".format(percent, value))
+                    prev_value = value
+            subscription.set_result(None)
+        else:
+            # This is not the message we're looking for.
+            message.drop()
+
+    # Create a Pub/Sub client and find the subscription. The subscription is
+    # expected to already be listening to the topic.
+    subscriber = google.cloud.pubsub.SubscriberClient()
+    subscription_path = subscriber.subscription_path(project, subscription_id)
+    subscription = subscriber.subscribe(subscription_path, callback)
+
+    try:
+        subscription.result(timeout=timeout)
+    except concurrent.futures.TimeoutError:
+        print(
+            "No event received before the timeout. Please verify that the "
+            "subscription provided is subscribed to the topic provided."
+        )
+    subscription.close()
+
+
+# [END dlp_numerical_stats]
+
+
+# [START dlp_categorical_stats]
+def categorical_risk_analysis(
+    project,
+    table_project_id,
+    dataset_id,
+    table_id,
+    column_name,
+    topic_id,
+    subscription_id,
+    timeout=300,
+):
+    """Uses the Data Loss Prevention API to compute risk metrics of a column
+    of categorical data in a Google BigQuery table.
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        table_project_id: The Google Cloud project id where the BigQuery table
+            is stored.
+        dataset_id: The id of the dataset to inspect.
+        table_id: The id of the table to inspect.
+        column_name: The name of the column to compute risk metrics for.
+        topic_id: The name of the Pub/Sub topic to notify once the job
+            completes.
+        subscription_id: The name of the Pub/Sub subscription to use when
+            listening for job completion notifications.
+        timeout: The number of seconds to wait for a response from the API.
+
+    Returns:
+        None; the response from the API is printed to the terminal.
+    """
+    import concurrent.futures
+
+    # Import the client library.
+    import google.cloud.dlp
+
+    # This sample additionally uses Cloud Pub/Sub to receive results from
+    # potentially long-running operations.
+    import google.cloud.pubsub
+
+    # Instantiate a client.
+    dlp = google.cloud.dlp_v2.DlpServiceClient()
+
+    # Convert the project id into full resource ids.
+    topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id)
+    parent = f"projects/{project}/locations/global"
+
+    # Location info of the BigQuery table.
+    source_table = {
+        "project_id": table_project_id,
+        "dataset_id": dataset_id,
+        "table_id": table_id,
+    }
+
+    # Tell the API where to send a notification when the job is complete.
+    actions = [{"pub_sub": {"topic": topic}}]
+
+    # Configure risk analysis job
+    # Give the name of the categorical column to compute risk metrics for
+    risk_job = {
+        "privacy_metric": {
+            "categorical_stats_config": {"field": {"name": column_name}}
+        },
+        "source_table": source_table,
+        "actions": actions,
+    }
+
+    # Call API to start risk analysis job
+    operation = dlp.create_dlp_job(request={"parent": parent, "risk_job": risk_job})
+
+    def callback(message):
+        if message.attributes["DlpJobName"] == operation.name:
+            # This is the message we're looking for, so acknowledge it.
+            message.ack()
+
+            # Now that the job is done, fetch the results and print them.
+            job = dlp.get_dlp_job(request={"name": operation.name})
+            print(f"Job name: {job.name}")
+            histogram_buckets = (
+                job.risk_details.categorical_stats_result.value_frequency_histogram_buckets  # noqa: E501
+            )
+            # Print bucket stats
+            for i, bucket in enumerate(histogram_buckets):
+                print("Bucket {}:".format(i))
+                print(
+                    "   Most common value occurs {} time(s)".format(
+                        bucket.value_frequency_upper_bound
+                    )
+                )
+                print(
+                    "   Least common value occurs {} time(s)".format(
+                        bucket.value_frequency_lower_bound
+                    )
+                )
+                print("   {} unique values total.".format(bucket.bucket_size))
+                for value in bucket.bucket_values:
+                    print(
+                        "   Value {} occurs {} time(s)".format(
+                            value.value.integer_value, value.count
+                        )
+                    )
+            subscription.set_result(None)
+        else:
+            # This is not the message we're looking for.
+            message.drop()
+
+    # Create a Pub/Sub client and find the subscription. The subscription is
+    # expected to already be listening to the topic.
+    subscriber = google.cloud.pubsub.SubscriberClient()
+    subscription_path = subscriber.subscription_path(project, subscription_id)
+    subscription = subscriber.subscribe(subscription_path, callback)
+
+    try:
+        subscription.result(timeout=timeout)
+    except concurrent.futures.TimeoutError:
+        print(
+            "No event received before the timeout. Please verify that the "
+            "subscription provided is subscribed to the topic provided."
+        )
+    subscription.close()
+
+
+# [END dlp_categorical_stats]
+
+
+# [START dlp_k_anonymity]
+def k_anonymity_analysis(
+    project,
+    table_project_id,
+    dataset_id,
+    table_id,
+    topic_id,
+    subscription_id,
+    quasi_ids,
+    timeout=300,
+):
+    """Uses the Data Loss Prevention API to compute the k-anonymity of a
+    column set in a Google BigQuery table.
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        table_project_id: The Google Cloud project id where the BigQuery table
+            is stored.
+        dataset_id: The id of the dataset to inspect.
+        table_id: The id of the table to inspect.
+        topic_id: The name of the Pub/Sub topic to notify once the job
+            completes.
+        subscription_id: The name of the Pub/Sub subscription to use when
+            listening for job completion notifications.
+        quasi_ids: A set of columns that form a composite key.
+        timeout: The number of seconds to wait for a response from the API.
+
+    Returns:
+        None; the response from the API is printed to the terminal.
+    """
+    import concurrent.futures
+
+    # Import the client library.
+    import google.cloud.dlp
+
+    # This sample additionally uses Cloud Pub/Sub to receive results from
+    # potentially long-running operations.
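+    # The completion notification published by DLP carries a "DlpJobName"
+    # message attribute; the callback below uses it to match notifications
+    # to the job started here.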
+    import google.cloud.pubsub
+
+    # Create helper function for unpacking values
+    def get_values(obj):
+        return int(obj.integer_value)
+
+    # Instantiate a client.
+    dlp = google.cloud.dlp_v2.DlpServiceClient()
+
+    # Convert the project id into a full resource id.
+    topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id)
+    parent = f"projects/{project}/locations/global"
+
+    # Location info of the BigQuery table.
+    source_table = {
+        "project_id": table_project_id,
+        "dataset_id": dataset_id,
+        "table_id": table_id,
+    }
+
+    # Convert quasi id list to Protobuf type
+    def map_fields(field):
+        return {"name": field}
+
+    quasi_ids = map(map_fields, quasi_ids)
+
+    # Tell the API where to send a notification when the job is complete.
+    actions = [{"pub_sub": {"topic": topic}}]
+
+    # Configure risk analysis job
+    # Give the names of the quasi-identifier columns to compute k-anonymity over
+    risk_job = {
+        "privacy_metric": {"k_anonymity_config": {"quasi_ids": quasi_ids}},
+        "source_table": source_table,
+        "actions": actions,
+    }
+
+    # Call API to start risk analysis job
+    operation = dlp.create_dlp_job(request={"parent": parent, "risk_job": risk_job})
+
+    def callback(message):
+        if message.attributes["DlpJobName"] == operation.name:
+            # This is the message we're looking for, so acknowledge it.
+            message.ack()
+
+            # Now that the job is done, fetch the results and print them.
+            job = dlp.get_dlp_job(request={"name": operation.name})
+            print(f"Job name: {job.name}")
+            histogram_buckets = (
+                job.risk_details.k_anonymity_result.equivalence_class_histogram_buckets
+            )
+            # Print bucket stats
+            for i, bucket in enumerate(histogram_buckets):
+                print("Bucket {}:".format(i))
+                if bucket.equivalence_class_size_lower_bound:
+                    print(
+                        "   Bucket size range: [{}, {}]".format(
+                            bucket.equivalence_class_size_lower_bound,
+                            bucket.equivalence_class_size_upper_bound,
+                        )
+                    )
+                    for value_bucket in bucket.bucket_values:
+                        print(
+                            "   Quasi-ID values: {}".format(
+                                list(map(get_values, value_bucket.quasi_ids_values))
+                            )
+                        )
+                        print(
+                            "   Class size: {}".format(
+                                value_bucket.equivalence_class_size
+                            )
+                        )
+            subscription.set_result(None)
+        else:
+            # This is not the message we're looking for.
+            message.drop()
+
+    # Create a Pub/Sub client and find the subscription. The subscription is
+    # expected to already be listening to the topic.
+    subscriber = google.cloud.pubsub.SubscriberClient()
+    subscription_path = subscriber.subscription_path(project, subscription_id)
+    subscription = subscriber.subscribe(subscription_path, callback)
+
+    try:
+        subscription.result(timeout=timeout)
+    except concurrent.futures.TimeoutError:
+        print(
+            "No event received before the timeout. Please verify that the "
+            "subscription provided is subscribed to the topic provided."
+        )
+    subscription.close()
+
+
+# [END dlp_k_anonymity]
+
+
+# [START dlp_l_diversity]
+def l_diversity_analysis(
+    project,
+    table_project_id,
+    dataset_id,
+    table_id,
+    topic_id,
+    subscription_id,
+    sensitive_attribute,
+    quasi_ids,
+    timeout=300,
+):
+    """Uses the Data Loss Prevention API to compute the l-diversity of a
+    column set in a Google BigQuery table.
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        table_project_id: The Google Cloud project id where the BigQuery table
+            is stored.
+        dataset_id: The id of the dataset to inspect.
+        table_id: The id of the table to inspect.
+        topic_id: The name of the Pub/Sub topic to notify once the job
+            completes.
+        subscription_id: The name of the Pub/Sub subscription to use when
+            listening for job completion notifications.
+        sensitive_attribute: The column to measure l-diversity relative to.
+        quasi_ids: A set of columns that form a composite key.
+        timeout: The number of seconds to wait for a response from the API.
+
+    Returns:
+        None; the response from the API is printed to the terminal.
+    """
+    import concurrent.futures
+
+    # Import the client library.
+    import google.cloud.dlp
+
+    # This sample additionally uses Cloud Pub/Sub to receive results from
+    # potentially long-running operations.
+    import google.cloud.pubsub
+
+    # Create helper function for unpacking values
+    def get_values(obj):
+        return int(obj.integer_value)
+
+    # Instantiate a client.
+    dlp = google.cloud.dlp_v2.DlpServiceClient()
+
+    # Convert the project id into a full resource id.
+    topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id)
+    parent = f"projects/{project}/locations/global"
+
+    # Location info of the BigQuery table.
+    source_table = {
+        "project_id": table_project_id,
+        "dataset_id": dataset_id,
+        "table_id": table_id,
+    }
+
+    # Convert quasi id list to Protobuf type
+    def map_fields(field):
+        return {"name": field}
+
+    quasi_ids = map(map_fields, quasi_ids)
+
+    # Tell the API where to send a notification when the job is complete.
+    actions = [{"pub_sub": {"topic": topic}}]
+
+    # Configure risk analysis job
+    # Give the names of the quasi-identifier columns and the sensitive
+    # attribute to compute l-diversity for
+    risk_job = {
+        "privacy_metric": {
+            "l_diversity_config": {
+                "quasi_ids": quasi_ids,
+                "sensitive_attribute": {"name": sensitive_attribute},
+            }
+        },
+        "source_table": source_table,
+        "actions": actions,
+    }
+
+    # Call API to start risk analysis job
+    operation = dlp.create_dlp_job(request={"parent": parent, "risk_job": risk_job})
+
+    def callback(message):
+        if message.attributes["DlpJobName"] == operation.name:
+            # This is the message we're looking for, so acknowledge it.
+            message.ack()
+
+            # Now that the job is done, fetch the results and print them.
+            job = dlp.get_dlp_job(request={"name": operation.name})
+            print(f"Job name: {job.name}")
+            histogram_buckets = (
+                job.risk_details.l_diversity_result.sensitive_value_frequency_histogram_buckets  # noqa: E501
+            )
+            # Print bucket stats
+            for i, bucket in enumerate(histogram_buckets):
+                print("Bucket {}:".format(i))
+                print(
+                    "   Bucket size range: [{}, {}]".format(
+                        bucket.sensitive_value_frequency_lower_bound,
+                        bucket.sensitive_value_frequency_upper_bound,
+                    )
+                )
+                for value_bucket in bucket.bucket_values:
+                    print(
+                        "   Quasi-ID values: {}".format(
+                            list(map(get_values, value_bucket.quasi_ids_values))
+                        )
+                    )
+                    print(
+                        "   Class size: {}".format(value_bucket.equivalence_class_size)
+                    )
+                    for value in value_bucket.top_sensitive_values:
+                        print(
+                            "   Sensitive value {} occurs {} time(s)".format(
+                                value.value, value.count
+                            )
+                        )
+            subscription.set_result(None)
+        else:
+            # This is not the message we're looking for.
+            message.drop()
+
+    # Create a Pub/Sub client and find the subscription. The subscription is
+    # expected to already be listening to the topic.
+    subscriber = google.cloud.pubsub.SubscriberClient()
+    subscription_path = subscriber.subscription_path(project, subscription_id)
+    subscription = subscriber.subscribe(subscription_path, callback)
+
+    try:
+        subscription.result(timeout=timeout)
+    except concurrent.futures.TimeoutError:
+        print(
+            "No event received before the timeout. Please verify that the "
+            "subscription provided is subscribed to the topic provided."
+        )
+    subscription.close()
+
+
+# [END dlp_l_diversity]
+
+
+# [START dlp_k_map]
+def k_map_estimate_analysis(
+    project,
+    table_project_id,
+    dataset_id,
+    table_id,
+    topic_id,
+    subscription_id,
+    quasi_ids,
+    info_types,
+    region_code="US",
+    timeout=300,
+):
+    """Uses the Data Loss Prevention API to compute the k-map risk estimation
+    of a column set in a Google BigQuery table.
+    Args:
+        project: The Google Cloud project id to use as a parent resource.
+        table_project_id: The Google Cloud project id where the BigQuery table
+            is stored.
+        dataset_id: The id of the dataset to inspect.
+        table_id: The id of the table to inspect.
+        topic_id: The name of the Pub/Sub topic to notify once the job
+            completes.
+        subscription_id: The name of the Pub/Sub subscription to use when
+            listening for job completion notifications.
+        quasi_ids: A set of columns that form a composite key and optionally
+            their reidentification distributions.
+        info_types: Type of information of the quasi_id in order to provide a
+            statistical model of population.
+        region_code: The ISO 3166-1 region code that the data is
+            representative of. Can be omitted if using a region-specific
+            infoType (such as US_ZIP_5).
+        timeout: The number of seconds to wait for a response from the API.
+
+    Returns:
+        None; the response from the API is printed to the terminal.
+    """
+    import concurrent.futures
+
+    # Import the client library.
+    import google.cloud.dlp
+
+    # This sample additionally uses Cloud Pub/Sub to receive results from
+    # potentially long-running operations.
+    import google.cloud.pubsub
+
+    # Create helper function for unpacking values
+    def get_values(obj):
+        return int(obj.integer_value)
+
+    # Instantiate a client.
+    dlp = google.cloud.dlp_v2.DlpServiceClient()
+
+    # Convert the project id into full resource ids.
+    topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id)
+    parent = f"projects/{project}/locations/global"
+
+    # Location info of the BigQuery table.
+    source_table = {
+        "project_id": table_project_id,
+        "dataset_id": dataset_id,
+        "table_id": table_id,
+    }
+
+    # Check that numbers of quasi-ids and info types are equal
+    if len(quasi_ids) != len(info_types):
+        raise ValueError(
+            """Number of infoTypes and number of quasi-identifiers
+            must be equal!"""
+        )
+
+    # Convert quasi id list to Protobuf type
+    def map_fields(quasi_id, info_type):
+        return {"field": {"name": quasi_id}, "info_type": {"name": info_type}}
+
+    quasi_ids = map(map_fields, quasi_ids, info_types)
+
+    # Tell the API where to send a notification when the job is complete.
+    actions = [{"pub_sub": {"topic": topic}}]
+
+    # Configure risk analysis job
+    # Pair each quasi-identifier with its info type, plus the region code
+    # used to model the attack population
+    risk_job = {
+        "privacy_metric": {
+            "k_map_estimation_config": {
+                "quasi_ids": quasi_ids,
+                "region_code": region_code,
+            }
+        },
+        "source_table": source_table,
+        "actions": actions,
+    }
+
+    # Call API to start risk analysis job
+    operation = dlp.create_dlp_job(request={"parent": parent, "risk_job": risk_job})
+
+    def callback(message):
+        if message.attributes["DlpJobName"] == operation.name:
+            # This is the message we're looking for, so acknowledge it.
+            message.ack()
+
+            # Now that the job is done, fetch the results and print them.
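+            # Each histogram bucket below groups quasi-id combinations by
+            # their estimated anonymity; a lower anonymity range means those
+            # rows are easier to re-identify against the modeled population.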
+            job = dlp.get_dlp_job(request={"name": operation.name})
+            print(f"Job name: {job.name}")
+            histogram_buckets = (
+                job.risk_details.k_map_estimation_result.k_map_estimation_histogram
+            )
+            # Print bucket stats
+            for i, bucket in enumerate(histogram_buckets):
+                print("Bucket {}:".format(i))
+                print(
+                    "   Anonymity range: [{}, {}]".format(
+                        bucket.min_anonymity, bucket.max_anonymity
+                    )
+                )
+                print("   Size: {}".format(bucket.bucket_size))
+                for value_bucket in bucket.bucket_values:
+                    print(
+                        "   Values: {}".format(
+                            list(map(get_values, value_bucket.quasi_ids_values))
+                        )
+                    )
+                    print(
+                        "   Estimated k-map anonymity: {}".format(
+                            value_bucket.estimated_anonymity
+                        )
+                    )
+            subscription.set_result(None)
+        else:
+            # This is not the message we're looking for.
+            message.drop()
+
+    # Create a Pub/Sub client and find the subscription. The subscription is
+    # expected to already be listening to the topic.
+    subscriber = google.cloud.pubsub.SubscriberClient()
+    subscription_path = subscriber.subscription_path(project, subscription_id)
+    subscription = subscriber.subscribe(subscription_path, callback)
+
+    try:
+        subscription.result(timeout=timeout)
+    except concurrent.futures.TimeoutError:
+        print(
+            "No event received before the timeout. Please verify that the "
+            "subscription provided is subscribed to the topic provided."
+        )
+    subscription.close()
+
+
+# [END dlp_k_map]
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description=__doc__)
+    subparsers = parser.add_subparsers(
+        dest="content", help="Select how to submit content to the API."
+    )
+    subparsers.required = True
+
+    numerical_parser = subparsers.add_parser(
+        "numerical",
+        help="Computes risk metrics of a numerical column in a Google "
+        "BigQuery table.",
+    )
+    numerical_parser.add_argument(
+        "project",
+        help="The Google Cloud project id to use as a parent resource.",
+    )
+    numerical_parser.add_argument(
+        "table_project_id",
+        help="The Google Cloud project id where the BigQuery table is stored.",
+    )
+    numerical_parser.add_argument(
+        "dataset_id", help="The id of the dataset to inspect."
+    )
+    numerical_parser.add_argument("table_id", help="The id of the table to inspect.")
+    numerical_parser.add_argument(
+        "column_name",
+        help="The name of the column to compute risk metrics for.",
+    )
+    numerical_parser.add_argument(
+        "topic_id",
+        help="The name of the Pub/Sub topic to notify once the job completes.",
+    )
+    numerical_parser.add_argument(
+        "subscription_id",
+        help="The name of the Pub/Sub subscription to use when listening for "
+        "job completion notifications.",
+    )
+    numerical_parser.add_argument(
+        "--timeout",
+        type=int,
+        help="The number of seconds to wait for a response from the API.",
+    )
+
+    categorical_parser = subparsers.add_parser(
+        "categorical",
+        help="Computes risk metrics of a categorical column in a Google "
+        "BigQuery table.",
+    )
+    categorical_parser.add_argument(
+        "project",
+        help="The Google Cloud project id to use as a parent resource.",
+    )
+    categorical_parser.add_argument(
+        "table_project_id",
+        help="The Google Cloud project id where the BigQuery table is stored.",
+    )
+    categorical_parser.add_argument(
+        "dataset_id", help="The id of the dataset to inspect."
+ ) + categorical_parser.add_argument("table_id", help="The id of the table to inspect.") + categorical_parser.add_argument( + "column_name", + help="The name of the column to compute risk metrics for.", + ) + categorical_parser.add_argument( + "topic_id", + help="The name of the Pub/Sub topic to notify once the job completes.", + ) + categorical_parser.add_argument( + "subscription_id", + help="The name of the Pub/Sub subscription to use when listening for" + "job completion notifications.", + ) + categorical_parser.add_argument( + "--timeout", + type=int, + help="The number of seconds to wait for a response from the API.", + ) + + k_anonymity_parser = subparsers.add_parser( + "k_anonymity", + help="Computes the k-anonymity of a column set in a Google BigQuery" "table.", + ) + k_anonymity_parser.add_argument( + "project", + help="The Google Cloud project id to use as a parent resource.", + ) + k_anonymity_parser.add_argument( + "table_project_id", + help="The Google Cloud project id where the BigQuery table is stored.", + ) + k_anonymity_parser.add_argument( + "dataset_id", help="The id of the dataset to inspect." + ) + k_anonymity_parser.add_argument("table_id", help="The id of the table to inspect.") + k_anonymity_parser.add_argument( + "topic_id", + help="The name of the Pub/Sub topic to notify once the job completes.", + ) + k_anonymity_parser.add_argument( + "subscription_id", + help="The name of the Pub/Sub subscription to use when listening for" + "job completion notifications.", + ) + k_anonymity_parser.add_argument( + "quasi_ids", + nargs="+", + help="A set of columns that form a composite key.", + ) + k_anonymity_parser.add_argument( + "--timeout", + type=int, + help="The number of seconds to wait for a response from the API.", + ) + + l_diversity_parser = subparsers.add_parser( + "l_diversity", + help="Computes the l-diversity of a column set in a Google BigQuery" "table.", + ) + l_diversity_parser.add_argument( + "project", + help="The Google Cloud project id to use as a parent resource.", + ) + l_diversity_parser.add_argument( + "table_project_id", + help="The Google Cloud project id where the BigQuery table is stored.", + ) + l_diversity_parser.add_argument( + "dataset_id", help="The id of the dataset to inspect." 
+ ) + l_diversity_parser.add_argument("table_id", help="The id of the table to inspect.") + l_diversity_parser.add_argument( + "topic_id", + help="The name of the Pub/Sub topic to notify once the job completes.", + ) + l_diversity_parser.add_argument( + "subscription_id", + help="The name of the Pub/Sub subscription to use when listening for" + "job completion notifications.", + ) + l_diversity_parser.add_argument( + "sensitive_attribute", + help="The column to measure l-diversity relative to.", + ) + l_diversity_parser.add_argument( + "quasi_ids", + nargs="+", + help="A set of columns that form a composite key.", + ) + l_diversity_parser.add_argument( + "--timeout", + type=int, + help="The number of seconds to wait for a response from the API.", + ) + + k_map_parser = subparsers.add_parser( + "k_map", + help="Computes the k-map risk estimation of a column set in a Google" + "BigQuery table.", + ) + k_map_parser.add_argument( + "project", + help="The Google Cloud project id to use as a parent resource.", + ) + k_map_parser.add_argument( + "table_project_id", + help="The Google Cloud project id where the BigQuery table is stored.", + ) + k_map_parser.add_argument("dataset_id", help="The id of the dataset to inspect.") + k_map_parser.add_argument("table_id", help="The id of the table to inspect.") + k_map_parser.add_argument( + "topic_id", + help="The name of the Pub/Sub topic to notify once the job completes.", + ) + k_map_parser.add_argument( + "subscription_id", + help="The name of the Pub/Sub subscription to use when listening for" + "job completion notifications.", + ) + k_map_parser.add_argument( + "quasi_ids", + nargs="+", + help="A set of columns that form a composite key.", + ) + k_map_parser.add_argument( + "-t", + "--info-types", + nargs="+", + help="Type of information of the quasi_id in order to provide a" + "statistical model of population.", + required=True, + ) + k_map_parser.add_argument( + "-r", + "--region-code", + default="US", + help="The ISO 3166-1 region code that the data is representative of.", + ) + k_map_parser.add_argument( + "--timeout", + type=int, + help="The number of seconds to wait for a response from the API.", + ) + + args = parser.parse_args() + + if args.content == "numerical": + numerical_risk_analysis( + args.project, + args.table_project_id, + args.dataset_id, + args.table_id, + args.column_name, + args.topic_id, + args.subscription_id, + timeout=args.timeout, + ) + elif args.content == "categorical": + categorical_risk_analysis( + args.project, + args.table_project_id, + args.dataset_id, + args.table_id, + args.column_name, + args.topic_id, + args.subscription_id, + timeout=args.timeout, + ) + elif args.content == "k_anonymity": + k_anonymity_analysis( + args.project, + args.table_project_id, + args.dataset_id, + args.table_id, + args.topic_id, + args.subscription_id, + args.quasi_ids, + timeout=args.timeout, + ) + elif args.content == "l_diversity": + l_diversity_analysis( + args.project, + args.table_project_id, + args.dataset_id, + args.table_id, + args.topic_id, + args.subscription_id, + args.sensitive_attribute, + args.quasi_ids, + timeout=args.timeout, + ) + elif args.content == "k_map": + k_map_estimate_analysis( + args.project, + args.table_project_id, + args.dataset_id, + args.table_id, + args.topic_id, + args.subscription_id, + args.quasi_ids, + args.info_types, + region_code=args.region_code, + timeout=args.timeout, + ) diff --git a/dlp/snippets/risk_test.py b/dlp/snippets/risk_test.py new file mode 100644 index 000000000000..cbc596122743 
--- /dev/null +++ b/dlp/snippets/risk_test.py @@ -0,0 +1,398 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the 'License'); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an 'AS IS' BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import uuid + +import google.api_core.exceptions +import google.cloud.bigquery +import google.cloud.dlp_v2 +import google.cloud.pubsub +import pytest + +import risk + +UNIQUE_STRING = str(uuid.uuid4()).split("-")[0] +GCLOUD_PROJECT = os.environ.get("GOOGLE_CLOUD_PROJECT") +TABLE_PROJECT = os.environ.get("GOOGLE_CLOUD_PROJECT") +TOPIC_ID = "dlp-test" + UNIQUE_STRING +SUBSCRIPTION_ID = "dlp-test-subscription" + UNIQUE_STRING +UNIQUE_FIELD = "Name" +REPEATED_FIELD = "Mystery" +NUMERIC_FIELD = "Age" +STRING_BOOLEAN_FIELD = "Gender" + +BIGQUERY_DATASET_ID = "dlp_test_dataset" + UNIQUE_STRING +BIGQUERY_TABLE_ID = "dlp_test_table" + UNIQUE_STRING +BIGQUERY_HARMFUL_TABLE_ID = "harmful" + UNIQUE_STRING +DLP_CLIENT = google.cloud.dlp_v2.DlpServiceClient() + + +# Create a new custom topic/subscription. +# We have observed that sometimes every test in this file fails, possibly +# because the DLP service loses its connection to the topic, so the Pub/Sub +# fixtures below create a fresh topic and subscription for each test module. +@pytest.fixture(scope="module") +def topic_id(): + # Creates a Pub/Sub topic, and tears it down. + publisher = google.cloud.pubsub.PublisherClient() + topic_path = publisher.topic_path(GCLOUD_PROJECT, TOPIC_ID) + try: + publisher.create_topic(request={"name": topic_path}) + except google.api_core.exceptions.AlreadyExists: + pass + + yield TOPIC_ID + + publisher.delete_topic(request={"topic": topic_path}) + + +@pytest.fixture(scope="module") +def subscription_id(topic_id): + # Creates a Pub/Sub subscription to the topic, and tears it down. + subscriber = google.cloud.pubsub.SubscriberClient() + topic_path = subscriber.topic_path(GCLOUD_PROJECT, topic_id) + subscription_path = subscriber.subscription_path(GCLOUD_PROJECT, SUBSCRIPTION_ID) + try: + subscriber.create_subscription( + request={"name": subscription_path, "topic": topic_path} + ) + except google.api_core.exceptions.AlreadyExists: + pass + + yield SUBSCRIPTION_ID + + subscriber.delete_subscription(request={"subscription": subscription_path}) + + +@pytest.fixture(scope="module") +def bigquery_project(): + # Adds test BigQuery data, yields the project ID, and then tears down.
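+ # Two tables are seeded below: a small, harmless table (Name/Comment) and + # a "harmful" table of fake PII that the risk analysis tests inspect.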
+ + bigquery_client = google.cloud.bigquery.Client() + + dataset_ref = bigquery_client.dataset(BIGQUERY_DATASET_ID) + dataset = google.cloud.bigquery.Dataset(dataset_ref) + try: + dataset = bigquery_client.create_dataset(dataset) + except google.api_core.exceptions.Conflict: + dataset = bigquery_client.get_dataset(dataset) + table_ref = dataset_ref.table(BIGQUERY_TABLE_ID) + table = google.cloud.bigquery.Table(table_ref) + + harmful_table_ref = dataset_ref.table(BIGQUERY_HARMFUL_TABLE_ID) + harmful_table = google.cloud.bigquery.Table(harmful_table_ref) + + table.schema = ( + google.cloud.bigquery.SchemaField("Name", "STRING"), + google.cloud.bigquery.SchemaField("Comment", "STRING"), + ) + + harmful_table.schema = ( + google.cloud.bigquery.SchemaField("Name", "STRING", "REQUIRED"), + google.cloud.bigquery.SchemaField("TelephoneNumber", "STRING", "REQUIRED"), + google.cloud.bigquery.SchemaField("Mystery", "STRING", "REQUIRED"), + google.cloud.bigquery.SchemaField("Age", "INTEGER", "REQUIRED"), + google.cloud.bigquery.SchemaField("Gender", "STRING"), + google.cloud.bigquery.SchemaField("RegionCode", "STRING"), + ) + + try: + table = bigquery_client.create_table(table) + except google.api_core.exceptions.Conflict: + table = bigquery_client.get_table(table) + + try: + harmful_table = bigquery_client.create_table(harmful_table) + except google.api_core.exceptions.Conflict: + harmful_table = bigquery_client.get_table(harmful_table) + + rows_to_insert = [("Gary Smith", "My email is gary@example.com")] + harmful_rows_to_insert = [ + ( + "Gandalf", + "(123) 456-7890", + "4231 5555 6781 9876", + 27, + "Male", + "US", + ), + ( + "Dumbledore", + "(313) 337-1337", + "6291 8765 1095 7629", + 27, + "Male", + "US", + ), + ("Joe", "(452) 123-1234", "3782 2288 1166 3030", 35, "Male", "US"), + ("James", "(567) 890-1234", "8291 3627 8250 1234", 19, "Male", "US"), + ( + "Marie", + "(452) 123-1234", + "8291 3627 8250 1234", + 35, + "Female", + "US", + ), + ( + "Carrie", + "(567) 890-1234", + "2253 5218 4251 4526", + 35, + "Female", + "US", + ), + ] + + bigquery_client.insert_rows(table, rows_to_insert) + bigquery_client.insert_rows(harmful_table, harmful_rows_to_insert) + yield GCLOUD_PROJECT + + bigquery_client.delete_dataset(dataset_ref, delete_contents=True) + + +@pytest.mark.flaky(max_runs=3, min_passes=1) +def test_numerical_risk_analysis(topic_id, subscription_id, bigquery_project, capsys): + risk.numerical_risk_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + NUMERIC_FIELD, + topic_id, + subscription_id, + ) + + out, _ = capsys.readouterr() + assert "Value Range:" in out + assert "Job name:" in out + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=3, min_passes=1) +def test_categorical_risk_analysis_on_string_field( + topic_id, subscription_id, bigquery_project, capsys +): + risk.categorical_risk_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + UNIQUE_FIELD, + topic_id, + subscription_id, + ) + + out, _ = capsys.readouterr() + assert "Most common value occurs" in out + assert "Job name:" in out + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=3, min_passes=1) +def test_categorical_risk_analysis_on_number_field( + topic_id, subscription_id, 
bigquery_project, capsys +): + risk.categorical_risk_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + NUMERIC_FIELD, + topic_id, + subscription_id, + ) + + out, _ = capsys.readouterr() + assert "Most common value occurs" in out + assert "Job name:" in out + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=3, min_passes=1) +def test_k_anonymity_analysis_single_field( + topic_id, subscription_id, bigquery_project, capsys +): + risk.k_anonymity_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + topic_id, + subscription_id, + [NUMERIC_FIELD], + ) + + out, _ = capsys.readouterr() + assert "Quasi-ID values:" in out + assert "Class size:" in out + assert "Job name:" in out + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=3, min_passes=1) +def test_k_anonymity_analysis_multiple_fields( + topic_id, subscription_id, bigquery_project, capsys +): + risk.k_anonymity_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + topic_id, + subscription_id, + [NUMERIC_FIELD, REPEATED_FIELD], + ) + + out, _ = capsys.readouterr() + assert "Quasi-ID values:" in out + assert "Class size:" in out + assert "Job name:" in out + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=3, min_passes=1) +def test_l_diversity_analysis_single_field( + topic_id, subscription_id, bigquery_project, capsys +): + risk.l_diversity_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + topic_id, + subscription_id, + UNIQUE_FIELD, + [NUMERIC_FIELD], + ) + + out, _ = capsys.readouterr() + assert "Quasi-ID values:" in out + assert "Class size:" in out + assert "Sensitive value" in out + assert "Job name:" in out + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=3, min_passes=1) +def test_l_diversity_analysis_multiple_field( + topic_id, subscription_id, bigquery_project, capsys +): + risk.l_diversity_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + topic_id, + subscription_id, + UNIQUE_FIELD, + [NUMERIC_FIELD, REPEATED_FIELD], + ) + + out, _ = capsys.readouterr() + assert "Quasi-ID values:" in out + assert "Class size:" in out + assert "Sensitive value" in out + assert "Job name:" in out + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=3, min_passes=1) +def test_k_map_estimate_analysis_single_field( + topic_id, subscription_id, bigquery_project, capsys +): + risk.k_map_estimate_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + topic_id, + subscription_id, + [NUMERIC_FIELD], + ["AGE"], + ) + + out, _ = capsys.readouterr() + assert "Anonymity range:" in out + assert "Size:" in out + assert "Values" in out + assert "Job name:" in out + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + 
DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=5, min_passes=1) +def test_k_map_estimate_analysis_multiple_field( + topic_id, subscription_id, bigquery_project, capsys +): + risk.k_map_estimate_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + topic_id, + subscription_id, + [NUMERIC_FIELD, STRING_BOOLEAN_FIELD], + ["AGE", "GENDER"], + ) + + out, _ = capsys.readouterr() + assert "Anonymity range:" in out + assert "Size:" in out + assert "Values" in out + assert "Job name:" in out + for line in str(out).split("\n"): + if "Job name" in line: + job_name = line.split(":")[1].strip() + DLP_CLIENT.delete_dlp_job(name=job_name) + + +@pytest.mark.flaky(max_runs=3, min_passes=1) +def test_k_map_estimate_analysis_quasi_ids_info_types_equal( + topic_id, subscription_id, bigquery_project +): + with pytest.raises(ValueError): + risk.k_map_estimate_analysis( + GCLOUD_PROJECT, + TABLE_PROJECT, + BIGQUERY_DATASET_ID, + BIGQUERY_HARMFUL_TABLE_ID, + topic_id, + subscription_id, + [NUMERIC_FIELD, STRING_BOOLEAN_FIELD], + ["AGE"], + ) diff --git a/dlp/snippets/templates.py b/dlp/snippets/templates.py new file mode 100644 index 000000000000..6c618a0a7493 --- /dev/null +++ b/dlp/snippets/templates.py @@ -0,0 +1,255 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Sample app that sets up Data Loss Prevention API inspect templates.""" + +from __future__ import print_function + +import argparse +import os + + +# [START dlp_create_inspect_template] +def create_inspect_template( + project, + info_types, + template_id=None, + display_name=None, + min_likelihood=None, + max_findings=None, + include_quote=None, +): + """Creates a Data Loss Prevention API inspect template. + Args: + project: The Google Cloud project id to use as a parent resource. + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + template_id: The id of the template. If omitted, an id will be randomly + generated. + display_name: The optional display name of the template. + min_likelihood: A string representing the minimum likelihood threshold + that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED', + 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'. + max_findings: The maximum number of findings to report; 0 = no maximum. + include_quote: Boolean for whether to display a quote of the detected + information in the results. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). + info_types = [{"name": info_type} for info_type in info_types] + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. 
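+ # For example, a fully-populated configuration might look like: + # {"info_types": [{"name": "EMAIL_ADDRESS"}], "min_likelihood": "POSSIBLE", + # "include_quote": True, "limits": {"max_findings_per_request": 10}} + # Keys whose values are None are treated by the client as unset.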
+ inspect_config = { + "info_types": info_types, + "min_likelihood": min_likelihood, + "include_quote": include_quote, + "limits": {"max_findings_per_request": max_findings}, + } + + inspect_template = { + "inspect_config": inspect_config, + "display_name": display_name, + } + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.create_inspect_template( + request={ + "parent": parent, + "inspect_template": inspect_template, + "template_id": template_id, + } + ) + + print("Successfully created template {}".format(response.name)) + + +# [END dlp_create_inspect_template] + + +# [START dlp_list_templates] +def list_inspect_templates(project): + """Lists all Data Loss Prevention API inspect templates. + Args: + project: The Google Cloud project id to use as a parent resource. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.list_inspect_templates(request={"parent": parent}) + + for template in response: + print("Template {}:".format(template.name)) + if template.display_name: + print(" Display Name: {}".format(template.display_name)) + print(" Created: {}".format(template.create_time)) + print(" Updated: {}".format(template.update_time)) + + config = template.inspect_config + print( + " InfoTypes: {}".format(", ".join([it.name for it in config.info_types])) + ) + print(" Minimum likelihood: {}".format(config.min_likelihood)) + print(" Include quotes: {}".format(config.include_quote)) + print( + " Max findings per request: {}".format( + config.limits.max_findings_per_request + ) + ) + + +# [END dlp_list_templates] + + +# [START dlp_delete_inspect_template] +def delete_inspect_template(project, template_id): + """Deletes a Data Loss Prevention API template. + Args: + project: The id of the Google Cloud project which owns the template. + template_id: The id of the template to delete. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Combine the template id with the parent id. + template_resource = "{}/inspectTemplates/{}".format(parent, template_id) + + # Call the API. + dlp.delete_inspect_template(request={"name": template_resource}) + + print("Template {} successfully deleted.".format(template_resource)) + + +# [END dlp_delete_inspect_template] + + +if __name__ == "__main__": + default_project = os.environ.get("GOOGLE_CLOUD_PROJECT") + + parser = argparse.ArgumentParser(description=__doc__) + subparsers = parser.add_subparsers( + dest="action", help="Select which action to perform." + ) + subparsers.required = True + + parser_create = subparsers.add_parser("create", help="Create a template.") + parser_create.add_argument( + "--template_id", + help="The id of the template. If omitted, an id will be randomly " "generated", + ) + parser_create.add_argument( + "--display_name", help="The optional display name of the template." 
+ ) + parser_create.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + parser_create.add_argument( + "--info_types", + nargs="+", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + parser_create.add_argument( + "--min_likelihood", + choices=[ + "LIKELIHOOD_UNSPECIFIED", + "VERY_UNLIKELY", + "UNLIKELY", + "POSSIBLE", + "LIKELY", + "VERY_LIKELY", + ], + help="A string representing the minimum likelihood threshold that " + "constitutes a match.", + ) + parser_create.add_argument( + "--max_findings", + type=int, + help="The maximum number of findings to report; 0 = no maximum.", + ) + parser_create.add_argument( + "--include_quote", + type=bool, + help="A boolean for whether to display a quote of the detected " + "information in the results.", + default=True, + ) + + parser_list = subparsers.add_parser("list", help="List all templates.") + parser_list.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + + parser_delete = subparsers.add_parser("delete", help="Delete a template.") + parser_delete.add_argument("template_id", help="The id of the template to delete.") + parser_delete.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + + args = parser.parse_args() + + if args.action == "create": + create_inspect_template( + args.project, + args.info_types, + template_id=args.template_id, + display_name=args.display_name, + min_likelihood=args.min_likelihood, + max_findings=args.max_findings, + include_quote=args.include_quote, + ) + elif args.action == "list": + list_inspect_templates(args.project) + elif args.action == "delete": + delete_inspect_template(args.project, args.template_id) diff --git a/dlp/snippets/templates_test.py b/dlp/snippets/templates_test.py new file mode 100644 index 000000000000..4682f47cbd0b --- /dev/null +++ b/dlp/snippets/templates_test.py @@ -0,0 +1,60 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the 'License'); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an 'AS IS' BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import uuid + +import google.api_core.exceptions +import google.cloud.storage + +import templates + +UNIQUE_STRING = str(uuid.uuid4()).split("-")[0] +GCLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT") +TEST_TEMPLATE_ID = "test-template" + UNIQUE_STRING + + +def test_create_list_and_delete_template(capsys): + try: + templates.create_inspect_template( + GCLOUD_PROJECT, + ["FIRST_NAME", "EMAIL_ADDRESS", "PHONE_NUMBER"], + template_id=TEST_TEMPLATE_ID, + ) + except google.api_core.exceptions.InvalidArgument: + # Template already exists, perhaps due to a previous interrupted test. 
+ templates.delete_inspect_template(GCLOUD_PROJECT, TEST_TEMPLATE_ID) + + out, _ = capsys.readouterr() + assert TEST_TEMPLATE_ID in out + + # Try again and move on. + templates.create_inspect_template( + GCLOUD_PROJECT, + ["FIRST_NAME", "EMAIL_ADDRESS", "PHONE_NUMBER"], + template_id=TEST_TEMPLATE_ID, + ) + + out, _ = capsys.readouterr() + assert TEST_TEMPLATE_ID in out + + templates.list_inspect_templates(GCLOUD_PROJECT) + + out, _ = capsys.readouterr() + assert TEST_TEMPLATE_ID in out + + templates.delete_inspect_template(GCLOUD_PROJECT, TEST_TEMPLATE_ID) + + out, _ = capsys.readouterr() + assert TEST_TEMPLATE_ID in out diff --git a/dlp/snippets/triggers.py b/dlp/snippets/triggers.py new file mode 100644 index 000000000000..11acd6546f29 --- /dev/null +++ b/dlp/snippets/triggers.py @@ -0,0 +1,286 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Sample app that sets up Data Loss Prevention API automation triggers.""" + +from __future__ import print_function + +import argparse +import os + + +# [START dlp_create_trigger] +def create_trigger( + project, + bucket, + scan_period_days, + info_types, + trigger_id=None, + display_name=None, + description=None, + min_likelihood=None, + max_findings=None, + auto_populate_timespan=False, +): + """Creates a scheduled Data Loss Prevention API inspect_content trigger. + Args: + project: The Google Cloud project id to use as a parent resource. + bucket: The name of the GCS bucket to scan. This sample scans all + files in the bucket using a wildcard. + scan_period_days: How often to repeat the scan, in days. + The minimum is 1 day. + info_types: A list of strings representing info types to look for. + A full list of info type categories can be fetched from the API. + trigger_id: The id of the trigger. If omitted, an id will be randomly + generated. + display_name: The optional display name of the trigger. + description: The optional description of the trigger. + min_likelihood: A string representing the minimum likelihood threshold + that constitutes a match. One of: 'LIKELIHOOD_UNSPECIFIED', + 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'. + max_findings: The maximum number of findings to report; 0 = no maximum. + auto_populate_timespan: Automatically populates time span config start + and end times in order to scan new content only. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Prepare info_types by converting the list of strings into a list of + # dictionaries (protos are also accepted). + info_types = [{"name": info_type} for info_type in info_types] + + # Construct the configuration dictionary. Keys which are None may + # optionally be omitted entirely. 
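+ # For example, with max_findings=None the "limits" value below is + # treated as unset, which the API interprets as no cap on findings.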
+ inspect_config = { + "info_types": info_types, + "min_likelihood": min_likelihood, + "limits": {"max_findings_per_request": max_findings}, + } + + # Construct a cloud_storage_options dictionary with the bucket's URL. + url = "gs://{}/*".format(bucket) + storage_config = { + "cloud_storage_options": {"file_set": {"url": url}}, + # Time-based configuration for each storage object. + "timespan_config": { + # Auto-populate start and end times in order to scan new objects + # only. + "enable_auto_population_of_timespan_config": auto_populate_timespan + }, + } + + # Construct the job definition. + job = {"inspect_config": inspect_config, "storage_config": storage_config} + + # Construct the schedule definition: + schedule = { + "recurrence_period_duration": {"seconds": scan_period_days * 60 * 60 * 24} + } + + # Construct the trigger definition. + job_trigger = { + "inspect_job": job, + "display_name": display_name, + "description": description, + "triggers": [{"schedule": schedule}], + "status": google.cloud.dlp_v2.JobTrigger.Status.HEALTHY, + } + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.create_job_trigger( + request={"parent": parent, "job_trigger": job_trigger, "trigger_id": trigger_id} + ) + + print("Successfully created trigger {}".format(response.name)) + + +# [END dlp_create_trigger] + + +# [START dlp_list_triggers] +def list_triggers(project): + """Lists all Data Loss Prevention API triggers. + Args: + project: The Google Cloud project id to use as a parent resource. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Call the API. + response = dlp.list_job_triggers(request={"parent": parent}) + + for trigger in response: + print("Trigger {}:".format(trigger.name)) + print(" Created: {}".format(trigger.create_time)) + print(" Updated: {}".format(trigger.update_time)) + if trigger.display_name: + print(" Display Name: {}".format(trigger.display_name)) + if trigger.description: + print(" Description: {}".format(trigger.description)) + print(" Status: {}".format(trigger.status)) + print(" Error count: {}".format(len(trigger.errors))) + + +# [END dlp_list_triggers] + + +# [START dlp_delete_trigger] +def delete_trigger(project, trigger_id): + """Deletes a Data Loss Prevention API trigger. + Args: + project: The id of the Google Cloud project which owns the trigger. + trigger_id: The id of the trigger to delete. + Returns: + None; the response from the API is printed to the terminal. + """ + + # Import the client library + import google.cloud.dlp + + # Instantiate a client. + dlp = google.cloud.dlp_v2.DlpServiceClient() + + # Convert the project id into a full resource id. + parent = f"projects/{project}" + + # Combine the trigger id with the parent id. + trigger_resource = "{}/jobTriggers/{}".format(parent, trigger_id) + + # Call the API. + dlp.delete_job_trigger(request={"name": trigger_resource}) + + print("Trigger {} successfully deleted.".format(trigger_resource)) + + +# [END dlp_delete_trigger] + + +if __name__ == "__main__": + default_project = os.environ.get("GOOGLE_CLOUD_PROJECT") + + parser = argparse.ArgumentParser(description=__doc__) + subparsers = parser.add_subparsers( + dest="action", help="Select which action to perform."
+ ) + subparsers.required = True + + parser_create = subparsers.add_parser("create", help="Create a trigger.") + parser_create.add_argument( + "bucket", help="The name of the GCS bucket containing the file." + ) + parser_create.add_argument( + "scan_period_days", + type=int, + help="How often to repeat the scan, in days. The minimum is 1 day.", + ) + parser_create.add_argument( + "--trigger_id", + help="The id of the trigger. If omitted, an id will be randomly " "generated", + ) + parser_create.add_argument( + "--display_name", help="The optional display name of the trigger." + ) + parser_create.add_argument( + "--description", help="The optional description of the trigger." + ) + parser_create.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + parser_create.add_argument( + "--info_types", + nargs="+", + help="Strings representing info types to look for. A full list of " + "info categories and types is available from the API. Examples " + 'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". ' + "If unspecified, the three above examples will be used.", + default=["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"], + ) + parser_create.add_argument( + "--min_likelihood", + choices=[ + "LIKELIHOOD_UNSPECIFIED", + "VERY_UNLIKELY", + "UNLIKELY", + "POSSIBLE", + "LIKELY", + "VERY_LIKELY", + ], + help="A string representing the minimum likelihood threshold that " + "constitutes a match.", + ) + parser_create.add_argument( + "--max_findings", + type=int, + help="The maximum number of findings to report; 0 = no maximum.", + ) + parser_create.add_argument( + "--auto_populate_timespan", + type=bool, + help="Limit scan to new content only.", + ) + + parser_list = subparsers.add_parser("list", help="List all triggers.") + parser_list.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + + parser_delete = subparsers.add_parser("delete", help="Delete a trigger.") + parser_delete.add_argument("trigger_id", help="The id of the trigger to delete.") + parser_delete.add_argument( + "--project", + help="The Google Cloud project id to use as a parent resource.", + default=default_project, + ) + + args = parser.parse_args() + + if args.action == "create": + create_trigger( + args.project, + args.bucket, + args.scan_period_days, + args.info_types, + trigger_id=args.trigger_id, + display_name=args.display_name, + description=args.description, + min_likelihood=args.min_likelihood, + max_findings=args.max_findings, + auto_populate_timespan=args.auto_populate_timespan, + ) + elif args.action == "list": + list_triggers(args.project) + elif args.action == "delete": + delete_trigger(args.project, args.trigger_id) diff --git a/dlp/snippets/triggers_test.py b/dlp/snippets/triggers_test.py new file mode 100644 index 000000000000..8bd73db2f959 --- /dev/null +++ b/dlp/snippets/triggers_test.py @@ -0,0 +1,102 @@ +# Copyright 2023 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the 'License'); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an 'AS IS' BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +import uuid + +import google.api_core.exceptions +import google.cloud.storage +import pytest + +import triggers + +UNIQUE_STRING = str(uuid.uuid4()).split("-")[0] +GCLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT") +TEST_BUCKET_NAME = GCLOUD_PROJECT + "-dlp-python-client-test" + UNIQUE_STRING +RESOURCE_DIRECTORY = os.path.join(os.path.dirname(__file__), "resources") +RESOURCE_FILE_NAMES = ["test.txt", "test.png", "harmless.txt", "accounts.txt"] +TEST_TRIGGER_ID = "test-trigger" + UNIQUE_STRING + + +@pytest.fixture(scope="module") +def bucket(): + # Creates a GCS bucket, uploads files required for the test, and tears down + # the entire bucket afterwards. + + client = google.cloud.storage.Client() + try: + bucket = client.get_bucket(TEST_BUCKET_NAME) + except google.cloud.exceptions.NotFound: + bucket = client.create_bucket(TEST_BUCKET_NAME) + + # Upload the blobs and keep track of them in a list. + blobs = [] + for name in RESOURCE_FILE_NAMES: + path = os.path.join(RESOURCE_DIRECTORY, name) + blob = bucket.blob(name) + blob.upload_from_filename(path) + blobs.append(blob) + + # Yield the object to the test; lines after this execute as a teardown. + yield bucket + + # Delete the files. + for blob in blobs: + try: + blob.delete() + except google.cloud.exceptions.NotFound: + print("Issue during teardown, missing blob") + + # Attempt to delete the bucket; this will only work if it is empty. + bucket.delete() + + +def test_create_list_and_delete_trigger(bucket, capsys): + try: + triggers.create_trigger( + GCLOUD_PROJECT, + bucket.name, + 7, + ["FIRST_NAME", "EMAIL_ADDRESS", "PHONE_NUMBER"], + trigger_id=TEST_TRIGGER_ID, + ) + except google.api_core.exceptions.InvalidArgument: + # Trigger already exists, perhaps due to a previous interrupted test. + triggers.delete_trigger(GCLOUD_PROJECT, TEST_TRIGGER_ID) + + out, _ = capsys.readouterr() + assert TEST_TRIGGER_ID in out + + # Try again and move on. + triggers.create_trigger( + GCLOUD_PROJECT, + bucket.name, + 7, + ["FIRST_NAME", "EMAIL_ADDRESS", "PHONE_NUMBER"], + trigger_id=TEST_TRIGGER_ID, + auto_populate_timespan=True, + ) + + out, _ = capsys.readouterr() + assert TEST_TRIGGER_ID in out + + triggers.list_triggers(GCLOUD_PROJECT) + + out, _ = capsys.readouterr() + assert TEST_TRIGGER_ID in out + + triggers.delete_trigger(GCLOUD_PROJECT, TEST_TRIGGER_ID) + + out, _ = capsys.readouterr() + assert TEST_TRIGGER_ID in out
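A minimal usage sketch for the trigger snippets above, assuming the GOOGLE_CLOUD_PROJECT environment variable is set, the dlp/snippets directory is on PYTHONPATH, and placeholder names for the bucket and trigger id:

    import os

    import triggers  # the module added by this patch

    project = os.environ["GOOGLE_CLOUD_PROJECT"]

    # Schedule a weekly scan of a bucket, then list and clean up the trigger.
    triggers.create_trigger(
        project,
        "my-sample-bucket",  # placeholder GCS bucket name
        7,  # scan_period_days: repeat the scan weekly
        ["EMAIL_ADDRESS", "PHONE_NUMBER"],
        trigger_id="example-trigger",  # placeholder id
    )
    triggers.list_triggers(project)
    triggers.delete_trigger(project, "example-trigger")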