Skip to content
This repository has been archived by the owner on Dec 10, 2023. It is now read-only.

Example of of Crypto Deterministic Config using https://cloud.google.… #119

Merged
merged 1 commit into from Apr 1, 2021
Merged

Conversation

hilliao
Copy link
Contributor

@hilliao hilliao commented Mar 25, 2021

…com/dlp/docs/pseudonymization#supported-methods to resolve #108

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<108> 🦕

@snippet-bot
Copy link

snippet-bot bot commented Mar 25, 2021

Here is the summary of changes.

You are about to add 2 region tags.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@product-auto-label product-auto-label bot added the api: dlp Issues related to the googleapis/python-dlp API. label Mar 25, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Mar 25, 2021
@hilliao hilliao marked this pull request as ready for review March 25, 2021 18:04
@hilliao hilliao requested a review from a team as a code owner March 25, 2021 18:04
@hilliao hilliao requested review from tmatsuo and removed request for a team March 25, 2021 18:04
Copy link
Contributor

@tmatsuo tmatsuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with few small suggestions

samples/snippets/deid.py Outdated Show resolved Hide resolved
samples/snippets/deid.py Outdated Show resolved Hide resolved
samples/snippets/deid.py Outdated Show resolved Hide resolved
samples/snippets/deid.py Outdated Show resolved Hide resolved
@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 1, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 1, 2021
@tmatsuo
Copy link
Contributor

tmatsuo commented Apr 1, 2021

In case you don't have access to the build log, here is the lint output:

./deid.py:471:1: E303 too many blank lines (3)
# [START dlp_reidentify_deterministic]
^
./deid.py:472:1: E302 expected 2 blank lines, found 3
def reidentify_with_deterministic(
^

@hilliao
Copy link
Contributor Author

hilliao commented Apr 1, 2021

I fixed the 5 places and fixed parameters in the unit tests. I had to push an amended commit as new commits failed https://www.conventionalcommits.org/en/v1.0.0/ checks. Should new commit's comment be identical to prior commit's comment? What's the rule @tmatsuo ?

@tmatsuo
Copy link
Contributor

tmatsuo commented Apr 1, 2021

@hilliao
The conventional commit check seems good now.

@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 1, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 1, 2021
@tmatsuo
Copy link
Contributor

tmatsuo commented Apr 1, 2021

@hilliao
Copy link
Contributor Author

hilliao commented Apr 1, 2021

I can see the build error but I don't have access to test_deidentify_with_deterministic 's output to see the output of the tokenized harmful string. I believe the error from the build is caused by using my own KMS key instead of projects/python-docs-samples-tests/locations/global/keyRings/dlp-test/cryptoKeys/dlp-test which I don't have access to.
build error:
debug_error_string = "{"created":"@1617263155.197209081","description":"Error received from peer ipv4:74.125.197.95:443","file":"src/core/lib/surface/call.cc","file_line":1067,"grpc_message":"Could not transform all content due to transformation errors. To force this request to succeed anyway (by skipping any transformations that cause errors), settransformation_error_handling.modetoleave_untransformedin the DeidentifyConfig; this will return the partially transformed content along with an error summary.\n
Lack of access:

echo -ne $RAW_KEY | gcloud kms encrypt --key $KEY_NAME --keyring $KEY_RING_NAME --location global --plaintext-file=- --ciphertext-file wrapped.key --project python-docs-samples-tests && cat wrapped.key | base64
ERROR: (gcloud.kms.encrypt) PERMISSION_DENIED: Permission 'cloudkms.cryptoKeyVersions.useToEncrypt' denied on resource 'projects/python-docs-samples-tests/locations/global/keyRings/dlp-test/cryptoKeys/dlp-test' (or it may not exist).

I believe the tokenized string is in the output of test_deidentify_with_deterministic but I can't find its output from the build error. Can you tell me the tokenized string or the whole build output so I can update the code line?

labeled_fpe_string = "My SSN is SSN_TOKEN(36):AUGFopzf7RiKAHWD2Sm5GJWS0Yfezsbc1OU="

@tmatsuo
Copy link
Contributor

tmatsuo commented Apr 1, 2021

@hilliao I got it. I'll try to add a commit.

@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 1, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 1, 2021
@tmatsuo
Copy link
Contributor

tmatsuo commented Apr 1, 2021

Re: conventional commits
I also got an error. My guess is our bot can not parse your commit message well. I squashed the commit.

@tmatsuo tmatsuo added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 1, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 1, 2021
@tmatsuo
Copy link
Contributor

tmatsuo commented Apr 1, 2021

Now the tests are passing. It might be better to get the string dynamically with a pytest fixture.

@tmatsuo
Copy link
Contributor

tmatsuo commented Apr 1, 2021

@hilliao WDYT?

@hilliao
Copy link
Contributor Author

hilliao commented Apr 1, 2021

I would like to merge this pull request first; then I will create a new issue to change the re-identification test method to de-identification then re-identification in the test method. Can @tmatsuo merge?

@tmatsuo
Copy link
Contributor

tmatsuo commented Apr 1, 2021

@hilliao Sounds good

@tmatsuo
Copy link
Contributor

tmatsuo commented Apr 1, 2021

@hilliao Thanks for your contribution!

@tmatsuo tmatsuo merged commit 396804d into googleapis:master Apr 1, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api: dlp Issues related to the googleapis/python-dlp API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

missing crypto_deterministic_config example to re-identify
3 participants