Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testing: fix tests against emulator and remove EmulatorCredentials hack #33

Closed

Conversation

tsukasa-au
Copy link

This PR contains many fixes for a myrid of issues with the unit tests for this project.

It started with me wanting to clean up the EmulatorCreds() hack (which appears to be partially removed now...), replacing it with the AnonymousCredentials (which appears to the correct solution).

I then needed to perform a lot of work to make the unit tests pass against the emulator (since it was not clear how expensive this test suite would be to run against a real datastore instance). As you can see there are/were many fixes required.

After all of these changes, I have also run these tests against datastore directly to ensure that I have not broken any tests.

NOTE: I would prefer that these changes are not squashed when they are merged.

Fixes #31 and #32

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label May 18, 2020
The google.cloud.auth package provides a Credentials class for use with
emulators (and other systems that do not require oauth credentials). Use
this rather than rolling out own.

It seems a couple of people around the internet have copied this method
for running code against the emulators. We should ensure that if people
are going to copy these tests, that at least they do something less
hacky.
We remove the requirement for the GOOGLE_APPLICATION_CREDENTIALS if we
detect that we should be running against the local datastore emulator.
This unit test was overwriting the datastore client being used to be a
default constructed client. This was causing it to be constructed with
the default credentials object, which requires that the test is either
run in GCP or with the `GOOGLE_APPLICATION_CREDENTIALS` environment
variable set.
Move away from assertTrue to a helper that outputs the actual value and
the expected value when the test fails.
The last test run in TestDatastoreQuery would fail as it was trying to
remove more than 500 entities at once. We break the deletes up into
batches to work around this limitation of the datastore api.
These tests assume that specific pieces of data exist within datastore,
but nothing in the test created these pieces of data.

Also, the TestDatastoreQuery.test_query_offset_timestamp_keys test
assumes there is >= 10k entities in the datastore, but the create
function would only add 1k entities.
The datastore emulator does not check the size of the response messages
before trying to send them back to the client. For all of the tests that
return large objects, this results in the response exceeding the ~4MiB
grpc message limit (enforces in message_size_filter). The response size
for these tests is ~10MiB.

Since we do not need the actual contents of the entity, we can request
just the keys, via a keys_only query.

Technically this is changing they type of query, which may be considered
testing a different thing. I will leave this up to the maintainers if
they want to merge this change.
There is both a 500 entities per mutation limit, and a ~4MiB
request/response size limit. Update both the populate and clear
libraries to stay below these limits.
@tsukasa-au tsukasa-au changed the title Fix tests against emulator, test python3.8, and remove EmulatorCredentials hack fix: Fix tests against emulator, test python3.8, and remove EmulatorCredentials hack May 18, 2020
@IlyaFaer IlyaFaer added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 20, 2020
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 20, 2020
noxfile.py Outdated
@@ -30,7 +30,7 @@
BLACK_PATHS.append("samples")


@nox.session(python="3.7")
@nox.session(python=["3.7", "3.8"])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pushing these changes!

The things is that noxfiles are auto generated, you can see the disclaimer on the line 17.
That means we should not do any changes manually, as they will be reverted on the next code generation.
On the other hand, adding such a changes into templates from which noxfiles will be generated, means that all Google Cloud clients will be affected. I suppose, let's wait for responsible googler to approve these issues first, as it sounds like a weighty decision.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not see this comment (sorry, I skimmed the file and the made a minimal change so that I could actually run the tests).

I am happy to wait for an answer from someone else. I can also remove this specific commit from the PR if you prefer (though once I do, I will be unable to rerun the tests if I am requested to make changes to other parts of this PR).

@tseaver tseaver changed the title fix: Fix tests against emulator, test python3.8, and remove EmulatorCredentials hack testing: fix tests against emulator, test python3.8, and remove EmulatorCredentials hack Aug 5, 2020
@tseaver tseaver changed the title testing: fix tests against emulator, test python3.8, and remove EmulatorCredentials hack testing: fix tests against emulator and remove EmulatorCredentials hack Aug 5, 2020
@tseaver tseaver added the kokoro:run Add this label to force Kokoro to re-run the tests. label Aug 5, 2020
@yoshi-kokoro yoshi-kokoro removed the kokoro:run Add this label to force Kokoro to re-run the tests. label Aug 5, 2020
@tseaver
Copy link
Contributor

tseaver commented Aug 5, 2020

@tsukasa-au I backed out the Python-version-related changes in noxfile.py (since the version on master now uses 3.8 by default).

@tseaver
Copy link
Contributor

tseaver commented Aug 5, 2020

@tsukasa-au The unit test failures are due to #53: its fix is in PR #54, which should be merged early tomorrowis now merged. To correct the lint failure, please run nox -s blacken in your local clone, and then commit the updated files.

@tseaver tseaver added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 5, 2020
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 5, 2020
@tseaver tseaver added kokoro:force-run Add this label to force Kokoro to re-run the tests. and removed cla: yes This human has signed the Contributor License Agreement. labels Aug 5, 2020
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Aug 5, 2020
@google-cla
Copy link

google-cla bot commented Aug 5, 2020

☹️ Sorry, but only Googlers may change the label cla: yes.

@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 5, 2020
Copy link
Contributor

@tseaver tseaver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applying black

@@ -101,8 +101,10 @@ def system(session):
system_test_path = os.path.join("tests", "system.py")
system_test_folder_path = os.path.join("tests", "system")
# Sanity check: Only run tests if the environment variable is set.
if not os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", ""):
session.skip("Credentials must be set via environment variable")
is_emulator = 'DATASTORE_DATASET' in os.environ
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
is_emulator = 'DATASTORE_DATASET' in os.environ
is_emulator = "DATASTORE_DATASET" in os.environ

is_emulator = 'DATASTORE_DATASET' in os.environ
has_credentials = bool(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", ""))
if not is_emulator and not has_credentials:
session.skip("Credentials must be set via environment variable for non-emulator tests")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
session.skip("Credentials must be set via environment variable for non-emulator tests")
session.skip(
"Credentials must be set via environment variable for non-emulator tests"
)

@@ -81,10 +81,24 @@ def remove_kind(kind, client):


def remove_all_entities(client):
BATCH_SIZE = 500 # Datastore API only allows 500 mutations in a single call.
KEY_BYTES_LIMIT = 3 << 20 # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
KEY_BYTES_LIMIT = 3 << 20 # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
KEY_BYTES_LIMIT = (
3 << 20
) # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)

return _estimate_entity_size(value)
result = 0
for key, value in entity.items():
result += len(key) # The number of runes is fine, no point forcing a utf-8 encoding here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
result += len(key) # The number of runes is fine, no point forcing a utf-8 encoding here.
result += len(
key
) # The number of runes is fine, no point forcing a utf-8 encoding here.

def add_large_character_entities(client=None):
TOTAL_OBJECTS = 2500
NAMESPACE = "LargeCharacterEntity"
KIND = "LargeCharacter"
MAX_STRING = (string.ascii_lowercase * 58)[:1500]

BATCH_SIZE = 500 # Datastore API only allows 500 mutations in a single call.
RPC_BYTES_LIMIT = 3 << 20 # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RPC_BYTES_LIMIT = 3 << 20 # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
RPC_BYTES_LIMIT = (
3 << 20
) # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)

while entities:
approx_rpc_bytes = 0
batch = []
while entities and len(batch) < BATCH_SIZE and approx_rpc_bytes < RPC_BYTES_LIMIT:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while entities and len(batch) < BATCH_SIZE and approx_rpc_bytes < RPC_BYTES_LIMIT:
while (
entities
and len(batch) < BATCH_SIZE
and approx_rpc_bytes < RPC_BYTES_LIMIT
):

while len(batch) < BATCH_SIZE and key_bytes < RPC_BYTES_LIMIT and all_keys:
batch.append(all_keys.pop())
if batch[-1].name is None:
key_bytes += 9 # It takes 9 bytes for the largest varint encoded number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
key_bytes += 9 # It takes 9 bytes for the largest varint encoded number
key_bytes += (
9 # It takes 9 bytes for the largest varint encoded number
)

batch_size = 500
assert num_batches * batch_size > 10000, 'test_query_offset_timestamp_keys requires at least 10k entries, otherwise it fails'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert num_batches * batch_size > 10000, 'test_query_offset_timestamp_keys requires at least 10k entries, otherwise it fails'
assert (
num_batches * batch_size > 10000
), "test_query_offset_timestamp_keys requires at least 10k entries, otherwise it fails"

@product-auto-label product-auto-label bot added the api: datastore Issues related to the googleapis/python-datastore API. label Aug 21, 2020
@tseaver
Copy link
Contributor

tseaver commented Nov 25, 2020

The core of this PR is superseded by #71 (see #70, which uses AnonymousCredentials anytime the emulator envvar is set, not just when running system tests).

@tseaver tseaver closed this Nov 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: datastore Issues related to the googleapis/python-datastore API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unit tests are not run against python3.8
5 participants