
Batches in >=2.0.0 are significantly slower #145

Closed · BinarSkugga opened this issue Jan 6, 2021 · 9 comments · Fixed by #155

Labels: api: datastore · priority: p2 (Moderately-important priority) · type: bug

Comments

@BinarSkugga commented Jan 6, 2021

I believe there is either an API change or a performance regression when using batches with 2.1.0.

Environment details

  • OS type and version: Windows 10
  • Python version: 3.8.7
  • pip version: 20.3.3
  • google-cloud-datastore version: 2.1.0
  • google-api-core version: 1.24.1

Steps to reproduce

  1. Use DatastoreClient::put_multi with 500 entities inside a batch

Code example

with client.batch():
    client.put_multi(chunk)
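
For context, a fuller, self-contained reproduction along those lines might look like the following (a minimal sketch; the kind name, the property, and the 500-entity chunk size are illustrative, the latter matching Datastore's per-commit limit):

from google.cloud import datastore

client = datastore.Client()

# Build one chunk of 500 trivial entities (Datastore's per-commit write limit).
chunk = []
for i in range(500):
    entity = datastore.Entity(key=client.key("BenchKind", i + 1))
    entity["value"] = i
    chunk.append(entity)

# Committing the chunk inside a batch is where the slowdown shows up in >=2.0.0.
with client.batch():
    client.put_multi(chunk)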

Performance

When using 1.15.0, my code can handle 9 batches of 500 entities in about 1-2 seconds, while 2.1.0 takes literally minutes.
Disclaimer: I use the Datastore emulator.

EDIT: It can also be reproduced in 2.0.0.

@product-auto-label product-auto-label bot added the api: datastore Issues related to the googleapis/python-datastore API. label Jan 6, 2021
@BinarSkugga BinarSkugga changed the title Batches in 2.10 are significantly slower Batches in >2.0.0 are significantly slower Jan 6, 2021
@BinarSkugga BinarSkugga changed the title Batches in >2.0.0 are significantly slower Batches in >=2.0.0 are significantly slower Jan 6, 2021
@crwilcox (Contributor)

Are you seeing this in both emulator and live endpoints?

@crwilcox crwilcox added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Jan 11, 2021
@BinarSkugga (Author)

I haven't tested on a live endpoint. I'm on vacation for the week; I'll definitely test this when I come back.

@crwilcox (Contributor)

Thanks @BinarSkugga. I know we rather substantially overhauled the transport layer between v1 and v2. It would be very helpful to understand whether you are seeing a slowdown on a live endpoint.


@BinarSkugga (Author) commented Jan 18, 2021

Additional, potentially impactful information:

  • The emulator doesn't log any errors.
  • I run those batches in parallel using Python's ThreadPoolExecutor from the concurrent.futures package (see the sketch after this list).
  • Deleting entities also seems slightly slower, but it's negligible compared to writes and reads.
  • It might be worth noting that I had to remove the 'indices' key from my Entity object, as the emulator doesn't seem to understand it: it tries to parse it as a column and fails because it contains a set().
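
For reference, a minimal sketch of the parallel write pattern described in the second bullet, using only the standard library's ThreadPoolExecutor (the kind name, chunk size, worker count, and entity count are assumptions for illustration):

from concurrent.futures import ThreadPoolExecutor

from google.cloud import datastore

client = datastore.Client()
CHUNK_SIZE = 500  # Datastore's per-commit entity limit

def write_chunk(chunk):
    # Each worker thread commits its own batch of up to 500 entities.
    with client.batch():
        client.put_multi(chunk)

entities = [datastore.Entity(key=client.key("BenchKind", i + 1)) for i in range(4500)]
chunks = [entities[i:i + CHUNK_SIZE] for i in range(0, len(entities), CHUNK_SIZE)]
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(write_chunk, chunks))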

@BinarSkugga (Author) commented Jan 19, 2021

I may have spoken too soon. It also happens on live endpoints.
Logs from 1.15.x:

[SOLLUM][DATA][INFO][10:42:35]: Fetching 4130 'LayoutEntity' entities...
[SOLLUM][DATA][VERBOSE][10:42:35]: Using 9 'LayoutEntity' batches...
[SOLLUM][DATA][VERBOSE][10:42:36]: Fetched 'LayoutEntity' batch #1/9
[SOLLUM][DATA][VERBOSE][10:42:36]: Fetched 'LayoutEntity' batch #2/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #4/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #3/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #5/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #6/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #7/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #8/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #9/9

Logs from 2.1.x:

[SOLLUM][DATA][INFO][14:35:42]: Fetching 4130 'LayoutEntity' entities...
[SOLLUM][DATA][VERBOSE][14:35:42]: Using 9 'LayoutEntity' batches...
[SOLLUM][DATA][VERBOSE][14:36:48]: Fetched 'LayoutEntity' batch #1/9
[SOLLUM][DATA][VERBOSE][14:39:31]: Fetched 'LayoutEntity' batch #2/9
[SOLLUM][DATA][VERBOSE][14:39:32]: Fetched 'LayoutEntity' batch #3/9

Here's the code that handles this:

def _do_fetch_chunk(chunk, client, kind, count, total):
    with client.batch():
        ents = client.get_multi(chunk)
        Loggers.DATA.verbose('Fetched \'{}\' batch #{}/{}'.format(kind, count.next(), total))
        return ents

count = AtomicNumber()
chunked = list(chunks(keys, config.MAX_BATCH_ENTITY))
Loggers.DATA.verbose('Using {} \'{}\' batches...'.format(len(chunked), kind))
entities = ConcurrentService.i().thread_each(chunked, _do_fetch_chunk, self.client, kind, count, len(chunked))
return [item for sublist in entities for item in sublist]

The concurrent service simply submits a new task per batch to the thread pool.
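
Note that chunks, AtomicNumber, and ConcurrentService are project-specific helpers not shown in the issue. Assuming thread_each simply maps the function over the chunks on a thread pool, a rough standard-library stand-in would be:

from concurrent.futures import ThreadPoolExecutor

def thread_each(items, fn, *args):
    # Hypothetical equivalent: one task per chunk, results collected in input order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda item: fn(item, *args), items))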

@BinarSkugga (Author)

Any insight on this?

@crwilcox (Contributor) commented Mar 22, 2021

I have received another customer report. While not quite the difference stated here, roughly 2-3x slower is what the other instance was observing. It would be good to benchmark a few operations and see whether there are any obvious gains to be made.

Reference PR from PubSub that was able to get back some time: googleapis/python-pubsub@c29d7f8
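
For anyone wanting to try this locally, a simple timing harness (not the benchmark from the linked PR; a sketch assuming a reachable Datastore project or emulator, with an illustrative kind name) could look like:

import time

from google.cloud import datastore

client = datastore.Client()

def timed(label, fn):
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {(time.perf_counter() - start) * 1000:.1f} ms")
    return result

# One chunk of 500 entities, the per-commit limit.
keys = [client.key("BenchKind", i + 1) for i in range(500)]
entities = []
for key in keys:
    entity = datastore.Entity(key=key)
    entity["value"] = key.id
    entities.append(entity)

timed("put_multi x500", lambda: client.put_multi(entities))
timed("get_multi x500", lambda: client.get_multi(keys))
timed("delete_multi x500", lambda: client.delete_multi(keys))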

@craiglabenz (Contributor)

I've been attempting to reproduce this, but so far I'm seeing more or less similar outcomes for batch operations between versions 2.1.0 and 1.15.3.

Querying may very well be slower, however.

First test run:

$ python datastore-test.py                         
[python-datastore@1.15.3] finished creation loop in 5.0 ms
[python-datastore@1.15.3] finished batch write in 1926.0 ms
[python-datastore@1.15.3] finished batch get in 391.0 ms
[python-datastore@1.15.3] finished batch update in 1706.0 ms
[python-datastore@1.15.3] finished batch query in 2216.0 ms
  > loaded 3500 objects

Switch to version 2.1.0 and run again:

$ python datastore-test.py
[python-datastore@2.1.0] finished creation loop in 4.0 ms
[python-datastore@2.1.0] finished batch write in 2109.0 ms
[python-datastore@2.1.0] finished batch get in 588.0 ms
[python-datastore@2.1.0] finished batch update in 1701.0 ms
[python-datastore@2.1.0] finished batch query in 3702.0 ms
  > loaded 4000 objects

I will continue to investigate, but for now, here's the testing script I'm using.

tseaver pushed a commit that referenced this issue Apr 20, 2021
More efficiently uses proto-plus wrappers, as well as inner protobuf attribute access, to greatly reduce the performance costs seen in version 2.0.0 (which stemmed from the introduction of proto-plus).

The size of the performance improvement scales with the number of attributes on each Entity, but in general, speeds once again closely approximate those from 1.15.

Fixes #145
Fixes #150
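
For readers unfamiliar with the proto-plus overhead mentioned in the commit message: proto-plus wrappers marshal values on every attribute access, while the wrapped protobuf message can be reached directly through the pb() class helper (backed by the private ._pb attribute). A generic illustration of the difference (not the exact change made in #155):

import proto

class Thing(proto.Message):
    name = proto.Field(proto.STRING, number=1)

msg = Thing(name="example")

# Attribute access through the proto-plus wrapper goes through marshalling each time.
wrapped_value = msg.name

# Accessing the underlying protobuf message skips that per-access cost.
raw = Thing.pb(msg)
raw_value = raw.name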