
feat: add caching to GapicCallable #527

Merged: 28 commits into main from optimize_gapic_callable on May 3, 2024

Conversation

@daniel-sanche (Contributor) commented Sep 6, 2023

_GapicCallable currently does a lot of work on each call, re-building the wrapped function through a series of helper-function calls. This cost is added to every single RPC, so it can really add up.

This PR makes the following optimizations:

  • removes _apply_decorators and _is_not_none_or_false, building the wrapped call more directly.
    • This seems to make it ~10% faster.
  • adds a new helper that builds the wrapped call from a timeout and retry object, and caches the result with @lru_cache (a rough sketch of the pattern follows this list).
    • This seems to make it 50% faster.
    • In practice, I think it's safe to assume most calls will re-use the same timeout and retry values.
    • The cache size is currently 4, but this can be changed.
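
A rough sketch of the caching pattern, for illustration only: the helper name _build_wrapped_call and its signature are invented here (the real change lives inside _GapicCallable), and retry/timeout are assumed to be the callable decorator objects from google.api_core.

import functools

@functools.lru_cache(maxsize=4)
def _build_wrapped_call(target, retry, timeout):
    # Compose the retry and timeout decorators around the target once.
    # lru_cache keys on the (target, retry, timeout) arguments, so repeated
    # calls that pass the same retry/timeout objects reuse the cached wrapper
    # instead of rebuilding it on every RPC.
    wrapped = target
    if retry is not None:
        wrapped = retry(wrapped)    # Retry objects act as callable decorators
    if timeout is not None:
        wrapped = timeout(wrapped)  # as do google.api_core timeout objects
    return wrapped

Because most RPCs reuse the same retry and timeout values, a tiny cache (maxsize=4) already covers the common case.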

Benchmark:

from google.api_core.gapic_v1.method import _GapicCallable
from google.api_core.retry import Retry

gapic_callable = _GapicCallable(lambda *a, **k: 1, retry=Retry(), timeout=1010, compression=False)

from timeit import timeit
timeit(lambda: gapic_callable())

Before: 20.43s
After: 9.48s
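
(timeit defaults to number=1000000, so these figures are the total time for one million calls: roughly 20 µs per call before the change and about 9.5 µs after.)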

product-auto-label bot added the size: s (Pull request size is small.) label on Sep 6, 2023
product-auto-label bot added the size: m (Pull request size is medium.) label and removed size: s on Feb 10, 2024
daniel-sanche marked this pull request as ready for review on February 10, 2024 01:35
daniel-sanche requested review from a team as code owners on February 10, 2024 01:35
daniel-sanche changed the title from "[DRAFT] feat: optimize GapicCallable" to "feat: add caching to GapicCallable" on Feb 10, 2024
vchudnov-g self-assigned this on Feb 12, 2024
@parthea (Collaborator) left a comment

> This seems to make it 50% faster

Thanks for fixing this!

Please could you add a simple benchmarking presubmit, similar to the test that you ran manually, to avoid future regressions in performance?

@daniel-sanche (Contributor, Author)

Sure, I just added a unit test. Let me know if that works

@parthea (Collaborator) left a comment


LGTM, but please wait for @vchudnov-g to review

@parthea (Collaborator) commented Feb 13, 2024

We may need a larger buffer to ensure that the test is not flaky but still captures regressions. Perhaps double the value?

https://github.com/googleapis/python-api-core/actions/runs/7893470370/job/21542170780?pr=527

>       assert avg_time < 0.15  # expect ~0.1, but allow for some variance
E       assert 0.17590151499999251 < 0.15

@parthea (Collaborator) commented Feb 14, 2024

Assigning back to @daniel-sanche to resolve the presubmit failure

@daniel-sanche (Contributor, Author)

Hmm, good point: the benchmark result will be machine-specific, and I was doing my tests locally rather than on the CI workers.

I guess I'll have to find an assertion value that works well for the CI nodes, and I'll add a comment explaining that it may flake on slower hardware. Or let me know if you have other suggestions for how to approach this.

@parthea (Collaborator) commented Feb 14, 2024

Can you set it high enough that we don't get flaky results, but low enough that we can still detect performance regressions?

Perhaps set the threshold to 0.4 for now and create an issue in https://github.com/googleapis/python-api-core/issues to add a proper benchmarking test? I believe @ohmayr started looking into a benchmarking presubmit, so please tag him on the issue.

@daniel-sanche (Contributor, Author)

Sure, I opened an issue to track this here: #616

I adjusted the value to 0.4. Feel free to merge it with that number, but I suspect we can find a lower value that still avoids flakiness. Let me know if you want me to do some investigation.

parthea assigned vchudnov-g and unassigned daniel-sanche on Feb 27, 2024
@parthea (Collaborator) commented Feb 27, 2024

@vchudnov-g Please could you review?

@vchudnov-g (Contributor) left a comment


Minor code comment, and an idea about tightening benchmarks.

(Quoting the comment added to the new benchmark test:)

Note: The threshold has been tuned for the CI workers. Test may flake on slower hardware
https://github.com/googleapis/python-api-core/pull/527
Contributor

Do you mean to self-reference this PR?

@daniel-sanche (Contributor, Author)

It was intentional, to give the context on this test. But on second thought, git blame should be enough. Removed.

gapic_callable = _GapicCallable(
    lambda *a, **k: 1, retry=Retry(), timeout=1010, compression=False
)
avg_time = timeit(lambda: gapic_callable(), number=10_000)
assert avg_time < 0.4
Contributor

Idea: If the assertion fails, print both the actual time it took and enough platform information so that in the future we can add the right threshold for the platform. The latter would be something like this:

platform_threshold = { "foo": 0.2, "bar": 0.6 }
current_platform = ...
...
assert avg_time < platform_threshold.get(current_platform, 0.4)

In fact, you could implement platform_threshold now, and start with whatever your current machine is.
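
A minimal sketch of how that could look, purely illustrative: the platform key below (OS, architecture, CPU count) is an assumption, and choosing what to capture is exactly what the follow-up question raises.

import os
import platform

# Hypothetical platform key; the discussion never settles on these fields.
current_platform = f"{platform.system()}-{platform.machine()}-{os.cpu_count()}cpu"

# Per-platform thresholds, seeded with whatever machine the test was tuned on.
platform_threshold = {
    "Linux-x86_64-2cpu": 0.4,  # illustrative CI-worker entry
}

# Placeholder measurement; in the real test this would be
# timeit(lambda: gapic_callable(), number=10_000).
avg_time = 0.12

# Fall back to a generous default on unknown machines, and print enough
# detail on failure to pick a threshold for a new platform later.
threshold = platform_threshold.get(current_platform, 0.4)
assert avg_time < threshold, f"{avg_time=} on {current_platform}"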

@daniel-sanche (Contributor, Author)

That's an interesting idea, but it's not completely clear to me what we'd need to capture for the platform. Number of CPUs? Architecture? OS? Let me know if you have thoughts.

We already have #616 to track improving this, though, so if it's alright with you, I'll merge this as-is and we can discuss follow-ups there.

daniel-sanche merged commit d96eb5c into main on May 3, 2024
27 checks passed
daniel-sanche deleted the optimize_gapic_callable branch on May 3, 2024
Labels: size: m (Pull request size is medium.)

3 participants