fix: Race conditions and performance issues #237

Merged: 2 commits, Sep 14, 2021

dpcollins-google
Collaborator

There are two main retrying_connection race conditions fixed here:

  1. Improper handling of cancelled write tasks can cause set_exception to be called on a task that is already cancelled. This raises an InvalidStateError, which the existing code never catches.
  2. If reinitialize() is called after the queues are cycled, a poller from the old instance of the class can add a message to the new queues. This is fixed by splitting the ConnectionReinitializer interface into "stop_processing" and "reinitialize" parts.
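
To illustrate the first race, here is a minimal sketch using plain asyncio futures. It is not the code from this PR: the helper name `fail_if_pending` and the "stream closed" error are hypothetical, and it assumes the write tasks are backed by `asyncio.Future` objects. It only shows that calling `set_exception` on an already cancelled future raises `InvalidStateError`, and that checking `done()` first avoids it.

```python
import asyncio


def fail_if_pending(future: asyncio.Future, error: BaseException) -> bool:
    """Hypothetical helper: set `error` unless the future is already done.

    A cancelled future counts as done, so it is skipped instead of raising.
    Returns True if the exception was actually set.
    """
    if future.done():
        return False
    future.set_exception(error)
    return True


async def demo() -> tuple:
    loop = asyncio.get_running_loop()

    # A write future that the caller has already cancelled.
    cancelled = loop.create_future()
    cancelled.cancel()

    # Unguarded call: asyncio rejects state changes on a finished future.
    try:
        cancelled.set_exception(RuntimeError("stream closed"))
        unguarded_raised = False
    except asyncio.InvalidStateError:
        unguarded_raised = True

    # Guarded call: the cancelled future is left alone.
    guarded_set = fail_if_pending(cancelled, RuntimeError("stream closed"))

    # A still-pending future receives the error as usual.
    pending = loop.create_future()
    pending_set = fail_if_pending(pending, RuntimeError("stream closed"))
    pending_error = type(pending.exception()).__name__

    return unguarded_raised, guarded_set, pending_set, pending_error


result = asyncio.run(demo())
```

Because asyncio callbacks run on a single thread, the `done()` check and the `set_exception` call cannot be interleaved with a cancellation as long as there is no `await` between them.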

This also fixes other performance issues identified in profiles.

Something close to this code has been running for an hour or two on Compute Engine with no hangs.
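
The second race described above can be sketched as a two-phase reconnect. This is an illustration under assumptions, not the library's implementation: the `Processor` class, its `queue` attribute, the generation counter, and the `reconnect` function are all hypothetical; only the names `stop_processing` and `reinitialize` come from the description. The idea is that the old poller is cancelled and awaited before the queues are cycled, so it cannot write into the new queue.

```python
import asyncio
from typing import List, Optional


class Processor:
    """Hypothetical stand-in for the object that owns the queue and poller."""

    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue()
        self._poller: Optional[asyncio.Task] = None
        self._generation = 0

    async def _poll(self, generation: int) -> None:
        # Reads self.queue on every iteration, so a stale poller would
        # write into whichever queue is current at that moment.
        while True:
            await asyncio.sleep(0.001)
            self.queue.put_nowait(generation)

    async def stop_processing(self) -> None:
        """Phase 1: cancel the old poller and wait until it has exited."""
        if self._poller is not None:
            self._poller.cancel()
            try:
                await self._poller
            except asyncio.CancelledError:
                pass
            self._poller = None

    async def reinitialize(self) -> None:
        """Phase 2: start a fresh poller for the already cycled queue."""
        self._generation += 1
        self._poller = asyncio.create_task(self._poll(self._generation))


async def reconnect(processor: Processor) -> None:
    # Order matters: no old poller may be alive when the queue is replaced.
    await processor.stop_processing()
    processor.queue = asyncio.Queue()  # cycle the queue
    await processor.reinitialize()


async def demo() -> List[int]:
    processor = Processor()
    await processor.reinitialize()   # generation 1 poller
    await asyncio.sleep(0.01)
    await reconnect(processor)       # generation 2 poller, new queue
    await asyncio.sleep(0.01)
    await processor.stop_processing()

    seen = []
    while not processor.queue.empty():
        seen.append(processor.queue.get_nowait())
    return seen


seen = asyncio.run(demo())
```

In this sketch, every item in the new queue comes from the generation 2 poller. If the queue were replaced while the generation 1 poller was still running, its messages could land in the new queue, which is the race the description refers to.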

product-auto-label bot added the "api: pubsublite" label (Issues related to the googleapis/python-pubsublite API) Sep 14, 2021
google-cla bot added the "cla: yes" label (This human has signed the Contributor License Agreement) Sep 14, 2021
dpcollins-google merged commit ec76272 into main Sep 14, 2021
dpcollins-google deleted the fix-races-on-main branch September 14, 2021 19:13
gcf-merge-on-green bot pushed a commit that referenced this pull request Sep 14, 2021