Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PYTHON-4264 Async PyMongo Beta #1629

Open
wants to merge 92 commits into
base: master
Choose a base branch
from
Open

Conversation

NoahStapp
Copy link
Contributor

@NoahStapp NoahStapp commented May 2, 2024

Remaining work to be done before beta release (in no particular order):

  • Add docstring translation from async to sync
  • Fix no module named pymongo.asynchronous/synchronous errors in release scripts
  • Fix slotscheck ERROR: Module 'grid_file' not found error
  • Automate running import sorting on synchronous files after generation
  • Fix doc links (either change all import paths or export sync/async modules at the top level of the package)
  • Add async test suite (likely a mix of manual and automated at first until we can write a whole async test runner)
  • Automate synchro.py running as part of pre-commit hook

Since there are over 30k lines changed, reviewing the changes line-by-line is not feasible. I've added several comments to files in this PR to explain choices or the important parts to look over closely, but we'll have to rely on our test suite to do the bulk of ensuring correctness.

Additionally, as shown by the list above, there is still remaining work to be done before the beta is fully ready for release. I've elected to mark this PR as ready for review now to give people plenty of time now that the tests are fully passing. There aren't any planned fundamental changes remaining, only fixing the last few parts of our infrastructure and related tooling.

I plan to start writing up the design document this week, which should help answer any additional questions raised as part of the review process.

For reviewers

The most important code to look at closely is pymongo/network.py, pymongo/asynchronous/mongo_client.py, pymongo/asynchronous/cursor.py, and pymongo/asynchronous/command_cursor.py. The first is one of the only low-level components that contains both async and sync code, while the others all use IS_SYNC for behavior unsupported by magic methods like __init__. Reading those files will give a good understanding of how the rest of the async code is structured.

The goal of the asynchronous sub-module is to be at feature parity with Motor, while the synchronous sub-module should be at parity with the existing PyMongo. Our test suite should cover almost all of the work of ensuring parity, so my biggest concern is code readability and maintainability.

All of our work going forward will have to be asynchronous, adding another layer of complexity and verbosity. Minimizing this additional overhead is a priority whenever possible, especially since it can be unclear when a method does or does not need to be asynchronous.

test/__init__.py Outdated
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently test only the synchronous code for parity with the existing synchronous PyMongo release. We'll need to also test the asynchronous code going forward. Our current plan is to add this test suite during the beta testing period. One proposal is to rewrite the test suite as async and translate it using unasync to sync, mirroring the actual PyMongo codebase.


[mypy-pymongo.synchronous.*,gridfs.synchronous.*]
warn_unused_ignores = false
disable_error_code = unused-coroutine
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several places use IS_SYNC to call either the sync or async version of a network-level operation. This causes linting errors in the synchronous codebase due to the async coroutine for the IS_SYNC = False path missing an await.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file and several others don't contain any async code, but still import files that do for either type hints or helper methods/constants. Moving them outside of the asynchronous sub-module causes typing issues and test failures due to pymongo.asynchronous.ClassName not being identical to pymongo.synchronous.ClassName.

synchro.py Outdated
Copy link
Contributor Author

@NoahStapp NoahStapp May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unasync is a very small and simple library that effectively just does token replacement for async/await keywords and known equivalents like __anext__ -> __next__. The names inside replacements here are mapped from async to sync on a token-by-token basis.

This relatively limited functionality means any asynchronous code that uses async-only features that do not have a synchronous parallel will require an IS_SYNC block to implement the synchronous version.

return cast(F, inner)


async def anext(cls: Any) -> Any:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anext was added in Python 3.10, so we need to use our own version here for compatibility with Python 3.8 and 3.9. anext is one of the special names that unasync automatically converts to the sync version, which is next in this case.

@blink1073
Copy link
Member

We need to add the subpackages to [tool.setuptools.packages.find] in pyproject.toml.

@blink1073
Copy link
Member

Can you please move synchro.py under tools?

@NoahStapp NoahStapp requested a review from blink1073 May 9, 2024 18:37
@blink1073
Copy link
Member

The release scripts work now, I updated the PR checklist.

test/utils.py Outdated
if time.time() - start > timeout:
raise AssertionError("Didn't ever %s" % success_description)

time.sleep(interval)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an asyncio.sleep?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants