Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transactions via an internal global session var #2569

Open
wants to merge 48 commits into
base: master
Choose a base branch
from

Conversation

juannyG
Copy link

@juannyG juannyG commented Sep 25, 2021

An implementation of one of my proposals, resolves #2248 - Add support for transactions using global sessions

  • Maintain state of session when entering and exiting transaction
  • Update any collection methods used in mongoengine that accept session
  • Update any database methods used in mongoengine that accept session
  • Cascades should be accounted for as those methods are wrappers which call the
    same low-level pymongo collection APIs updated above
  • Any signals that try to read or write should also be wrapped in the
    transaction

Still lots of questions...
* Maintain state of session when entering and exiting transaction
* Update any collection methods used in mongoengine that accept session
* Update any database methods used in mongoengine that accept session
* Cascades should be accounted for as those methods are wrappers which call the
  same low-level pymongo colletion APIs updated above
* Any signals that try to read or write should also be wrapped in the
  transaction
@juannyG
Copy link
Author

juannyG commented Sep 25, 2021

Something important for us to keep in mind: pymongo is extremely upfront that sessions are neither fork-safe nor thread-safe.

At minimum the docs would need to mirror this, but from an implementation stand point, minor protections could be added here by leveraging python thread's local storage.

That said...it wouldn't actually provide any real protections in a threaded/forked environment, given that behind the scenes pymongo is still the one in control.

@juannyG
Copy link
Author

juannyG commented Sep 30, 2021

Whoops - forgot to give travis the "replicaSet" treatment!

@juannyG
Copy link
Author

juannyG commented Sep 30, 2021

OK - I see what's going on - per the docs

In MongoDB 4.2 and earlier, you cannot create collections in transactions. Write operations that result in document inserts
(e.g. insert or update operations with upsert: true) must be on existing collections if run inside transactions.

Starting in MongoDB 4.4, you can create collections in transactions implicitly or explicitly.

So - the test suite is accurate! 😄 I'm testing against a more recent version of Mongo, which is why the tests pass locally.

I think it makes sense to bubble up the exception to a user that they're violating the expectations of the version of Mongo they're using. Should I just have a conditional assertion in my tests that this case should throw an exception for versions<4.4 but >=4.4 the operation is allowed and successful?

@juannyG
Copy link
Author

juannyG commented Sep 30, 2021

Digging through the repo, looks like this would be an acceptable approach.

* Mongo 3 doesn't support transactions
* Mongo 4 does, but 4.4 specifically started allowing colleciton creation in a transaction
This update accounts for the fact that at least one workflow
runs against Mongo 4.0.23
@irtzasuhail
Copy link

is there an estimate on when this might be merged into master?

pymongo added support for transactions in 3.7
@bagerard
Copy link
Collaborator

bagerard commented Nov 6, 2021

Apologize for the delay, I had a first look at the proposal and it seems to be going in the right direction.
Biggest problem I see right now is that it is not thread safe, using the run_in_transaction context manager in 2 different threads would be overwriting the global _session, and would be mixing things up badly.

From what I understand the Session is indeed documented as not thread safe, so you can't share a same Session among different threads, but you can let different threads instantiate their own session. So using thread locals should do the trick here. Each thread would create its own Session as needed. We would also need a test that simulates this (I would mix ~ 10 threads, with random delays, some of them saving objects, some of them aborting due to an exception), then assert at the end that only expected items were saved.

mongoengine/connection.py Outdated Show resolved Hide resolved
@juannyG
Copy link
Author

juannyG commented Nov 7, 2021

No apologies necessary! Thanks for the feedback - I'll try to address them as soon as I can.

Great idea about thread locals, btw!

@juannyG juannyG marked this pull request as draft November 7, 2021 13:20
Use an extension of `threading.locals` to handle initialized sessions within a
thread to ensure session isolation from other threads.

Also, given that pymongo supports nested sessions + nested transactions, I added
a deque to track the "current" session state and mimic this behavior.
@juannyG juannyG marked this pull request as ready for review November 10, 2021 23:36
mongoengine/connection.py Outdated Show resolved Hide resolved
Copy link
Contributor

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @juannyG, PyMongo maintainer here, this is great work! I'm adding a few thoughts on this feature and how it can be improved in the future.

  1. threadlocal is good but using contextvars would be better. Since contextvars requires Python 3.7+ we can rewrite it once mongoengine drops support for 3.6.

  2. It would be good to add an api to set a session without starting a transaction, that way users can use sessions without transactions. This can be added as a follow on feature. like:

    with run_with_session() as session:
        Model1.objects.get()
        Model2.objects.get()
  3. The current implementation will reuse the local session even when using a different connection alias. I believe this is fine since pymongo will raise an error but it would be great to document the limitation so users have a reference point when they encounter this error.

  4. I do think we should seriously consider abstracting session.with_transaction (added in pymongo 3.9) rather than start_transaction(), otherwise mongoengine users will need to write complex retry logic. We could add this version of run_in_transaction and later introduce a second method run_with_transaction but I believe it will be confusing to have two apis. The big difference api-wise is that with_transaction accepts a callable instead of being a context manager. I propose we merged this PR and then update the run_in_transaction api in a follow PR to use with_transaction, like this:

    def run_in_transaction(
        callback, alias=DEFAULT_CONNECTION_NAME, session_kwargs=None, transaction_kwargs=None
    ):
        conn = get_connection(alias)
        session_kwargs = session_kwargs or {}
        with conn.start_session(**session_kwargs) as session:
            transaction_kwargs = transaction_kwargs or {}
            transaction_kwargs["callback"] = callback
            _set_session(session)
            try:
                session.with_transaction(**transaction_kwargs):
            finally:
                _clear_session()

    Then the updated api would be used like this:

    def callback():
        a_doc.update(name="a2")
        b_doc.update(name="b2")
    
    run_in_transaction(callback)

@@ -264,6 +267,8 @@ def disconnect(alias=DEFAULT_CONNECTION_NAME):
if alias in _connection_settings:
del _connection_settings[alias]

_local_sessions.clear_all()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this? I would expect the session to only be removed from threadlocal when exiting the run_in_transaction block. Otherwise a session could unexpectedly disappear within a block.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah...in retrospect I think I was trying to protect from the very weird use case of: someone calls disconnect inside of a transaction.

This is definitely over board for that use case. I think, instead, it's OK to say "don't call disconnect in a transaction - this will result in undefined behavior"

@contextmanager
def run_in_transaction(
alias=DEFAULT_CONNECTION_NAME, session_kwargs=None, transaction_kwargs=None
):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a docstring here would be nice. (Could be done in a follow up PR.)

@@ -67,7 +71,7 @@ def count_documents(
def list_collection_names(db, include_system_collections=False):
"""Pymongo>3.7 deprecates collection_names in favour of list_collection_names"""
if PYMONGO_VERSION >= (3, 7):
collections = db.list_collection_names()
collections = db.list_collection_names(session=connection._get_session())
else:
collections = db.collection_names()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing session here.

assert "b2" == B.objects.get(id=b_doc.id).name

@requires_mongodb_gte_44
def test_collection_creation_via_upsersts_across_databases_in_transaction(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

upsersts -> upserts

pass
except pymongo.errors.OperationFailure as op_failure:
"""
If there's a TransientTransactionError, retry - the lock could not be acquired.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is automatically handled by session.with_transaction (added in pymongo 3.9) and is a good example of why we probably want to use with_transaction() rather than start_transaction(). Otherwise we'll be forcing mongoengine users into the painful act of diagnosing these errors and writing this retry logic.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - to be frank, this test was a pain 🤣

@juannyG
Copy link
Author

juannyG commented Aug 25, 2023

Thank you so much for the feedback @ShaneHarvey

Great call about using with_transaction. If you're OK with it, I might just rewrite it now to use with_transaction. Really the "rewrite" will be in the tests to adhere to the new contract. This feels like it might be the right way to kick off transactions in mongoengine - it's cleaner...unlike that unpleasant test I wrote :)

I do see an implication with this approach: if you want to use mongoengine transactions, you have to be using pymongo>=3.9 - is that OK?

* Don't clear out all sessions when disconnecting one alias
* Add missing session kwargs
* Fix `upserts` typo
Include warning about trying to run a transaction across different connections
@juannyG
Copy link
Author

juannyG commented Aug 26, 2023

I believe I've addressed the feedback provided, but do let me know if anything else comes to mind.

I'm more than happy to do a follow up PR modifying the API to use with_transaction as I completely agree that it is going to be significantly more friendly.

My only concern with doing it as a follow up is that it would be a non-backwards compatible change and the the implication I mentioned above (pymongo>=3.9 is a requirement).

Looking forward to your thoughts!

@idoshr
Copy link
Contributor

idoshr commented Sep 15, 2023

This PR #2768
Fix the issue of transaction session and transaction between DB

@ShaneHarvey
Copy link
Contributor

ShaneHarvey commented Nov 2, 2023

If you're OK with it, I might just rewrite it now to use with_transaction. Really the "rewrite" will be in the tests to adhere to the new contract. This feels like it might be the right way to kick off transactions in mongoengine - it's cleaner...unlike that unpleasant test I wrote :)

Yes I totally agree that it's better done in this PR to avoid breaking changes or code churn.

I do see an implication with this approach: if you want to use mongoengine transactions, you have to be using pymongo>=3.9 - is that OK?

This is okay. You could even check the pymongo version at runtime (like mongoengine does here) and give a nice error message if you like.

@juannyG
Copy link
Author

juannyG commented Nov 4, 2023

I've completed the modifications so that we now abstract with_transaction as opposed to start_transaction. If run_in_transaction is called and pymongo<3.9 is being used, we raise a mongoengine OperationError and unit tests are skipped when pymongo<3.9 is active.

The other additional change I made was to move run_in_transaction out of context_managers.py because it's no longer a context manager. Instead, I put it closer to the newly added session stuff in connection.py (and updated the tests accordingly, as well).

Let me know if anything else comes to mind!

Copy link
Contributor

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Thanks for all the effort @juannyG!

@bagerard what do you think?

is now how mongod is run for the test suites for transactions.
"""
connect(
"mongoenginetest2", alias="t2", host="127.0.0.1", directConnection=True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change still needed? It seems unrelated to the run_in_transaction feature.

Copy link
Author

@juannyG juannyG Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this test I needed to make a change in order for it to pass. It seemed to me that we needed to account for how pymongo deals with connections to replica sets (which is needed for transactions) as of >= 4.0 https://pymongo.readthedocs.io/en/stable/migrate-to-pymongo4.html#mongoclient

Without it, this assertion fails because mongo_connections["t2"].address[0] is equal to localhost

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh thanks for the explanation. The issue is that without directConnection=True, the client will discover the real server address of the replica set member so mongo_connections["t2"].address[0] is equal to localhost not 127.0.0.1. This seems fine, although the test could be rewritten to just do this instead of checking the address property:

assert mongo_connections["t1"] is not mongo_connections["t2"]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 great call. I went ahead and made this update as well

@bagerard
Copy link
Collaborator

again apologize for the delay, I'm finally getting back to this.

@ShaneHarvey I find the "callback api" (i.e session.with_transaction(callback)) really awkward, this is typically what context managers are for...

I agree that the retry logic is great to abstract from end users but Pymongo should provide the most flexible and pythonic approach. Using callback and lambda's also interfer with mypy so unless I'm missing something it really provides no advantages...

In fact if Pymongo would offer the "context manager API" (that would include the retry logic), we could easily write a "callback" wrapper in just a few lines of Python but we can't do the opposite and derive a context manager out of the "callback api" which is a pity...

@juannyG
Copy link
Author

juannyG commented Dec 18, 2023

@bagerard Thank you for the feedback.

Would you prefer the implementation I had prior to the callback modification - or is there something else you have in mind?

@ShaneHarvey
Copy link
Contributor

ShaneHarvey commented Dec 18, 2023

PyMongo does have a context manager API but the drawback is that is not possible to have retries with such an API as mentioned above. Retries are not possible because the context block can only ever be run once; a context manager can not re-run the block it contains. This is why we built the with_transaction api that accept a callback that can be executed multiple times if needed.

@bagerard
Copy link
Collaborator

I missed that it was actually re-calling the method 😬 , I understand now, thanks for (re)-clarifying.

In practice when you just chain a bunch of pure MongoDB operations, using the callback thing is a perfect fit. When used in a more involved application, like a complex web application (which is what we are building on top of MongoEngine at work), I don't think it would be a sane default to just retry a chain of calls, simply failing the overall transaction and API call sounds like a much much safer thing to do (otherwise you must be careful that first execution is not going to leave stuff behind that could interfer with the second execution). Other obvious issue with this automatic re-call, is that pymongo is silently doing that, without even logging the error or the fact that it's retrying 😬 , again makes sense for a chain of pure db operations, but a nightmare to debug in a more complex setup.

Long story short, I think it's a must to provide a simple context manager API for opening the transaction in MongoEngine, without any default retry logic. Next to that we can also support the "callback-api" but it's a nice to have IMO.

@ShaneHarvey
Copy link
Contributor

I don't think it would be a sane default to just retry a chain of calls

I understand this concern but I don't think this is as big of a concern as you say since the primary use case for transaction is a bunch of pure MongoDB operations. Doing other work inside the transaction extends its lifespan and increases the chance of write conflicts. Also, basically all users will need to write retry logic so the same problem still exists in either API.

Other obvious issue with this automatic re-call, is that pymongo is silently doing that, without even logging the error or the fact that it's retrying 😬 , again makes sense for a chain of pure db operations, but a nightmare to debug in a more complex setup.

We are actually working on adding logging into PyMongo right now: https://jira.mongodb.org/browse/PYTHON-3113

The huge benefit to using the callback approach is that it makes applications much more resilient to transient errors (which are very common when using transactions). Users will be much less likely to have to write custom and error prone transaction retry logic. By opting for a context api you'll end up with a lot of users scratching their heads diagnosing transient write conflict errors.

I would say that if we do end up adding two APIs, then the callback based one should still be the recommended approach in the docs and examples. The context based one could be a "power user" feature for those who want to customize their own retry behavior.

transaction_kwargs["callback"] = callback
_set_session(session)
try:
session.with_transaction(**transaction_kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyMongo's with_transaction returns the callback's value so I believe this should be return session.with_transaction(...). Otherwise it becomes unwieldy to return a value.

@ShaneHarvey
Copy link
Contributor

One other idea: if you're very concerned about the retry behavior, one alternative is to expose a configurable knob like this:

run_in_transaction(callback, retry_enabled=False/True)

That would at least let you expose a single txn api instead of 2 different ones. I'm ambivalent but it's worth a thought.

@bagerard
Copy link
Collaborator

I'm expecting MongoEngine to be used mostly in web applications and users will be opening their transactions quite early, wrapping all their business logic. In the middle of the chain of calls we may have disk operations, network calls, use of globals/threads/threadlocals, stateful objects, etc. Safe repetition of call is not something you get out of the box so I can't recommend that setup in a general purpose ORM like MongoEngine.

That being said, I don't have much experience with transactions in Mongo, is there anything in MongoDB transactions that is different from SQL transactions and that would make them fail more often? We don't see such callback api with repeated calls for dealing with transactions in extremely popular framework like SQLAlchemy or Django, how do you explain this?

@elormkoto
Copy link

Hi yall. is there any update on this MR? Is there anything I can do to help push this along?

@juannyG
Copy link
Author

juannyG commented Mar 5, 2024

Hi yall. is there any update on this MR? Is there anything I can do to help push this along?

I was waiting to see if there would be a response to @bagerard 's last question - it seemed to me that we were still waiting on alignment. All that said: @bagerard to keep things moving, I can roll this back to a state that I think reflects more of what you were expecting wrt "simple context manager without retry logic". More than happy to get that done.

The test cases also include examples of how a user would have to deal with retrying transaction failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Transaction API
10 participants