Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Help prevent inconsistencies with select...for update row locks #190

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

GeeWee
Copy link
Contributor

@GeeWee GeeWee commented Mar 12, 2019

Partially fixes #184

This takes the steps outlined in #184 to prevent deadlocks and data inconsistencies.
It locks rows before updating them, on databases that support this. Unfortunately since the testing suite uses SQLite I haven't been able to write a test for this, and I was not comfortable changing the testing DB.

I've also made a change so that to() always fetches the newest value from the database, which is necessary because otherwise we can end up in situations where there's inconsistencies with concurrent connections. I've added a test for this that was failing before the changes, but is passing now.
I haven't gone through and checked whether or not other methods than to should have the same optimization applied.

@GeeWee GeeWee mentioned this pull request Mar 12, 2019
@GeeWee
Copy link
Contributor Author

GeeWee commented Mar 12, 2019

I've tried out my own branch on a project where there's about 70~ models with order that needs updating on every drag, and have not noticed a performance detriment with the select for update statement.

@gabn88
Copy link

gabn88 commented May 2, 2020

I have borrowed some of your insights for concurrency and preventing race conditions, especially the len(qs) on the get_ordering_queryset is a life saver. Thanks a lot!

(I was down the road with a LOCK TABLE raw SQL statement to lock the whole table, but this made the whole application crash in production with concurrent requests)

@GeeWee
Copy link
Contributor Author

GeeWee commented May 3, 2020

Glad it helped!

@gabn88
Copy link

gabn88 commented May 7, 2020

@GeeWee
In our production server the len(qs) takes approx 1 second, because that is the time it takes to do a count (4,5 milion rows, indexed). Is there a way to set the select_for_update without a count that you know of?

@imomaliev
Copy link
Collaborator

@gabn88 try qs.exists()

@GeeWee
Copy link
Contributor Author

GeeWee commented May 7, 2020

Just want to emphasize that it's been a little while since I wrote this. I was able to still create some race conditions with this setup, if I remember correctly, but it was much, much harder and much less frequent.

@imomaliev
Copy link
Collaborator

I was busy with work. I will have time to look into this in couple of days. Could you rebase, please?

@GeeWee
Copy link
Contributor Author

GeeWee commented May 10, 2020

I'm sorry I don't actually use django-ordered-model anymore, so I don't think I'll have time to look at this in the foreseeable future. Edits by maintainers are enabled though, so you should be able to rebase it if you have the time.

@shuckc
Copy link
Member

shuckc commented Apr 15, 2021

I've rebased and verified tests. I agree using select_for_update to take row-level locks over the currently selected order_with_respect_to partition looks like a plausible way to go to avoid race conditions without the cost of escalating to a table lock. It would be reassuring if we could test that SELECT .... FOR UPDATE SQL actually gets emitted.

It looks like current_order_value_in_db is refreshed with those locks held, so probably this is also a good idea and worth the extra query in the uncontended case.

Perhaps there is a chance we could see the simultaneous insert of a new row (from create/save operation) that sneaks into the same partition above our range, but would not be locked, and would then miss the increase resulting in an overlap. I think we probably should audit how some of these queries could run concurrently before merging. However the patch is at least current and available for anyone who wants it.

@shuckc shuckc changed the title Helps prevent deadlocks Help prevent inconsistencies by using select ... for update row locks Apr 15, 2021
@shuckc shuckc changed the title Help prevent inconsistencies by using select ... for update row locks Help prevent inconsistencies by using select...for update row locks Apr 15, 2021
@shuckc shuckc changed the title Help prevent inconsistencies by using select...for update row locks Help prevent inconsistencies with select...for update row locks Apr 15, 2021
@shuckc
Copy link
Member

shuckc commented Apr 15, 2021

If we refactor the lock-taking to a sub or wrapper and use it across save, delete, swap, to, above and below methods I think I'd be happy merging it.

Comment on lines +272 to +276
current_order_value_in_db = (
self.get_ordering_queryset()
.filter(pk=self.pk)
.values(self.order_field_name)
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this may still have other stale data in self that is not updated, possibly overwriting other changes. So either the final self.save() call should use update_fields, or you could call self.refresh_from_db() here instead of doing this query. The latter form updates all the data after acquiring the lock and should bring the instance up to date again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Race conditions
5 participants