Issue when accessing objects by index with latest Mongo versions #2748
Comments
Weird indeed. I'll have to look into it, but in the meantime note that you can cast the queryset to a list as a workaround.
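A minimal sketch of that workaround, assuming the result set is small enough to hold in memory. The helper name `split_into_pages` is hypothetical and not part of MongoEngine or flask-mongoengine; it only shows the idea of materializing the results once and slicing the resulting list, instead of indexing the queryset repeatedly.

```python
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def split_into_pages(results: Iterable[T], per_page: int) -> List[List[T]]:
    """Take one snapshot of `results` and cut it into pages.

    Hypothetical helper: with a queryset, list() iterates it a single
    time, so every page is sliced from the same ordering.
    """
    if per_page < 1:
        raise ValueError("per_page must be >= 1")
    snapshot = list(results)  # single pass over the iterable
    return [snapshot[i:i + per_page] for i in range(0, len(snapshot), per_page)]
```

With a queryset this would presumably be called as `split_into_pages(User.objects.order_by('foo'), 10)`. The trade-off is that all matching documents are loaded at once.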
Just looked into this, and the issue is not in MongoEngine but in PyMongo. For index access, MongoEngine relies on PyMongo's cursor behavior. In this particular snippet you are sorting by a field that doesn't exist on the documents, and this seems to be causing the inconsistencies, see below. It's unusual to access a cursor by index. I understand the behavior is odd, but usually you just fire the query, use skip & limit, and then iterate over the results. @ShaneHarvey Could you elaborate on this? Is this expected behavior?
@ShaneHarvey do you have any idea / is this a known behavior?
I agree that this looks like a bug in the server, but it's possible it could be a known behavior change. I've reported it here and am waiting for their response: https://jira.mongodb.org/browse/SERVER-87430
Here's my repro using only pymongo, not mongoengine:
from pymongo import MongoClient

client = MongoClient()
coll = client.test.test
version = client.server_info()['version']
print(f'MongoDB version: {version}')

# Start from a clean collection of 20 docs; none of them has a 'missing' field.
coll.drop()
coll.insert_many([{"_id": i} for i in range(20)])

print('Find docs with a single query:')
print([doc["_id"] for doc in coll.find(sort={'missing': 1})])

# One query per document, skipping i results each time.
print('Find docs with the same query with skip+limit:')
docs = []
for i in range(20):
    docs.append(coll.find_one(sort={'missing': 1}, skip=i))
print([doc["_id"] for doc in docs])

# Same access pattern expressed as an aggregation pipeline.
print('Find docs using aggregation with skip+limit:')
docs = []
for i in range(20):
    docs.append(list(coll.aggregate([{"$sort": {'missing': 1}}, {"$skip": i}, {"$limit": 1}]))[0])
print([doc["_id"] for doc in docs])
On MongoDB <4.4:
On MongoDB 4.4+:
I've done a little more investigation and found that, unfortunately, this was an intentional change in MongoDB 4.4 (see SERVER-51498). The behavior is documented here:
https://www.mongodb.com/docs/manual/reference/operator/aggregation/sort/#sort-consistency
Taking the advice in the docs, you would need to add "_id" to the sort, like this:
users = User.objects.order_by('foo', '_id')
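For the plain PyMongo case, the same advice could be sketched as a small helper that appends `_id` as a tie-breaker to a sort specification. The name `with_id_tiebreaker` is hypothetical and not part of PyMongo. It builds the list of `(field, direction)` pairs that `find(sort=...)` accepts, and the ascending direction for `_id` is an arbitrary choice.

```python
from typing import List, Sequence, Tuple

SortSpec = List[Tuple[str, int]]


def with_id_tiebreaker(sort: Sequence[Tuple[str, int]]) -> SortSpec:
    """Return a copy of `sort` that ends with a unique key.

    Hypothetical helper: if "_id" is not already one of the sort keys,
    append it (ascending) so documents with equal or missing values in
    the other keys still have a well-defined order.
    """
    spec = list(sort)
    if not any(field == "_id" for field, _direction in spec):
        spec.append(("_id", 1))
    return spec
```

In the repro above this would presumably be used as `coll.find(sort=with_id_tiebreaker([("missing", 1)]))`, with the same spec passed to every `skip`-based query.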
Oh indeed! Thanks a lot for taking the time to look into this issue! Maybe it would be worth adding a note to the MongoEngine queryset docs, but otherwise I think my issue has been answered. Thank you both again.
I may be misunderstanding, but in this case the issue is not that the order of returned documents is unpredictable when there are duplicate values in the sorted field (unless you are saying this may occur within the same cursor instance). The same cursor instance is returning the same documents multiple times when we use index access on it, i.e. cursor[j] (and the collection isn't being altered while we iterate). Indeed, sorting on _id will fix it, but the current user experience is quite unexpected. (It's rather odd to use index access, so I'm not necessarily worried or looking for a fix, but I wanted to make sure we were aligned on the observation.)
Hello!
We've upgraded to the latest mongo, pymongo and mongoengine versions recently.
It seems that we sometimes get duplicate results when paginating with slices or index access on objects with an order_by clause.
FYI, we use the pagination from flask-mongoengine, which uses slices to build page items.
It seems that accessing an object by slice (or index) is not consistent when the order_by clause is applied to a non-existent (or sometimes empty) field.
Here is a small script to show the index iteration issue, by counting the number of occurrences of the same object when iterating.
Results show that with index access, the same elements are returned multiple times instead of only once (as in the for loop case).
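The counting approach described above can be sketched roughly as follows. This is not the original script from the report: `count_repeats` and its parameters are hypothetical, and the accessor is passed in as a plain callable so that the counting logic runs without a database.

```python
from collections import Counter
from typing import Callable, Dict, Hashable


def count_repeats(get_by_index: Callable[[int], Hashable], total: int) -> Dict[Hashable, int]:
    """Count how often each key comes back when reading indices 0..total-1.

    Hypothetical helper: `get_by_index(i)` should return an identifier
    (for example a document id) for position i. Only keys seen more than
    once are reported, so an empty dict means no duplicates.
    """
    counts = Counter(get_by_index(i) for i in range(total))
    return {key: n for key, n in counts.items() if n > 1}
```

Against a queryset, the accessor would presumably be something like `lambda i: users[i].id`, with `total` set to the number of matching documents.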
Tested with mongoengine 0.20.0 and 0.27.0.
Pagination results seem to have been inconsistent only since Mongo 4.4.
Thank you for your support.
Please, let me know if this is not a mongoengine issue or if I'm using something incorrectly.