Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort document fields in lexicographic/numeric order #2038

Open
AlekSi opened this issue Feb 22, 2023 · 2 comments · Fixed by #4223
Open

Sort document fields in lexicographic/numeric order #2038

AlekSi opened this issue Feb 22, 2023 · 2 comments · Fixed by #4223
Assignees
Labels
code/enhancement Some user-visible feature could work better community Issues and PRs assigned to community members
Milestone

Comments

@AlekSi
Copy link
Member

AlekSi commented Feb 22, 2023

What should be done?

https://www.mongodb.com/docs/manual/reference/operator/update/#behavior

  • "bar" comes before "foo"
  • "7" comes before "42"
  • "bar.7" comes before "bar.42"

Where?

func (d *Document) SortFieldsByKey() *Document {

Definition of Done

  • SortFieldsByKey added to places where it should be;
  • all handlers updated;
  • unit tests added/updated;
  • integration/compatibility tests added/updated;
  • spot refactorings done;
  • user documentation updated.
@AlekSi AlekSi added the code/chore Code maintenance improvements label Feb 22, 2023
@AlekSi AlekSi added the not ready Issues that are not ready to be worked on; PRs that should skip CI label May 1, 2023
@wazir-ahmed
Copy link
Contributor

wazir-ahmed commented Jan 4, 2024

@AlekSi, I faced this issue while working for PR 3931. We might need more than SortFieldsByKey() to solve this problem. Here is why,

  1. In MongoDB, not only each update operator process the fields in lexical/numeric order, the fields across different operators are processed in sorted order. For example,
db.t1.updateOne({_id: 1, y: 3, x: 2}, {$set: {b: 2, d: 4}, $inc: {a: 1}, $push: {e: 5}, $setOnInsert: {c: 3}}, {upsert: true});
// result
[
  {
    // filter fields are added as it was passed in the command without any ordering
    _id: 1,
    x: 2,
    y: 3,
    // fields mentioned in update operators are sorted and processed one by one.
    a: 1,
    b: 2,
    c: 3,
    d: 4,
    e: [ 5 ]
  }
]
  1. In case of nested fields, I see different behaviours with usage of dot notation and not using it.
db.t1.updateOne({_id: 2}, {$set: {"a.y": 2}, $inc: {"a.x": 1}, $setOnInsert: {"a.z": 3}}, {upsert: true});
// result
[
  {
    _id: 2,
    // fields with dot notations are sorted and processed (similar to previous case)
    a: { x: 1, y: 2, z: 3 }
  }
]

db.t1.updateOne({_id: 3}, {$set: {x: { c: 3, b: 2}}, $setOnInsert: {a: 1}}, {upsert: true});
// result
[
  {
    _id: 3,
    // sorting is only done for fields at root level
    // if the operator has as a field whose value is a document (in our case of `x`), then the value is treated as it is, sorting is not carried out with fields inside that document (value).
    a: 1,
    x: { c: 3, b: 2 }
  }
]
  1. In case of update, if operators has a mix of existing fields and new fields, then existing fields retains their position in the document. The new fields which are getting added to the document are sorted and append to the document.
db.t1.updateOne({_id: 3}, {$set: {"x.b": 4, "x.c": 5, "x.d": 6, "x.a": 3}});
// result
[
  {
    _id: 3,
    a: 1,
    x: {
      // existing nested fields x.c and x.b retained their position
      c: 5,
      b: 4,
      // x.a and x.d are sorted and appended
      a: 3,
      d: 6
    }
  }
]

Solution

  • In UpdateDocument(), instead of iterating through the operators and processing the fields and values in each operator as it gets scanned from the input,
    1. we should create a list of {field, operator, value} (taking all operators into consideration, at once)
    2. sort the items in the list based on field
    3. loop through the sorted list and process it one by one.

That would solve the above example scenarios.

wazir-ahmed added a commit to wazir-ahmed/FerretDB that referenced this issue Apr 5, 2024
wazir-ahmed added a commit to wazir-ahmed/FerretDB that referenced this issue Apr 8, 2024
wazir-ahmed added a commit to wazir-ahmed/FerretDB that referenced this issue Apr 8, 2024
wazir-ahmed added a commit to wazir-ahmed/FerretDB that referenced this issue Apr 10, 2024
@rumyantseva rumyantseva added the community Issues and PRs assigned to community members label Apr 11, 2024
@AlekSi AlekSi self-assigned this Apr 15, 2024
@AlekSi AlekSi added this to the v1.22.0 milestone Apr 15, 2024
@AlekSi AlekSi removed the not ready Issues that are not ready to be worked on; PRs that should skip CI label May 1, 2024
@AlekSi AlekSi added code/enhancement Some user-visible feature could work better and removed code/chore Code maintenance improvements labels May 15, 2024
@AlekSi AlekSi reopened this May 15, 2024
@AlekSi
Copy link
Member Author

AlekSi commented May 15, 2024

This issue is linked from internal/types/document_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code/enhancement Some user-visible feature could work better community Issues and PRs assigned to community members
Projects
Status: In progress
Development

Successfully merging a pull request may close this issue.

4 participants