Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve adding of object fields in SelectionResult objects #351

Open
pkausw opened this issue Nov 13, 2023 · 2 comments · May be fixed by #354
Open

Improve adding of object fields in SelectionResult objects #351

pkausw opened this issue Nov 13, 2023 · 2 comments · May be fixed by #354
Assignees
Labels
bug Something isn't working priority-medium Medium priority stuff

Comments

@pkausw
Copy link
Member

pkausw commented Nov 13, 2023

While writing unit tests for the SelectionResult class, I came upon this feature of the current implementation. Consider these two object fields:

result1 = {
  "objects": {
      "Jet": {
          "Jet": ak.Array([[0, 1], [2, 3], [0], [0]]),
      },
  },
}

result2 = {
  "objects": {
    "Muon": {
      "Muon": ak.Array([[0, 1], [2, 3], [0], [0]]),
      },
     "Jet": {
       "Jet": ak.Array([[0, 3], [2, 3], [0, 1], [0]]),
       "selectedJet": ak.Array([[0, 1], [2, 3], [0], [0]])
     } 
  },
}

After adding two SelectionResult instances derived from the dictionaries above (called bar), I see this

In [54]: bar = selection_result1 + selection_result2
In [55]: bar.objects
Out[55]: 
DotDict([('Jet',
          DotDict([('Jet',
                    <Array [[0, 3], [2, 3], [0, 1], [0]] type='4 * var * int64'>),
                   ('selectedJet',
                    <Array [[0, 1], [2, 3], [0], [0]] type='4 * var * int64'>)])),
         ('Muon',
          DotDict([('Muon',
                    <Array [[0, 1], [2, 3], [0], [0]] type='4 * var * int64'>)]))])

As you can see, the Jet/Jet section is simply updated (and thus overwritten) in the add operation. Imho, this is not the expected result - if two selections both produce the same collection (in this case Jet/Jet), the logical thing would be to select objects that survive both selections (i.e. do an AND between the indices).

Of course, we might also want to avoid this behavior altogether, since defining the same collection of objects in different selectors might cause confusion. On the other hand, this would limit the usage of the selectors - maybe a user wants to update a given collection of objects based on some other criteria.

Imho, there are two way we could go about this:

  • Supress the definition of the same field in different SelectionResults, i.e. raise some kind of Error
  • update the add operation to do a proper AND of the indices.

What do you think about this?

@pkausw pkausw added bug Something isn't working priority-medium Medium priority stuff labels Nov 13, 2023
@pkausw pkausw self-assigned this Nov 13, 2023
@riga
Copy link
Member

riga commented Nov 13, 2023

Unexpected behavior is definitely the worst enemy of potential framework users.

I think there is no automatic way to decide which of the indices to use, and also the AND of two sets of indices is undefined, i.e., while there might be an easy way to select only the intersection of two sets of indices, the order is important and we cannot infer a merged order. I think we should raise in these cases.

@pkausw
Copy link
Member Author

pkausw commented Nov 13, 2023

Thanks for the feedback @riga ! I'll implement something and add a unit test for this case as well 👍 We should also add this to the docs accordingly as soon as it's merged.

@pkausw pkausw linked a pull request Nov 15, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working priority-medium Medium priority stuff
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.

2 participants