Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Form manipulation issues with uproot 5.3.3 #1080

Open
nsmith- opened this issue Apr 15, 2024 · 8 comments · Fixed by scikit-hep/uproot5#1209
Open

Form manipulation issues with uproot 5.3.3 #1080

nsmith- opened this issue Apr 15, 2024 · 8 comments · Fixed by scikit-hep/uproot5#1209
Labels
bug Something isn't working

Comments

@nsmith-
Copy link
Member

nsmith- commented Apr 15, 2024

As found in #1076 with uproot 5.3.3 we have a change in the form passed to the form_mapping callable argument in uproot.dask:
with v5.3.2 on opening tests/samples/treemaker.root the list of top-level fields in the record array include:

        "Electrons",
        "Electrons/Electrons.fCoordinates.fPt",
        "Electrons/Electrons.fCoordinates.fEta",
        "Electrons/Electrons.fCoordinates.fPhi",
        "Electrons/Electrons.fCoordinates.fE",
        "Electrons_charge",
        "Electrons_iso",
        "Electrons_mediumID",
        "Electrons_MTW",
        "Electrons_passIso",
        "Electrons_tightID",

and in 5.3.3 we have

        "Electrons",
        "Electrons_charge",
        "Electrons_iso",
        "Electrons_mediumID",
        "Electrons_MTW",
        "Electrons_passIso",
        "Electrons_tightID",

it appears that the individual split sub-branches are no longer listed and only the top-level Electrons branch is provided. Similar issues are present for other branches. The current NanoEvents code parses the split branches to build the schema. This sounds like it may have been due to scikit-hep/uproot5#1189

@nsmith- nsmith- added the bug Something isn't working label Apr 15, 2024
@nsmith-
Copy link
Member Author

nsmith- commented Apr 15, 2024

If I revert scikit-hep/uproot5@f137305 then the coffea tests pass and both the main branch and its subbranches are listed again

@nsmith-
Copy link
Member Author

nsmith- commented Apr 15, 2024

I am not sure if the desired behavior is to consider the top-level branch and it's sub-branches as "duplicate" or not. I would prefer to only label things duplicate when they truly have the exact same fully-qualified name and point to different data. But open to other opinions. We can either fix this in uproot or rework coffea NanoEvents to build from the record array Electrons branch (and similar for others), assuming it is backwards compatible.
FYI @jpivarski @ioanaif

@jpivarski
Copy link
Contributor

Unfortunately, we've been presenting nested branch data as a flat RecordArray of fields for a long time, in arrays and iterate before dask. A RecordArray can have multiple fields with the same names, but it's my understanding that it breaks dask-awkward.

Could we add a disambiguating suffix (e.g. _0, _1, _2) to uproot.dask field names that would have been the same? If this is only an issue after the point when the form-remapping would take effect (what happened before the duplicates-fix?), maybe we can only do it in the case of no form-remapping?

It sounds like the duplicates-removed code state is affecting more people than the with-duplicates code. If so, I should revoke it, release a new version of Uproot, and figure out the right policy afterward.

@lgray
Copy link
Collaborator

lgray commented May 1, 2024

Is that release, with the change reverted, going to be made?

@matthewfeickert
Copy link
Contributor

This is still an issue in uproot v5.3.4:

$ uv pip install --upgrade -e . 'uproot<=5.3.4' && pytest --cov-report=xml --cov=coffea --deselect=test_taskvine tests/test_nanoevents_delphes.py -k test_listify
   Built file:///home/feickert/Code/GitHub/scikit-hep/coffea                                                                                                                                                Built 1 editable in 419ms
Resolved 64 packages in 34ms
Installed 2 packages in 2ms
 - coffea==2024.4.2.dev13+gb944b41c (from file:///home/feickert/Code/GitHub/scikit-hep/coffea)
 + coffea==2024.4.2.dev13+gb944b41c (from file:///home/feickert/Code/GitHub/scikit-hep/coffea)
 - uproot==5.3.2
 + uproot==5.3.4
=========================================================================================== test session starts ============================================================================================
platform linux -- Python 3.12.1, pytest-8.2.0, pluggy-1.5.0 -- /home/feickert/.pyenv/versions/3.12.1/envs/coffea-dev/bin/python
cachedir: .pytest_cache
Matplotlib: 3.8.4
Freetype: 2.6.1
rootdir: /home/feickert/Code/GitHub/scikit-hep/coffea
configfile: pyproject.toml
plugins: mpl-0.17.0, cov-5.0.0, typeguard-4.2.1, asyncio-0.23.6, mock-3.14.0
asyncio: mode=Mode.STRICT
collected 70 items / 69 deselected / 1 selected                                                                                                                                                            

tests/test_nanoevents_delphes.py::test_listify ERROR             

@jpivarski
Copy link
Contributor

If scikit-hep/uproot5#1209 fixes the bug, I'll merge it and immediately make a release.

@matthewfeickert
Copy link
Contributor

If scikit-hep/uproot5#1209 fixes the bug, I'll merge it and immediately make a release.

Thanks @jpivarski! That looks like that does it.

$ uv pip install --upgrade git+https://github.com/scikit-hep/uproot5@jpivarski/fixing_ignore_duplicates
$ pytest --cov-report=xml --cov=coffea --deselect=test_taskvine tests/test_nanoevents_delphes.py
=========================================================================================== test session starts ============================================================================================
platform linux -- Python 3.12.1, pytest-8.2.0, pluggy-1.5.0 -- /home/feickert/.pyenv/versions/3.12.1/envs/coffea-dev/bin/python
cachedir: .pytest_cache
Matplotlib: 3.8.4
Freetype: 2.6.1
rootdir: /home/feickert/Code/GitHub/scikit-hep/coffea
configfile: pyproject.toml
plugins: mpl-0.17.0, cov-5.0.0, typeguard-4.2.1, asyncio-0.23.6, mock-3.14.0
asyncio: mode=Mode.STRICT
collected 70 items                                                                                                                                                                                         

tests/test_nanoevents_delphes.py::test_listify PASSED                                                                                                                                                [  1%]
tests/test_nanoevents_delphes.py::test_collection_exists[CaloJet02] PASSED                                                                                                                           [  2%]
tests/test_nanoevents_delphes.py::test_collection_exists[CaloJet04] PASSED                                                                                                                           [  4%]
tests/test_nanoevents_delphes.py::test_collection_exists[CaloJet08] PASSED                                                                                                                           [  5%]
tests/test_nanoevents_delphes.py::test_collection_exists[CaloJet15] PASSED                                                                                                                           [  7%]
tests/test_nanoevents_delphes.py::test_collection_exists[EFlowNeutralHadron] PASSED                                                                                                                  [  8%]
tests/test_nanoevents_delphes.py::test_collection_exists[EFlowPhoton] PASSED                                                                                                                         [ 10%]
tests/test_nanoevents_delphes.py::test_collection_exists[EFlowTrack] PASSED                                                                                                                          [ 11%]
tests/test_nanoevents_delphes.py::test_collection_exists[Electron] PASSED                                                                                                                            [ 12%]
tests/test_nanoevents_delphes.py::test_collection_exists[Event] PASSED                                                                                                                               [ 14%]
tests/test_nanoevents_delphes.py::test_collection_exists[EventLHEF] PASSED                                                                                                                           [ 15%]
tests/test_nanoevents_delphes.py::test_collection_exists[GenJet] PASSED                                                                                                                              [ 17%]
tests/test_nanoevents_delphes.py::test_collection_exists[GenJet02] PASSED                                                                                                                            [ 18%]
tests/test_nanoevents_delphes.py::test_collection_exists[GenJet04] PASSED                                                                                                                            [ 20%]
tests/test_nanoevents_delphes.py::test_collection_exists[GenJet08] PASSED                                                                                                                            [ 21%]
tests/test_nanoevents_delphes.py::test_collection_exists[GenJet15] PASSED                                                                                                                            [ 22%]
tests/test_nanoevents_delphes.py::test_collection_exists[GenMissingET] PASSED                                                                                                                        [ 24%]
tests/test_nanoevents_delphes.py::test_collection_exists[Jet] PASSED                                                                                                                                 [ 25%]
tests/test_nanoevents_delphes.py::test_collection_exists[MissingET] PASSED                                                                                                                           [ 27%]
tests/test_nanoevents_delphes.py::test_collection_exists[Muon] PASSED                                                                                                                                [ 28%]
tests/test_nanoevents_delphes.py::test_collection_exists[Particle] PASSED                                                                                                                            [ 30%]
tests/test_nanoevents_delphes.py::test_collection_exists[ParticleFlowJet02] PASSED                                                                                                                   [ 31%]
tests/test_nanoevents_delphes.py::test_collection_exists[ParticleFlowJet04] PASSED                                                                                                                   [ 32%]
tests/test_nanoevents_delphes.py::test_collection_exists[ParticleFlowJet08] PASSED                                                                                                                   [ 34%]
tests/test_nanoevents_delphes.py::test_collection_exists[ParticleFlowJet15] PASSED                                                                                                                   [ 35%]
tests/test_nanoevents_delphes.py::test_collection_exists[Photon] PASSED                                                                                                                              [ 37%]
tests/test_nanoevents_delphes.py::test_collection_exists[ScalarHT] PASSED                                                                                                                            [ 38%]
tests/test_nanoevents_delphes.py::test_collection_exists[Tower] PASSED                                                                                                                               [ 40%]
tests/test_nanoevents_delphes.py::test_collection_exists[Track] PASSED                                                                                                                               [ 41%]
tests/test_nanoevents_delphes.py::test_collection_exists[TrackJet02] PASSED                                                                                                                          [ 42%]
tests/test_nanoevents_delphes.py::test_collection_exists[TrackJet04] PASSED                                                                                                                          [ 44%]
tests/test_nanoevents_delphes.py::test_collection_exists[TrackJet08] PASSED                                                                                                                          [ 45%]
tests/test_nanoevents_delphes.py::test_collection_exists[TrackJet15] PASSED                                                                                                                          [ 47%]
tests/test_nanoevents_delphes.py::test_collection_exists[WeightLHEF] PASSED                                                                                                                          [ 48%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[CaloJet02] PASSED                                                                                                                       [ 50%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[CaloJet04] PASSED                                                                                                                       [ 51%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[CaloJet08] PASSED                                                                                                                       [ 52%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[CaloJet15] PASSED                                                                                                                       [ 54%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[GenJet] PASSED                                                                                                                          [ 55%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[GenJet02] PASSED                                                                                                                        [ 57%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[GenJet04] PASSED                                                                                                                        [ 58%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[GenJet08] PASSED                                                                                                                        [ 60%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[GenJet15] PASSED                                                                                                                        [ 61%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[Jet] PASSED                                                                                                                             [ 62%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[ParticleFlowJet02] PASSED                                                                                                               [ 64%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[ParticleFlowJet04] PASSED                                                                                                               [ 65%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[ParticleFlowJet08] PASSED                                                                                                               [ 67%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[ParticleFlowJet15] PASSED                                                                                                               [ 68%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[TrackJet02] PASSED                                                                                                                      [ 70%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[TrackJet04] PASSED                                                                                                                      [ 71%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[TrackJet08] PASSED                                                                                                                      [ 72%]
tests/test_nanoevents_delphes.py::test_lorentz_vectorization[TrackJet15] PASSED                                                                                                                      [ 74%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[CaloJet02] PASSED                                                                                                                [ 75%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[CaloJet04] PASSED                                                                                                                [ 77%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[CaloJet08] PASSED                                                                                                                [ 78%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[CaloJet15] PASSED                                                                                                                [ 80%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[GenJet] PASSED                                                                                                                   [ 81%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[GenJet02] PASSED                                                                                                                 [ 82%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[GenJet04] PASSED                                                                                                                 [ 84%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[GenJet08] PASSED                                                                                                                 [ 85%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[GenJet15] PASSED                                                                                                                 [ 87%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[Jet] PASSED                                                                                                                      [ 88%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[ParticleFlowJet02] PASSED                                                                                                        [ 90%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[ParticleFlowJet04] PASSED                                                                                                        [ 91%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[ParticleFlowJet08] PASSED                                                                                                        [ 92%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[ParticleFlowJet15] PASSED                                                                                                        [ 94%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[TrackJet02] PASSED                                                                                                               [ 95%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[TrackJet04] PASSED                                                                                                               [ 97%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[TrackJet08] PASSED                                                                                                               [ 98%]
tests/test_nanoevents_delphes.py::test_nested_lorentz_vectorization[TrackJet15] PASSED 

Though I can run on CI to double check. I'll report on scikit-hep/uproot5#1209 once that passes.

@matthewfeickert
Copy link
Contributor

Yup, that did it: scikit-hep/uproot5#1209 (review)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants