Can't read a branch II #1138

Open
ivukotic opened this issue Feb 22, 2024 · 2 comments
Labels
bug (unverified) The problem described would be a bug, but needs to be triaged

Comments

@ivukotic

Here is a very simple piece of code that shows the issue:

import uproot
import awkward as ak
print(uproot.__version__)
print(ak.__version__)
fn='root://xcache.af.uchicago.edu:1094//'
fn+='root://dcgftp.usatlas.bnl.gov:1094//pnfs/usatlas.bnl.gov/LOCALGROUPDISK/rucio/data18_13TeV/04/9a/DAOD_PHYSLITE.34857549._000001.pool.root.1'
with uproot.open(fn) as f:
    tree=f['CollectionTree;1']
    a=tree["METAssoc_AnalysisMETAux./METAssoc_AnalysisMETAux.jetLink"].array()

Here is the output:

5.3.0rc2.dev4+g8386b3e
2.6.1
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File /analysis/uproot5/src/uproot/interpretation/objects.py:831, in AsStridedObjects.basket_array(self, data, byte_offsets, basket, branch, context, cursor_offset, library, options)
    830 try:
--> 831     output = data.view(dtype).reshape((-1, *shape))
    833 except ValueError as err:

ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[11], line 9
      7 with uproot.open(fn) as f:
      8     tree=f['CollectionTree;1']
----> 9     a=tree["METAssoc_AnalysisMETAux./METAssoc_AnalysisMETAux.jetLink"].array()

File /analysis/uproot5/src/uproot/behaviors/TBranch.py:1815, in TBranch.array(self, interpretation, entry_start, entry_stop, decompression_executor, interpretation_executor, array_cache, library, ak_add_doc)
   1812                 ranges_or_baskets.append((branch, basket_num, range_or_basket))
   1814 interp_options = {"ak_add_doc": ak_add_doc}
-> 1815 _ranges_or_baskets_to_arrays(
   1816     self,
   1817     ranges_or_baskets,
   1818     branchid_interpretation,
   1819     entry_start,
   1820     entry_stop,
   1821     decompression_executor,
   1822     interpretation_executor,
   1823     library,
   1824     arrays,
   1825     False,
   1826     interp_options,
   1827 )
   1829 _fix_asgrouped(
   1830     arrays,
   1831     expression_context,
   (...)
   1835     ak_add_doc,
   1836 )
   1838 if array_cache is not None:

File /analysis/uproot5/src/uproot/behaviors/TBranch.py:3143, in _ranges_or_baskets_to_arrays(hasbranches, ranges_or_baskets, branchid_interpretation, entry_start, entry_stop, decompression_executor, interpretation_executor, library, arrays, update_ranges_or_baskets, interp_options)
   3140     pass
   3142 elif isinstance(obj, tuple) and len(obj) == 3:
-> 3143     uproot.source.futures.delayed_raise(*obj)
   3145 else:
   3146     raise AssertionError(obj)

File /analysis/uproot5/src/uproot/source/futures.py:38, in delayed_raise(exception_class, exception_value, traceback)
     34 def delayed_raise(exception_class, exception_value, traceback):
     35     """
     36     Raise an exception from a background thread on the main thread.
     37     """
---> 38     raise exception_value.with_traceback(traceback)

File /analysis/uproot5/src/uproot/behaviors/TBranch.py:3085, in _ranges_or_baskets_to_arrays.<locals>.basket_to_array(basket)
   3082 context = dict(branch.context)
   3083 context["forth"] = forth_context[branch.cache_key]
-> 3085 basket_arrays[basket.basket_num] = interpretation.basket_array(
   3086     basket.data,
   3087     basket.byte_offsets,
   3088     basket,
   3089     branch,
   3090     context,
   3091     basket.member("fKeylen"),
   3092     library,
   3093     interp_options,
   3094 )
   3095 if basket.num_entries != len(basket_arrays[basket.basket_num]):
   3096     raise ValueError(
   3097         """basket {} in tree/branch {} has the wrong number of entries """
   3098         """(expected {}, obtained {}) when interpreted as {}
   (...)
   3106         )
   3107     )

File /analysis/uproot5/src/uproot/interpretation/jagged.py:196, in AsJagged.basket_array(self, data, byte_offsets, basket, branch, context, cursor_offset, library, options)
    193 mask[header_idxs] = False
    194 data = data[mask]
--> 196 content = self._content.basket_array(
    197     data, None, basket, branch, context, cursor_offset, library, options
    198 )
    200 byte_counts = byte_stops - byte_starts
    201 counts = fast_divide(byte_counts, self._content.itemsize)

File /analysis/uproot5/src/uproot/interpretation/objects.py:834, in AsStridedObjects.basket_array(self, data, byte_offsets, basket, branch, context, cursor_offset, library, options)
    831             output = data.view(dtype).reshape((-1, *shape))
    833         except ValueError as err:
--> 834             raise ValueError(
    835                 """basket {} in tree/branch {} has the wrong number of bytes ({}) """
    836                 """for interpretation {}
    837 in file {}""".format(
    838                     basket.basket_num,
    839                     branch.object_path,
    840                     len(data),
    841                     self,
    842                     branch.file.file_path,
    843                 )
    844             ) from err
    845         self.hook_after_basket_array(
    846             data=data,
    847             byte_offsets=byte_offsets,
   (...)
    854             options=options,
    855         )
    856         return output

ValueError: basket 24 in tree/branch /CollectionTree;1:METAssoc_AnalysisMETAux./METAssoc_AnalysisMETAux.jetLink has the wrong number of bytes (6358) for interpretation AsStridedObjects(Model_ElementLink_3c_DataVector_3c_xAOD_3a3a_Jet_5f_v1_3e3e__v1)
in file root://xcache.af.uchicago.edu:1094//root://dcgftp.usatlas.bnl.gov:1094//pnfs/usatlas.bnl.gov/LOCALGROUPDISK/rucio/data18_13TeV/04/9a/DAOD_PHYSLITE.34857549._000001.pool.root.1
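
The inner exception in the traceback can be reproduced with NumPy alone. The sketch below assumes a hypothetical 8-byte record made of two big-endian 32-bit integers; the dtype that uproot actually builds for this ElementLink model does not appear in the traceback and may differ. It only illustrates why a 6358-byte buffer cannot be viewed as fixed-size records when the record size does not divide the buffer length.

```python
import numpy as np

# Hypothetical record layout, assumed for illustration only: two big-endian
# 32-bit unsigned integers (8 bytes per record). The dtype uproot actually
# derives for the ElementLink model may be different.
record_dtype = np.dtype([("key", ">u4"), ("index", ">u4")])

# A byte buffer with the length reported in the error message.
data = np.zeros(6358, dtype=np.uint8)

# Bytes left over after filling whole records; nonzero means the view must fail.
leftover = len(data) % record_dtype.itemsize

try:
    data.view(record_dtype)
    view_failed = False
except ValueError as err:
    view_failed = True
    print(f"view failed: {err}")

# Trimming to a whole number of records makes the view succeed, which shows
# that the failure comes from the length mismatch, not from the byte contents.
trimmed = data[: len(data) - leftover].view(record_dtype)
print(leftover, view_failed, len(trimmed))
```

Under the assumed layout, bytes are left over after the last whole record, so the basket contents are not a plain sequence of fixed-width records of that size.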
@ivukotic ivukotic added the bug (unverified) The problem described would be a bug, but needs to be triaged label Feb 22, 2024
@alexander-held
Member

Hi, just to cross-reference things: this might be the same general issue we had with reading PHYSLITE files, which can be worked around with a feature in coffea that drops unreadable branches. See usatlas/analysisbase-dask#4 (comment)
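
As a rough illustration of the "drop unreadable branches" idea, here is a minimal sketch that works on any tree-like object supporting `tree[name].array()`, such as an uproot TTree. This is not the coffea feature itself; the helper name `split_readable` and the stand-in classes are hypothetical and exist only so the example runs without a ROOT file.

```python
def split_readable(tree, names):
    """Try to read each named branch in full; return (readable, unreadable).

    `tree` is anything that supports tree[name].array(), for example an
    uproot TTree. `unreadable` maps each failing name to its exception.
    """
    readable, unreadable = [], {}
    for name in names:
        try:
            # Read every basket: the failure in this issue is in basket 24,
            # so reading only the first few entries would not detect it.
            tree[name].array()
        except Exception as err:  # deliberately broad: any failure means "skip"
            unreadable[name] = err
        else:
            readable.append(name)
    return readable, unreadable


# Stand-in objects so the sketch runs without a ROOT file (hypothetical names).
class _FakeBranch:
    def __init__(self, ok):
        self.ok = ok

    def array(self):
        if not self.ok:
            raise ValueError("wrong number of bytes")
        return [1, 2, 3]


fake_tree = {"good": _FakeBranch(True), "bad": _FakeBranch(False)}
readable, unreadable = split_readable(fake_tree, ["good", "bad"])
print(readable, list(unreadable))
```

With a real file, one would presumably pass the opened tree together with `tree.keys()` and then load only the names in `readable`. Reading every branch in full just to probe it is expensive over a remote connection, so this is a diagnostic aid rather than something to run on every job.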

@ivukotic
Author

Just to document how these branches have been written: https://gitlab.cern.ch/atlas/athena/-/merge_requests/61371
So DAOD_PHYSLITE should have split level 1.

@jpivarski jpivarski added this to Important in Finalization Mar 21, 2024