I am using Uproot to access a ROOT tree in Python, and I am noticing a significant slowdown when I try to access one particular branch, wf, which contains an array of jagged arrays.
I am accessing the branches with the lazy/Awkward method and the step_size option:
LazyFileWF = uproot.lazy('../Layers9_Xe_Phantom102_run1.root:dstree;111', filter_name="wf", step_size=100)
I experience a 6 to 10 second slowdown when I access an entry in "LazyFileWF", but if I move on to the next consecutive entry, it takes only about 14 ms, up until the end of the step_size chunk. However, my script needs to select entries randomly, not sequentially, which means every entry will take about 8 seconds to access. I can access data from the other branches fairly quickly, with the exception of this one, and I want to find out why.
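One workaround I could imagine, as a minimal sketch rather than a confirmed fix: since consecutive entries within a chunk are fast, the random selection could be sorted and grouped by chunk so that each chunk is paid for only once. The helper name group_by_chunk is hypothetical, and the sketch assumes chunks are aligned to multiples of step_size, which I have not verified for uproot.lazy.

```python
import numpy as np


def group_by_chunk(entries, step_size):
    """Group entry indices by the step_size-sized chunk they fall into.

    Hypothetical helper: assumes chunk k covers entries
    [k * step_size, (k + 1) * step_size).
    """
    entries = np.asarray(entries, dtype=np.int64)
    if entries.size == 0:
        return {}
    # Sort so that entries belonging to the same chunk become adjacent.
    sorted_entries = np.sort(entries)
    chunk_ids = sorted_entries // step_size
    # Positions where the chunk id changes mark the group boundaries.
    boundaries = np.flatnonzero(np.diff(chunk_ids)) + 1
    groups = np.split(sorted_entries, boundaries)
    return {int(g[0] // step_size): g.tolist() for g in groups}


# Five randomly chosen entries fall into only three chunks of 100 entries.
groups = group_by_chunk([250, 3, 120, 7, 199], step_size=100)
```

Each group could then be read in one pass over its chunk, for example by indexing LazyFileWF.wf with the entries of one group after another, so the slow first access happens once per chunk instead of once per entry.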
By using uproot.open() and then .show(), I noticed that the interpretation of the branch is AsObjects(AsVector(True, AsVector(False, dtype('>f4')))).
I did some digging in the documentation and found this:
It mentions that I can use simplify to improve the slow deserialization.
So here is what I would like to know: based on the ROOT tree I have, can I use simplify to reduce the 8 second slowdown when accessing my branch? If so, how can I implement it? Is there a better way to read this branch?
I tried:
a = uproot.AsObjects.simplify(LazyFileWF.wf)
a
but I got the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_147439/260639244.py in <module>
5 LazyFileWF = uproot.lazy('../Layers9_Xe_Phantom102_run1.root:dstree;111', filter_name= "wf",step_size=100)
6 events.show(typename_width=35, interpretation_width= 60)
----> 7 a = uproot.AsObjects.simplify(LazyFileWF.wf)
8 a
~/anaconda3/envs/rapids-21.10/lib/python3.7/site-packages/uproot/interpretation/objects.py in simplify(self)
245 ``self``.
246 """
--> 247 if self._branch is not None:
248 try:
249 return self._model.strided_interpretation(
~/anaconda3/envs/rapids-21.10/lib/python3.7/site-packages/awkward/highlevel.py in __getattr__(self, where)
1129 raise AttributeError(
1130 "no field named {0}".format(repr(where))
-> 1131 + ak._util.exception_suffix(__file__)
1132 )
1133
AttributeError: no field named '_branch'
(https://github.com/scikit-hep/awkward-1.0/blob/1.7.0/src/awkward/highlevel.py#L1131)
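Reading the traceback, my guess is that calling simplify through the class made the Awkward array itself play the role of self, so the lookup of _branch went to the array's field access. A toy reproduction gives the same message. ToyInterpretation and ToyArray are hypothetical stand-ins, not uproot or Awkward classes:

```python
class ToyInterpretation:
    """Hypothetical stand-in for an interpretation object (not uproot's class)."""

    def __init__(self, branch=None):
        self._branch = branch

    def simplify(self):
        # Mirrors only the attribute access visible in the traceback.
        if self._branch is not None:
            return "simplified"
        return self


class ToyArray:
    """Hypothetical stand-in for an array that resolves attributes as field names."""

    def __getattr__(self, where):
        raise AttributeError("no field named {0}".format(repr(where)))


def call_on_wrong_object():
    try:
        # Calling through the class makes the argument act as `self`.
        ToyInterpretation.simplify(ToyArray())
    except AttributeError as err:
        return str(err)
    return None


message = call_on_wrong_object()
```

If that reading is right, simplify would have to be called on an interpretation instance rather than on the array, though I do not know whether that would change how this branch is read.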