Questions tagged [awkward-array]

Python toolkit for manipulating nested data structures as though they were NumPy arrays.

awkward is a Python library for computing array-at-a-time ("vectorized") operations on nested and irregular-length data structures. Its interface resembles NumPy as much as possible, and it is implemented using NumPy.

It is intended as an interactive analysis toolkit for datasets that can't be reduced to rectilinear arrays. For example, a dataset of extrasolar planets can have arbitrarily many planets per star, and each planet has several attributes. A dataset of particle physics collisions contains collision event records, each with arbitrarily many electron records, muon records, photon records, etc. Instead of looping over these constructs in a general-purpose language, awkward-array allows the user to slice them like (irregular) multidimensional arrays, project through columns, sum over variable-sized sets, etc.
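To make the idea of an irregular array concrete, here is a minimal, library-free sketch of one common way to store variable-length lists: a flat `content` list plus an `offsets` list marking where each row starts and stops. The names `content`, `offsets`, `row`, and `sum_per_row`, and the planet data, are illustrative assumptions rather than part of the awkward API. The library itself does this kind of work on whole NumPy arrays instead of Python loops.

```python
from itertools import accumulate

# Hypothetical data: planet masses for three stars with 2, 0, and 3 planets.
content = [1.0, 2.0, 3.0, 4.0, 5.0]   # all planets, flattened into one list
offsets = [0, 2, 2, 5]                # row i spans content[offsets[i]:offsets[i + 1]]


def row(i):
    """Return the variable-length list belonging to row i."""
    return content[offsets[i]:offsets[i + 1]]


def sum_per_row():
    """Sum each row from running totals over the flat content."""
    running = [0.0, *accumulate(content)]   # running[k] == sum(content[:k])
    return [running[stop] - running[start]
            for start, stop in zip(offsets, offsets[1:])]


rows = [row(i) for i in range(len(offsets) - 1)]
totals = sum_per_row()
print(rows)
print(totals)
```

Because the data stay flat, an empty row (the second star here) needs no padding: it is simply two equal consecutive offsets.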

As an artificial example, consider this structure:

import numpy
import awkward  # awkward 0.x API; newer releases spell this awkward.from_iter

complicated = awkward.fromiter(
    [[1.21, 4.84, None, 10.89, None],
     [19.36, [30.25]],
     [{"x": 36, "y": {"z": 49}}, None, {"x": 64, "y": {"z": 81}}]
    ])

Once in awkward form, we can apply NumPy operations, such as ufuncs:

numpy.sqrt(complicated).tolist()
# [[1.1, 2.2, None, 3.3000000000000003, None],
#  [4.4, [5.5]],
#  [{'x': 6.0, 'y': {'z': 7.0}}, None, {'x': 8.0, 'y': {'z': 9.0}}]]
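For intuition about the result above, the following plain-Python sketch applies a function to every number in a nested structure while leaving `None` in place and preserving the nesting. `map_leaves` is a hypothetical helper written for illustration only. It is not how awkward implements ufuncs, which operate on whole arrays at a time rather than recursing over Python objects.

```python
import math


def map_leaves(func, node):
    """Apply func to every number in nested lists/dicts, keeping None and shape."""
    if node is None:
        return None
    if isinstance(node, list):
        return [map_leaves(func, item) for item in node]
    if isinstance(node, dict):
        return {key: map_leaves(func, value) for key, value in node.items()}
    return func(node)


nested = [
    [1.21, 4.84, None, 10.89, None],
    [19.36, [30.25]],
    [{"x": 36, "y": {"z": 49}}, None, {"x": 64, "y": {"z": 81}}],
]

result = map_leaves(math.sqrt, nested)
print(result)
```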

awkward-array interfaces with , , , , and .

68 questions
2
votes
0 answers

Using Chunked array (Awkward lib) for fancy indexing or masking

I am loading a root file with uproot.lazyarrays() which produces a Table. I compute a function of this table which returns a JaggedArray whose length is equal to the length of the table. This is in the form of a ChunkedArray, and I would like to use…
2
votes
2 answers

Python collection of different sized arrays (Jagged arrays), Dask?

I have multiple 1-D NumPy arrays of different sizes representing audio data. Since they're different sizes (e.g., (8200,), (13246,), (61581,)), I cannot stack them as 1 array with NumPy. The size difference is too big to engage in 0-padding. I can keep…
NumesSanguis
2
votes
0 answers

Awkward arrays; choosing an item from each row according to a list of indices

So the challenge is this: given an awkward array with n rows and a list of n indices (i_1 to i_n), return a list containing element i_m of row_m for all rows. This could be done like: import awkward some_awkward_array =…
Clumsy cat
1
vote
1 answer

To_parquet giving error: __arrow_array__() got an unexpected keyword argument 'type'

I'm reading a root file using uproot and converting parts of it into a DataFrame using the arrays method. This works fine, until I try to save to parquet using the to_parquet method on the dataframe. Sample code is given below. # First three lines…
Gozmit97
1
vote
0 answers

Using `awkward` array to keep track of two different combinations of particles

I'm building on a previous question about the best way to use awkward efficiently with combinations of particles. Suppose I have a final state of 4 muons and my hypothesis is that these come from some resonance or particle (Higgs?) that decays to…
Matt Bellis
1
vote
2 answers

Adding first, second, etc element of an awkward array together

I am very new to awkward arrays (and Python in general) and just want to know how I could add the first elements of each array in an ak.array together. E.g., my_array = [[1,2,3], [4,5,6], [7,8,9].....] and I want 1 + 4 + 7...
1
vote
1 answer

How can I resolve the 'Basket data should be added to all branches' exception when writing to a TTree with Uproot?

Reading from an ntuple and saving into a new ntuple, after many iterations over events, getting the following: utils.writeTree(outf, ltree, outtreename, brs) File "/srv01/agrp/roybr/zprimeplusxntr/UprootFramework/utils.py", line 273, in…
Roy Brener
1
vote
1 answer

Creating a new Awkward array from indices

The problem I am facing is to create a new array from a set of indices. Namely, I have a set of particles and jets; for each jet there is a list of indices of which particles belong to the given jet. import awkward as ak import vector …
LauritsT
1
vote
1 answer

how to add new field in a 'zip' jagged array

I want to add a new field in an already zipped jagged array. For example, if I zip 4D info into a muons object, then I can call pt,eta,phi,charge like this: muons.Muon.pt. However, if I want to add a new field such as 2*pt into this muons object,…
Zhenxuan
1
vote
1 answer

Is there an analog of coffea.processor.PackedSelection() for jagged array masks?

So in a coffea processor, I've implemented a series of cuts on the object level using a dictionary of jagged truth arrays, where each item is just a cut; think cuts['etacut'] = abs(events.cscRechitClusterEta) > 1.9. And if I want to superimpose…
aaportel
1
vote
2 answers

filtering "events" in awkward-array

I am reading data from a file of "events". For each event, there is some number of "tracks". For each track there are a series of "variables". A stripped down version of the code (using awkward0 as awkward) looks like f =…
1
vote
1 answer

Using awkward-array with zip/unzip with two different physics objects

I'm trying to reproduce parts of the Higgs discovery in the Higgs --> 4 leptons channel with open data and making use of awkward. I can do it when the leptons are the same (e.g. 4 muons) with zip/unzip, but is there a way to do it in the 2 muon/2…
Matt Bellis
1
vote
0 answers

Comparing arrays for equality with condition on one entry

Let's say I have a "Truth" array: a = [1,2,3] and I want to compare it to b for equality with b = [1,2,3] awkward.all(a == b) Easy enough. Now we want this to be a general code which works for all of our data. But for one of these cases, have a…
1
vote
1 answer

Using `dask` to fill `boost_histograms` stored in class in parallel

I have a dask-boost_histogram question. I have a code structure as follows: I have a class defined in some script: class MyHist: def __init__(....): self.bh = None def make_hist(...): axis = bh.axis.Regular(....) …
1
vote
1 answer

TLorentz vector features in uproot4/vector when calculating invariant mass of a jet

I wish to sum all the 4-momenta of the constituents in a jet. In uproot3 (+ uproot3-methods) there was the functionality of creating a TLorentzVectorArray and just doing .sum(). So this worked fine: import uproot3 import awkward0 as ak input_file =…
LauritsT