Questions tagged [awkward-array]

Python toolkit for manipulating nested data structures as though they were NumPy arrays.

awkward is a Python library for computing array-at-a-time ("vectorized") operations on nested and irregular-length data structures. Its interface resembles NumPy's as much as possible, and it is implemented using NumPy.

It is intended as an interactive analysis toolkit for datasets that can't be reduced to rectilinear arrays. For example, a dataset of extrasolar planets can have arbitrarily many planets per star, and each planet has several attributes. A dataset of particle physics collisions contains collision event records, each with arbitrarily many electron records, muon records, photon records, etc. Instead of looping over these constructs in a general-purpose language, awkward-array lets the user slice them like (irregular) multidimensional arrays, project through columns, sum over variable-sized sets, etc.
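To make "sum over variable-sized sets" concrete, here is a minimal pure-Python sketch of the idea behind a jagged array: one flat content list plus offsets that mark where each sublist begins and ends. It is not the awkward-array API; the helper names `to_jagged`, `jagged_sum`, and `jagged_counts` and the sample data are hypothetical, chosen only for illustration.

```python
# Pure-Python sketch (NOT the awkward-array API): a jagged array stored as
# one flat content list plus offsets marking each sublist's boundaries.

def to_jagged(nested):
    """Flatten a list of lists into (content, offsets)."""
    content, offsets = [], [0]
    for sublist in nested:
        content.extend(sublist)
        offsets.append(len(content))
    return content, offsets

def jagged_counts(offsets):
    """Number of items in each variable-length sublist."""
    return [stop - start for start, stop in zip(offsets[:-1], offsets[1:])]

def jagged_sum(content, offsets):
    """Sum each variable-length sublist, giving one value per outer entry."""
    return [sum(content[start:stop])
            for start, stop in zip(offsets[:-1], offsets[1:])]

# Hypothetical data: the number of planets varies per star.
planets_per_star = [[1, 2, 3], [], [4, 5]]
content, offsets = to_jagged(planets_per_star)

print(content)                       # [1, 2, 3, 4, 5]
print(offsets)                       # [0, 3, 3, 5]
print(jagged_counts(offsets))        # [3, 0, 2]
print(jagged_sum(content, offsets))  # [6, 0, 9]
```

Because the data live in one flat list, per-sublist operations reduce to slicing by offsets, which is what makes array-at-a-time processing of irregular data possible.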

As an artificial example, consider this structure:

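import awkward  # awkward 0.x API, which provides fromiter
import numpy

# Mixed nesting: missing values (None), nested lists, and nested records.
complicated = awkward.fromiter(
    [[1.21, 4.84, None, 10.89, None],
     [19.36, [30.25]],
     [{"x": 36, "y": {"z": 49}}, None, {"x": 64, "y": {"z": 81}}]
    ])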

Once the data are in awkward form, we can apply NumPy operations, such as ufuncs:

numpy.sqrt(complicated).tolist()
# [[1.1, 2.2, None, 3.3000000000000003, None],
#  [4.4, [5.5]],
#  [{'x': 6.0, 'y': {'z': 7.0}}, None, {'x': 8.0, 'y': {'z': 9.0}}]]
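The result above keeps the input's nesting and its None entries and changes only the numbers. The pure-Python sketch below mimics that element-wise behavior with a recursive helper. It is an illustration only and not how awkward-array is implemented; the function name `map_nested` and the sample `records` are hypothetical.

```python
import math

def map_nested(func, value):
    """Apply func to every number inside nested lists and dicts, keeping None."""
    if value is None:
        return None
    if isinstance(value, list):
        return [map_nested(func, item) for item in value]
    if isinstance(value, dict):
        return {key: map_nested(func, item) for key, item in value.items()}
    return func(value)

# Hypothetical data built from perfect squares so the results are exact.
records = [[4, None], [9, [16]], [{"x": 36, "y": {"z": 49}}, None]]
result = map_nested(math.sqrt, records)

print(result)
# [[2.0, None], [3.0, [4.0]], [{'x': 6.0, 'y': {'z': 7.0}}, None]]
```

This version loops in Python one element at a time, which is exactly the cost an array-at-a-time library is meant to avoid.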

awkward-array also interfaces with a number of other libraries.

68 questions
1
vote
1 answer

ak.add function similar to np.add

Do we already have a function similar to np.add in awkward arrays? I am in a situation where I need to add them, and the "+" operator works fine for simple arrays but not for nested arrays. e.g. >>> ak.to_list(c1) [[], [], [], [], [0.944607075944902]] >>>…
Raman Khurana
  • 125
  • 1
  • 7
1
vote
1 answer

Plotting with different length of jagged arrays

I have a problem when trying to plot a 2D histogram or graph with jagged arrays of different lengths. Here is a simple example. Suppose there are 7 events of gen-level pT and its Et. pT = [ [46.8], [31.7], [21], [29.9], [13.9], [41.2], [15.7] ] Et = […
Jongho Lee
  • 13
  • 2
1
vote
0 answers

How to resolve TypingErrors when the property of an ak.Array subclass relies on Numba-compiled functions?

I've perused the docs and demo notebooks in the awkward1 repo (and it's entirely possible I missed something obvious), but I've strayed into unfamiliar territory and would like some help. Suppose I have points in a polar coordinate frame that I…
1
vote
0 answers

Appending columns to awkward.Table

I am trying to append new columns to an existing awkward Table. The README said that this is possible, but I can't find the way. For example I have a table df = awk.Table( p = ppkf2, th = thetakf2, ph…
AlexBaykov
  • 33
  • 3
1
vote
1 answer

how to copy a jagged array in awkward-array

In awkward0 I would like to separately persist various selections of a table. In pseudocode: X = awkward.Table(...) one_jet = X[X.n_jet == 1] two_jet = X[X.n_jet == 1] awkward.save(one_jet) awkward.save(two_jet) but I noticed the contents of any…
Lukas Heinrich
  • 223
  • 2
  • 6
1
vote
0 answers

Is there a way to stack JaggedArrays without fromiter()?

I have two 2D JaggedArrays of the same length (in axis=0) that I'd like to stack in axis=1. For example: a = awkward.fromiter([[0], [3, 4]]) b = awkward.fromiter([[1, 2], [5]]) and I want to get this JaggedArray: [ [ [0], [1, 2] ], [ [3, 4],…
mason
  • 11
  • 3
1
vote
2 answers

How to avoid "Too many open files" error when using uproot.daskframes to create daskframe from many ROOT files

I wanted to try using uproot to read a number of ROOT files with flat ROOT NTuples into a Dask frame. 214 files, 500kb each, about 8000 rows and 16 columns/variables in each. They easily fit in a pandas data frame in memory, but I am trying to learn…
Michael E.
  • 128
  • 1
  • 7
1
vote
1 answer

Propagating a selection on a subset of awkward-array back up

Is there a better way to do this logic? I want to propagate a selection, available only on a subset of inner elements, from the lower level upwards. Specifically, I am looking to have an event-level cut for oppositely charged muon-electron…
Andrzej Novák
  • 136
  • 1
  • 8
1
vote
1 answer

Content, starts and stops of ChunkedArray - built from lazyarray

I have some code that works fine for JaggedArrays extracting content, starts, stops, but I would like to run the same code on some ChunkedArrays, obtained from lazyarrays from uproot. Unfortunately, I obtained the following…
1
vote
1 answer

Arrow ListArray from pandas has very different structure from arrow array generated by awkward?

I encountered the following issue making some tests to demonstrate the usefulness of a pure pyarrow UDF in pyspark as compared to always going through pandas. import awkward import numpy import pandas import pyarrow counts =…
1
vote
1 answer

Awkward Array: Fancy indexing with boolean mask along named axis

I have a dataset of 2D audio data. These audio fragments differ in length, hence I'm using Awkward Array. Through a Boolean mask, I want to only return the parts containing speech. Table mask attempt import numpy as np import awkward as aw awk =…
NumesSanguis
  • 5,832
  • 6
  • 41
  • 76
1
vote
1 answer

May I see a short example of cutting on data to prepare it for histogramming in uproot?

I am using Python 3.6.5 (or higher) and I have successfully installed 'numpy', 'uproot', and 'awkward'. I have a previously made *.root file with a jagged NTuple which contains quite a large number of branches. This is particle physics data and…
1
vote
1 answer

Can JaggedArray count the innermost layer and return another JaggedArray?

so_jaggered = awkward.fromiter([[[0, 1, 2]], [[0, 1], [2, 3]], [[0, 1, 2], [3, 4]]]) so_jaggered.counts Current version 0.12.13 returns array([1, 2, 2]) However, I want to count only the innermost part, which can be achieved by following…
1
vote
1 answer

How to combine two uproot jagged arrays into one

I am using uproot with awkward-array and have two jagged arrays containing the list of electrons per event and muons per event. How can I combine these to get the list of leptons per event, i.e. concatenate along the inner axis? E.g. I have
1
vote
1 answer

Efficiently sorting and filtering a JaggedArray by another one

I have a JaggedArray (awkward.array.jagged.JaggedArray) that contains indices that point to positions in another JaggedArray. Both arrays have the same length, but each of the numpy.ndarrays that the JaggedArrays contain can be of different length.…
Clemens
  • 13
  • 3