Can JaggedArray count the innermost layer and return another JaggedArray?

Question

import awkward
so_jaggered = awkward.fromiter([[[0, 1, 2]], [[0, 1], [2, 3]], [[0, 1, 2], [3, 4]]])
so_jaggered.counts

Current version 0.12.13 returns

array([1, 2, 2])

However, I want to count only the innermost lists, which can be achieved with the following code:

import numpy as np
count_so_jaggered = np.array([[len(x) for x in trks] for trks in so_jaggered])

and the output looks like this:

array([list([3]), list([2, 2]), list([3, 2])], dtype=object)

But this has at least two drawbacks: it is slow, and the result has dtype=object. Are there any plans to support this feature?

Jim Pivarski · Answer 1 · 2019-10-19T07:51:01.477

You can do this:

awkward.JaggedArray(so_jaggered.starts, so_jaggered.stops,
                    so_jaggered.content.counts)

which returns

<JaggedArray [[3] [2 2] [3 2]] at 0x797274f58630>
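To see why reusing the outer starts and stops with the inner counts works, here is a minimal plain-Python sketch of the layout. It is not Awkward's implementation: the variable names are made up for illustration and the offsets are written out by hand for the array in the question. In the library these are NumPy arrays, so the subtraction is vectorized rather than a Python loop.

```python
# Nested data from the question:
# [[[0, 1, 2]], [[0, 1], [2, 3]], [[0, 1, 2], [3, 4]]]

# Outer level: which inner lists belong to each outer entry
# (hand-written here; this plays the role of so_jaggered.starts / .stops).
outer_starts = [0, 1, 3]
outer_stops = [1, 3, 5]

# Inner level: which flat values belong to each inner list
# (this plays the role of so_jaggered.content.starts / .stops).
inner_starts = [0, 3, 5, 7, 10]
inner_stops = [3, 5, 7, 10, 12]

# The length of each inner list is stop - start
# (this plays the role of so_jaggered.content.counts).
inner_counts = [stop - start for start, stop in zip(inner_starts, inner_stops)]

# Regroup the inner counts with the outer boundaries: same outer structure,
# but the content is now the counts instead of the original values.
nested_counts = [inner_counts[a:b] for a, b in zip(outer_starts, outer_stops)]

print(inner_counts)
print(nested_counts)
```

The outer boundaries never change; only the content they point into is swapped, which is what the JaggedArray constructor call above does.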

Also, there's a reducer method (like sum, min, max) that does this directly:

so_jaggered.count()

which returns

<JaggedArray [[3] [2 2] [3 2]] at 0x7b7fd8f53f60>

Notice that the property (returning the outermost number of entries) is called counts, with an "s", and requires no parentheses, while the reducer method (returning the innermost number of entries) is called count, without an "s", and requires parentheses. This was a design mistake, and Awkward 1.0 will replace both with a single reducer that has an axis parameter (axis=0 returns the outermost level, axis=-1 returns the innermost, and other values select the levels in between).

Also, reducers don't count missing values (None from MaskedArrays, or NaN in floating-point data), if you have any, which is another way that count differs from counts. This, too, should become an optional parameter to give the user more control. You found a weak point in the awkward-array interface.
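The difference between counting entries and counting non-missing entries can be illustrated with a small plain-Python sketch. This is only an analogy under stated assumptions, not how Awkward computes either result: is_missing, lengths, and count_valid are hypothetical helper names, and the sample data is invented.

```python
import math

def is_missing(x):
    # Hypothetical helper: treat None and floating-point NaN as missing.
    return x is None or (isinstance(x, float) and math.isnan(x))

def lengths(nested):
    # Length of each innermost list, whatever the values are.
    return [[len(inner) for inner in outer] for outer in nested]

def count_valid(nested):
    # Reducer-style count: skip missing values in each innermost list.
    return [[sum(1 for x in inner if not is_missing(x)) for inner in outer]
            for outer in nested]

# Invented sample with one None and one NaN.
data = [[[0.0, None, 2.0]], [[float("nan"), 1.0], [2.0, 3.0]]]

print(lengths(data))      # [[3], [2, 2]]
print(count_valid(data))  # [[2], [1, 2]]
```

If the data has no missing values, the two results agree, as they do for the array in the question.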

(If you read this earlier, I expanded on the answer because I didn't remember `.count()` at first.) — Jim Pivarski, Oct 19 '19 at 07:51
