First, I think you mean you have
jet = ak.Array({
"constituents": ak.Array([[0, 1, 3, 4], [2]]),
"energy": ak.Array([1.2, 3.4])
})
because I'd expect the "constituents" indexes to be 0-based, not 1-based. But even if they are 1-based (that is, [[1, 2, 4, 5], [3]]), just start by subtracting 1:
>>> jet.constituents - 1
<Array [[0, 1, 3, 4], [2]] type='2 * var * int64'>
The biggest problem here is that these indexes are nested one level deeper than the particles_p4 that you want to slice. You want the 0, 1, 3, 4, and also the 2, in your jet.constituents to be indexes into the non-nested list, particles_p4.
If we just arbitrarily flatten them (axis=-1 means to squash the last/deepest dimension):
>>> ak.flatten(jet.constituents, axis=-1)
<Array [0, 1, 3, 4, 2] type='5 * int64'>
these indexes are exactly what you'd need to apply to particles_p4. Here, I'm using the current (2.x) version of Awkward Array, so that I can use .show(), but the integer-array slice works in any version of Awkward Array.
>>> particles_p4[ak.flatten(jet.constituents, axis=-1)].show(type=True)
type: 5 * Momentum4D[
x: int64,
y: int64,
z: int64,
tau: int64
]
[{x: 1, y: 1, z: 1, tau: 1},
{x: 2, y: 2, z: 2, tau: 2},
{x: 4, y: 4, z: 4, tau: 4},
{x: 5, y: 5, z: 5, tau: 5},
{x: 3, y: 3, z: 3, tau: 3}]
If we take that as a partial solution, all we need to do now is put the nested structure back into the result.
ak.flatten has an opposite, ak.unflatten, which takes a flat array and adds nestedness from an array of list lengths. You can get the list lengths from the original jet.constituents with ak.num. Again, I'll use axis=-1 so that this answer will generalize to deeper nestings.
>>> lengths = ak.num(jet.constituents, axis=-1)
>>> lengths
<Array [4, 1] type='2 * int64'>
>>> rearranged = particles_p4[ak.flatten(jet.constituents, axis=-1)]
>>> rearranged
<MomentumArray4D [{x: 1, y: 1, z: 1, tau: 1}, ..., {...}] type='5 * Momentu...'>
>>> result = ak.unflatten(rearranged, lengths, axis=-1)
>>> result.show(type=True)
type: 2 * var * Momentum4D[
x: int64,
y: int64,
z: int64,
tau: int64
]
[[{x: 1, y: 1, z: 1, tau: 1}, {x: 2, ...}, ..., {x: 5, y: 5, z: 5, tau: 5}],
[{x: 3, y: 3, z: 3, tau: 3}]]
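To make the three steps explicit, here is a minimal plain-Python model of the flatten, index, unflatten recipe. It does not use Awkward Array at all; the helper names (flatten_last, num_last, unflatten) are made up for illustration and only mimic what ak.flatten, ak.num, and ak.unflatten do for this two-level case. The particles are plain dicts standing in for the Momentum4D records.

```python
# Plain-Python model of the recipe; helper names are illustrative only,
# not part of the Awkward Array API.

def flatten_last(nested):
    """Concatenate the inner lists (like ak.flatten with axis=-1 on two levels)."""
    return [index for inner in nested for index in inner]

def num_last(nested):
    """Length of each inner list (like ak.num with axis=-1 on two levels)."""
    return [len(inner) for inner in nested]

def unflatten(flat, lengths):
    """Cut a flat list back into sublists of the given lengths."""
    result, start = [], 0
    for n in lengths:
        result.append(flat[start:start + n])
        start += n
    return result

# Stand-ins for particles_p4 and jet.constituents from the answer.
particles = [{"x": i, "y": i, "z": i, "tau": i} for i in range(1, 6)]
constituents = [[0, 1, 3, 4], [2]]

flat_indexes = flatten_last(constituents)          # one flat list of indexes
lengths = num_last(constituents)                   # remember the jet sizes
rearranged = [particles[i] for i in flat_indexes]  # integer-array slice
result = unflatten(rearranged, lengths)            # put the nesting back
```

The point of keeping lengths around is that the flat slice loses the jet boundaries, and unflatten is the only step that restores them.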
For the bonus round, if all of the above arrays (particles_p4 and jet) were arrays of lists, where each list represents one event, rather than an array representing one event, then the above would hold. I'm taking it as a given that the length of particles_p4_by_event is equal to the length of jet_by_event, and that the values of jet_by_event.constituents are indexes within each event in particles_p4_by_event (not global indexes; each event should restart at zero). That is, all of your arrays agree on how many events there are, and each event is handled individually, with no cross-over between events.
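The assumptions in that paragraph can be spelled out as a plain-Python sketch of the intended per-event semantics. This is not the Awkward Array call sequence, just a model of what the result should contain; the function name pick_constituents and the sample data are hypothetical.

```python
# Plain-Python model of the per-event behavior; the function name and the
# sample data below are hypothetical, chosen only for illustration.

def pick_constituents(particles_by_event, constituents_by_event):
    """For each event, replace each jet's local indexes by that event's particles."""
    if len(particles_by_event) != len(constituents_by_event):
        raise ValueError("arrays must agree on the number of events")
    return [
        [[particles[i] for i in jet] for jet in jets]
        for particles, jets in zip(particles_by_event, constituents_by_event)
    ]

# Two events; indexes restart at zero in each event (local, not global).
particles_by_event = [
    ["a0", "a1", "a2", "a3", "a4"],  # event 0 has five particles
    ["b0", "b1"],                    # event 1 has two particles
]
constituents_by_event = [
    [[0, 1, 3, 4], [2]],  # event 0: two jets
    [[1, 0]],             # event 1: one jet
]

picked = pick_constituents(particles_by_event, constituents_by_event)
```

Because each event is zipped with its own list of jets, an index in event 1 can never reach a particle in event 0, which is the "no cross-over" condition.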