I need to perform a calculation on a JaggedArray, but only if the elements in the JaggedArray are contained in another JaggedArray. I'd like to receive back a mask with True if the element is in the other JaggedArray, or False otherwise (i.e. the result should be a np.array). I've been looking for a way to do this in awkward-array version 0 or 1. However, I haven't been able to find a direct way to do so. The in operator doesn't appear to work, and I haven't found an equivalent for np.isin(...) (mentioned in this issue, but seems to have been closed without a replacement). To be concrete, I'm looking for:

import awkward as ak
import numpy as np

# Example arrays:
full_array = ak.fromiter([[1,2,3], [], [0,1,2,3,4,5]])
selected_array = ak.fromiter([[2], [], [7]])
# Desired output
desired_output = np.array([True, False, False])

For awkward0, I get:

>>> selected_array in full_array
~/.venv/lib/python3.7/site-packages/awkward/array/base.py in __bool__(self)
    138
    139     def __bool__(self):
--> 140         raise ValueError("The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()")
    141
    142     __nonzero__ = __bool__

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I also tried some other variations of things like selected_array.pad(1).fillna(-10).flatten() in full_array without success. I did find a workaround, but it seems rather indirect:

workaround_array = full_array.ones_like() * selected_array.pad(1).fillna(-100).flatten()
assert (desired_output == (workaround_array == full_array).any()).all()
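
To spell out what that workaround does, here is a minimal sketch in plain NumPy rather than awkward-array, assuming the jagged array is represented by a flat content array plus per-entry counts. The names flat_full, counts, and selected_padded are illustrative and not part of any awkward API.

```python
import numpy as np

# Flat representation of [[1, 2, 3], [], [0, 1, 2, 3, 4, 5]] (illustrative names).
flat_full = np.array([1, 2, 3, 0, 1, 2, 3, 4, 5])
counts = np.array([3, 0, 6])
# One selected value per entry; -100 is a sentinel for the empty entry,
# mirroring pad(1).fillna(-100) in the workaround.
selected_padded = np.array([2, -100, 7])

# Index of the entry that each flat element belongs to.
parents = np.repeat(np.arange(len(counts)), counts)
# Compare each element with the selected value of its own entry.
matches = flat_full == np.repeat(selected_padded, counts)
# An entry is True if at least one of its elements matched.
mask = np.bincount(parents[matches], minlength=len(counts)) > 0
```

Like the workaround, this only handles one selected value per entry, and it assumes the sentinel never occurs in the data.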

For awkward1, I get a result, but it appears to be wrong (or I'm not sure what it means).

>>> import awkward1 as ak1
>>> ak1_full_array = ak1.from_awkward0(full_array)
>>> ak1_selected_array = ak1.from_awkward0(selected_array)
>>> ak1_selected_array in ak1_full_array
True

Is there a more direct way of testing for elements in a JaggedArray? Am I somehow misusing in? What about the case of more than one value per JaggedArray entry, where the workaround doesn't work?

Thanks!

1 Answer

I have a method that is maybe at least more direct, but I still don't think it is great. I'd also be interested in seeing other solutions.

output = ak.fromiter([np.isin(selected_array[index], full_array[index]) 
                      for index in range(len(selected_array))])
# This is a JaggedArray with entries [[True], [], [False]]
# output.any() will then match desired_output
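
If the inputs can be treated as plain nested lists, the same per-entry idea also covers several selected values per entry. This is only a sketch: entries_overlap is a hypothetical helper, not an awkward-array function, and it loops in Python, so it is unlikely to be fast for large arrays.

```python
import numpy as np

def entries_overlap(full, selected):
    """Return a bool array: True where any selected value appears in the
    corresponding entry of full. Hypothetical helper for illustration."""
    return np.array(
        [np.isin(sel, ful).any() for sel, ful in zip(selected, full)],
        dtype=bool,
    )

full = [[1, 2, 3], [], [0, 1, 2, 3, 4, 5]]
single = entries_overlap(full, [[2], [], [7]])
# Several selected values per entry work the same way.
multiple = entries_overlap(full, [[2, 9], [], [7, 8]])
```

Empty entries come out as False because any() over an empty array is False.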
Will