Two similar array-in-array containment tests. One passes, the other raises a ValueError. Why?

Question

Moar noob Python questions

I have a list of NumPy arrays and want to test if two arrays are inside. Console log:

>>> theArray
[array([[[213, 742]]], dtype=int32), array([[[127, 740]],
       [[127, 741]],
       [[128, 742]],
       [[127, 741]]], dtype=int32)]

>>> pair[0]
array([[[213, 742]]], dtype=int32)

>>> pair[1]
array([[[124, 736]]], dtype=int32)

>>> pair[0] in theArray
True

>>> pair[1] in theArray
Traceback (most recent call last):
  File "...\pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<input>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

According to the debugger, pair[0] and pair[1] look identical in every respect except their contents. So how do these two cases differ? Why does the second one fail when the first does not?

Looks like the second array contains multiple arrays. if you look at array([[[ <- 3 [ — Alex, Oct 11 '17 at 21:35
`in` for lists assumes that `==` is an equivalence relation on its elements. `==` isn't an equivalence relation on NumPy arrays; it doesn't even return a boolean. Using `in` here is essentially meaningless. — user2357112, Oct 11 '17 at 21:39
I cannot reproduce this, actually, mine fails for both. I'm on numpy `'1.13.1'` — juanpa.arrivillaga, Oct 11 '17 at 21:44
@user2357112. Oh. Is there an elegant, Pythonic way to check whether a list contains a certain NumPy array, then? Basically, I treat the arrays as entries and try to do some processing with them — Alex, Oct 11 '17 at 21:44
@juanpa.arrivillaga: `pair[0]` would have to be the same object as `theArray[0]` to get True instead of a ValueError. — user2357112, Oct 11 '17 at 21:49

score 2 · Accepted Answer · answered Oct 11 '17 at 21:46

Using `in` at all here is a mistake.

`theArray` isn't an array. It's a list. `in` for lists assumes that `==` is an equivalence relation on its elements, but `==` isn't an equivalence relation on NumPy arrays; it doesn't even return a boolean. Using `in` here is essentially meaningless.

Making `theArray` an array wouldn't help, because `in` for arrays makes basically no sense.

`pair[0] in theArray` happens to not raise an exception because of an optimization lists perform. Lists try an `is` comparison before `==` for `in`, and `pair[0]` happens to be the exact same object as the first element of `theArray`, so the list never gets around to trying `==` and being confused by its return value.
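The identity shortcut can be reproduced in a few lines. This is a minimal sketch, assuming stand-in arrays `first` and `second` modeled on the console log and a hypothetical helper `raises_value_error`; none of these names come from the original post.

```python
import numpy as np

# Stand-ins for the two elements of the list shown in the question.
first = np.array([[[213, 742]]], dtype=np.int32)
second = np.array(
    [[[127, 740]], [[127, 741]], [[128, 742]], [[127, 741]]], dtype=np.int32
)
the_list = [first, second]


def raises_value_error(candidate, container):
    """Return True if `candidate in container` raises ValueError."""
    try:
        candidate in container
    except ValueError:
        return True
    return False


# `first` is the very same object as the_list[0], so the `is` check
# succeeds and `==` is never evaluated.
same_object_result = first in the_list

# An equal-valued copy is a different object, so the list falls back to
# `==`, gets a two-element boolean array, and bool() of that array raises.
copy_fails = raises_value_error(first.copy(), the_list)

# A different array of the same shape fails the same way.
other_fails = raises_value_error(
    np.array([[[124, 736]]], dtype=np.int32), the_list
)
```

Here `same_object_result` is `True`, while both `copy_fails` and `other_fails` are `True` as well, which is consistent with the comment above reporting a `ValueError` for both cases when the arrays are built separately.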

If you want to check whether a specific object `obj` is one of the elements of a list `l` (not just `==`-equivalent to one of the elements, but actually that object), use `any(obj is element for element in l)`.

If you want to check whether a NumPy array is "equal" to an array in a list of arrays in the sense of having the same shape and equal elements, use `any(numpy.array_equal(obj, element) for element in l)`.
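The two checks above can be wrapped in small helpers. This is only a sketch: the function names `contains_same_object` and `contains_equal_array` and the sample arrays are assumptions made for illustration, not part of the original answer.

```python
import numpy as np


def contains_same_object(obj, items):
    """True if `obj` itself (by identity) is one of `items`."""
    return any(obj is element for element in items)


def contains_equal_array(obj, items):
    """True if some element has the same shape and equal values as `obj`."""
    return any(np.array_equal(obj, element) for element in items)


# Sample data modeled on the question's console log.
first = np.array([[[213, 742]]], dtype=np.int32)
second = np.array(
    [[[127, 740]], [[127, 741]], [[128, 742]], [[127, 741]]], dtype=np.int32
)
items = [first, second]

equal_copy = first.copy()                            # same values, new object
missing = np.array([[[124, 736]]], dtype=np.int32)   # not in the list

copy_is_same_object = contains_same_object(equal_copy, items)
copy_is_equal = contains_equal_array(equal_copy, items)
missing_is_equal = contains_equal_array(missing, items)
```

Neither helper raises a `ValueError`, because `is` and `numpy.array_equal` each return a single boolean per element.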

Yes, `theArray` is a bad name for a list =) Thanks a lot for showing how the things work! The brains stopped boiling finally =) — Alex, Oct 11 '17 at 21:50

score 0 · Answer 2 · answered Oct 11 '17 at 21:48

I get the ValueError for both success and failure cases.

As @user2357112 said, the issue is that the elements of the list are NumPy arrays, so the `==` comparison that `in` depends on doesn't work.

But you can use a construction like:

any(np.all(x == pair[0]) for x in theArray)
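One caveat with the `np.all(x == ...)` form is that `==` broadcasts, so arrays of different shapes can still compare as "all equal". The sketch below uses made-up arrays `stacked` and `single` (assumptions for illustration only) to show how this differs from `numpy.array_equal`, which also requires matching shapes.

```python
import numpy as np

# Shape (2, 1, 2): the same point repeated twice.
stacked = np.array([[[1, 2]], [[1, 2]]], dtype=np.int32)
# Shape (1, 1, 2): the point once.
single = np.array([[[1, 2]]], dtype=np.int32)

# `single` is broadcast against `stacked`, so every comparison is True.
broadcast_match = bool(np.all(stacked == single))

# array_equal checks the shapes first, so these are not considered equal.
strict_match = np.array_equal(stacked, single)
```

Whether the broadcasting behavior is acceptable depends on what "contained in the list" should mean for the data at hand; if the shapes must match too, `numpy.array_equal` is the stricter choice.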

