Comparing numpy arrays with individual values

Question

I have a numpy array "target_tokens" with many values. I want to get a numpy array of the same shape, with 1. at every position where target_tokens holds one of a few specific values (e.g. a nine or a two).

This works (for the nine):

i_factor        = (target_tokens == 9).astype(np.float32)

Result:

[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  1.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]...

This does not work:

group           = [2, 9]
i_factor        = (target_tokens in group).astype(np.float32)

Result is:

i_factor        = (target_tokens in group).astype(np.float32) 
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Why is that, and how can I still achieve my result without big loops (in reality the group is bigger than just two values)?

Thx

Not sure if I got it, how about : `np.asarray(target_tokens)[:,None] == range(9)`? — Divakar, Oct 26 '17 at 16:24

score 2 · Answer 1 · edited Jun 20 '20 at 09:12

You can use a `bitwise operator`

Let's first simplify what you are trying to achieve with a more basic array:

a = np.array([1, 2, 7, 3, 9, 6])

and the numbers you want to check against:

g = [2, 9]

To get an array of 1s and 0s representing whether or not each element is equal to any of the elements in g, we can use the bitwise OR operator, '|':

((a == g[0]) | (a == g[1])).astype(np.float32)

which gives:

array([ 0.,  1.,  0.,  0.,  1.,  0.], dtype=float32)

This also works for higher-dimensional arrays, such as:

a = np.array([[1, 5, 7], [9, 3, 2], [5, 8, 9]])

which (with the same g) will give:

array([[ 0.,  0.,  0.],
       [ 1.,  0.,  1.],
       [ 0.,  0.,  1.]], dtype=float32)

Note that you can also achieve the same thing with np.bitwise_or().

If you want g to be of any size, chaining the '|' operator by hand no longer works unless you write a for-loop around it. To avoid an explicit for-loop, we can use np.bitwise_or.reduce on the list of comparison arrays.

So with the original array:

a = np.array([1, 2, 7, 3, 9, 6])

but now with a longer g:

g = [1, 7, 9, 4]

we can use the np.bitwise_or.reduce:

np.bitwise_or.reduce([a == e for e in g]).astype(np.float32)

which gives:

array([ 1.,  0.,  1.,  0.,  1.,  0.], dtype=float32)
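As a cross-check, here is a minimal sketch using the same a and g as above. Because every operand is a boolean array, np.logical_or.reduce should give the same mask as np.bitwise_or.reduce. The names masks, via_bitwise and via_logical are only illustrative. Note that the list comprehension is still a Python-level loop over g, just a short one.

```python
import numpy as np

a = np.array([1, 2, 7, 3, 9, 6])
g = [1, 7, 9, 4]

# One boolean array per value in g (a Python loop over g, not over a).
masks = [a == e for e in g]

# Both reductions OR the boolean arrays together along axis 0.
via_bitwise = np.bitwise_or.reduce(masks)
via_logical = np.logical_or.reduce(masks)

print(via_bitwise.astype(np.float32))
print(np.array_equal(via_bitwise, via_logical))
```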

Learned a lot and like it, even though the simpler solution np.isin() is what I implemented here. Still, thx! — Phillip Bock, Oct 27 '17 at 08:57

AGN Gazer · Answer 2 · 2017-10-26T17:17:22.870

A couple of options besides the bitwise-OR described in @JoeIddon's solution.

One solution is based on @Divakar's comment:

group = [1, 9]
a = np.array([1, 1, 2, 3, 4, 1, 9, 9, 2])
(np.asarray(group)[:,None] == a).sum(axis=0)

or, if you need np.float32 type:

(np.asarray(group)[:,None] == a).sum(axis=0, dtype=np.float32)

Another one is to use a list comprehension, testing equality for each value in the group, and sum the results along the first axis (passing a bare generator to np.sum is deprecated and ignores dtype, so build a list):

group = [1, 9]
a = np.array([1, 1, 2, 3, 4, 1, 9, 9, 2])
np.sum([a == g for g in group], axis=0)

or, if you need np.float32 type:

np.sum([a == g for g in group], axis=0, dtype=np.float32)

In both cases the answer will be:

array([1, 1, 0, 0, 0, 1, 1, 1, 0]) # or float32
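Two caveats about the sum-based approach, sketched below under assumptions of my own: the 2-D array and the repeated value in group are made up for illustration. First, summing counts a value once per occurrence in group, so a duplicate there produces a 2 rather than a 1. Second, indexing the group with [:,None] only lines up with a 1-D array. Putting the new axis on the array side and reducing with any() over the last axis avoids both.

```python
import numpy as np

# Hypothetical 2-D input and a group that contains a repeated value.
a = np.array([[1, 5, 7],
              [9, 3, 2]])
group = [9, 2, 9]

# a[..., None] has shape (2, 3, 1); comparing with shape (3,) broadcasts
# to (2, 3, 3): one boolean per (element of a, element of group) pair.
matches = a[..., None] == np.asarray(group)

counts = matches.sum(axis=-1)                    # counts duplicates in group
mask = matches.any(axis=-1).astype(np.float32)   # strictly 0. or 1.

print(counts)
print(mask)
```

The intermediate matches array has a.size * len(group) elements, so for a large group np.isin is likely the more memory-friendly choice.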

My sol uses `bitwise-or`! Nice use of `sum` though +1 — Joe Iddon, Oct 26 '17 at 17:02

score 2 · Accepted Answer · answered Oct 26 '17 at 16:58

Like `and` and `or`, `in` isn't allowed to broadcast. The Python language requires that `in` always returns a boolean. Also, only the right-hand operand can define what `in` means, and you used a list, not an array. You're getting the `in` behavior of Python lists.
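To make the error concrete, here is a minimal sketch with a small made-up stand-in for target_tokens (the values are assumptions, not the asker's data). A list's `in` test compares the left operand against each list element with ==, which yields a boolean array here, and then asks that array for a single truth value. That last step is what raises.

```python
import numpy as np

# Hypothetical stand-in for the asker's array.
target_tokens = np.array([[1, 2],
                          [9, 4]])
group = [2, 9]

# What the list's membership test computes for its first element:
# an elementwise comparison, i.e. a boolean array, not a single bool.
elementwise = (group[0] == target_tokens)

try:
    target_tokens in group   # the list needs bool(elementwise) here
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```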

NumPy's `in` operator is pretty weird and not useful for you. `in` for lists makes more sense, but still isn't what you need. You need `numpy.isin`, which behaves like an `in` test broadcast across its left operand (but not its right):

numpy.isin(target_tokens, group).astype(np.float32)
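A short runnable sketch of that call, borrowing the 2-D example array from the other answer as a stand-in for target_tokens (an assumption, since the asker's real data is not shown):

```python
import numpy as np

# Stand-in for the asker's array.
target_tokens = np.array([[1, 5, 7],
                          [9, 3, 2],
                          [5, 8, 9]])
group = [2, 9]

# np.isin returns a boolean array with the same shape as target_tokens.
mask = np.isin(target_tokens, group)
i_factor = mask.astype(np.float32)

print(i_factor)
```

The group can be any length without changing the call, which is what the question asks for.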
