
Suppose I have a column vector y of length n and a matrix X of size n x m. For each index i, I want to check whether y[i] appears in row i of X. What is the most efficient way to do this?

For example:

y = [1,2,3,4].T and

X =[[1, 2, 3],[3, 4, 5],[4, 3, 2],[2, 2, 2]]

Then the output should be

[1, 0, 1, 0] or [True, False, True, False] 

whichever is easier.

Of course, we could use a for loop to iterate over y and the rows of X together, but is there a more efficient way to do this?
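For reference, the loop-based baseline mentioned here might look like the minimal sketch below. The function name `row_membership_loop` is hypothetical, and the inputs are assumed to be NumPy arrays built from the example values.

```python
import numpy as np

# Example inputs from the question, assumed here to be NumPy arrays.
y = np.array([1, 2, 3, 4])
X = np.array([[1, 2, 3],
              [3, 4, 5],
              [4, 3, 2],
              [2, 2, 2]])

def row_membership_loop(y, X):
    """Hypothetical helper: out[i] is True when y[i] occurs in row i of X."""
    # zip pairs each element of y with the matching row of X;
    # `in` on a 1-D array tests whether any entry equals the value.
    return np.array([y_i in row for y_i, row in zip(y, X)])

result_loop = row_membership_loop(y, X)
print(result_loop)  # [ True False  True False]
```

This runs one Python-level iteration per row, which is the overhead a vectorized solution would avoid.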

kmario23
zycuber
  • Are you sure you don't want the output to be `np.array([True, False, True, False])`? – Eric Oct 21 '16 at 11:30

1 Answer


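A vectorized approach using broadcasting: y[:,None] reshapes y to shape (n, 1), so each element of y is compared against every entry in the matching row of X, and any(1) then reduces along each row.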

((X == y[:,None]).any(1)).astype(int)

Sample run:

In [41]: X        # Input 1
Out[41]: 
array([[1, 2, 3],
       [3, 4, 5],
       [4, 3, 2],
       [2, 2, 2]])

In [42]: y        # Input 2
Out[42]: array([1, 2, 3, 4])

In [43]: X == y[:,None] # Broadcasted comparison
Out[43]: 
array([[ True, False, False],
       [False, False, False],
       [False,  True, False],
       [False, False, False]], dtype=bool)

In [44]: (X == y[:,None]).any(1) # Check for any match along each row
Out[44]: array([ True, False,  True, False], dtype=bool)

In [45]: ((X == y[:,None]).any(1)).astype(int) # Convert to 1s and 0s
Out[45]: array([1, 0, 1, 0])
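
The session above can also be packaged as a small standalone script. This is only a sketch: the function name `row_membership` and the shape check are assumptions added for illustration, not part of the original answer.

```python
import numpy as np

def row_membership(y, X):
    """Hypothetical wrapper: out[i] is True when y[i] occurs in row i of X."""
    y = np.asarray(y).ravel()   # accept shape (n,) or (n, 1)
    X = np.asarray(X)
    # Assumed sanity check: one element of y per row of X.
    if X.ndim != 2 or X.shape[0] != y.shape[0]:
        raise ValueError("X must be 2-D with one row per element of y")
    # y[:, None] has shape (n, 1) and broadcasts across the m columns of X.
    return (X == y[:, None]).any(axis=1)

mask = row_membership([1, 2, 3, 4],
                      [[1, 2, 3],
                       [3, 4, 5],
                       [4, 3, 2],
                       [2, 2, 2]])
print(mask)              # [ True False  True False]
print(mask.astype(int))  # [1 0 1 0]
```

Returning the boolean mask and leaving the `astype(int)` conversion to the caller keeps both output forms from the question available.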
Divakar