14

I'm interested in getting the location of the minimum value in a 1-D NumPy array among the elements that meet a certain condition (in my case, a minimum threshold). For example:

import numpy as np

limit = 3
a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])

I'd like to effectively mask all numbers in a that are under the limit, such that the result of np.argmin would be 6. Is there a computationally cheap way to mask values that don't meet a condition and then apply np.argmin?

Divakar
triphook
  • Could you explain why in your question you said np.argmin is 6? In this case it would be 0. If you masked all the numbers less than 3, then you'd get [4,5,5,3,6,7,9,10]. The np.argmin of this is still not 6. – OneRaynyDay Jun 22 '16 at 16:08
  • @OneRaynyDay My guess: the masked array the OP had in mind is `[--, --, 4, 5, --, 5, 3, 6, 7, 9, 10]`. Then the smallest element is 3, which is on position 6 (starting to count with 0) of the masked array. This is what happens in MaxPowers' answer. – Qaswed Mar 19 '20 at 16:28

2 Answers

20

You could store the indices of the valid elements, use them to select those elements from a, take argmin() among the selected elements, and then index back into the stored indices to get the position in the original array. The implementation would look something like this -

valid_idx = np.where(a >= limit)[0]
out = valid_idx[a[valid_idx].argmin()]

Sample run -

In [32]: limit = 3
    ...: a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
    ...: 

In [33]: valid_idx = np.where(a >= limit)[0]

In [34]: valid_idx[a[valid_idx].argmin()]
Out[34]: 6
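
One case the two-liner above does not cover is an input where no element meets the threshold: the selection is then empty and argmin() raises a ValueError. Below is a minimal sketch of a guarded variant. The function name argmin_at_least and the choice to return None are assumptions made for illustration, not part of the original answer; np.flatnonzero(cond) is used as an equivalent of np.where(cond)[0] for 1-D input.

```python
import numpy as np

def argmin_at_least(a, limit):
    """Index of the smallest element of `a` that is >= limit, or None if none qualifies.

    Hypothetical helper: the name and the None return value are illustrative choices.
    """
    a = np.asarray(a)
    valid_idx = np.flatnonzero(a >= limit)  # positions that meet the threshold
    if valid_idx.size == 0:
        return None  # argmin() on an empty selection would raise ValueError
    # argmin over the selected values, mapped back to a position in `a`
    return int(valid_idx[a[valid_idx].argmin()])

a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
print(argmin_at_least(a, 3))    # 6
print(argmin_at_least(a, 100))  # None
```

Whether to return None, -1, or raise an exception in the empty case depends on the caller; None is used here only to keep the example short.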

Runtime test -

To benchmark performance, this section compares the masked-array solution from the other answer against the regular-array solution proposed earlier in this post, for two input sizes.

def masked_argmin(a, limit):  # regular-array solution: np.where + argmin
    valid_idx = np.where(a >= limit)[0]  # indices of elements that meet the limit
    return valid_idx[a[valid_idx].argmin()]  # map back to a position in a

In [52]: # Inputs
    ...: a = np.random.randint(0,1000,(10000))
    ...: limit = 500
    ...: 

In [53]: %timeit np.argmin(np.ma.MaskedArray(a, a<limit))
1000 loops, best of 3: 233 µs per loop

In [54]: %timeit masked_argmin(a,limit)
10000 loops, best of 3: 101 µs per loop

In [55]: # Inputs
    ...: a = np.random.randint(0,1000,(100000))
    ...: limit = 500
    ...: 

In [56]: %timeit np.argmin(np.ma.MaskedArray(a, a<limit))
1000 loops, best of 3: 1.73 ms per loop

In [57]: %timeit masked_argmin(a,limit)
1000 loops, best of 3: 1.03 ms per loop
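
The timings above come from an interactive IPython session. For anyone who wants to repeat the comparison as a plain script, here is a rough sketch that first checks both approaches return the same index and then times them with the standard timeit module. The function names where_argmin and ma_argmin, the random seed, and the repetition count are assumptions for illustration; absolute timings will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def where_argmin(a, limit):
    # Regular-array approach: keep the valid indices, argmin among them
    valid_idx = np.where(a >= limit)[0]
    return valid_idx[a[valid_idx].argmin()]

def ma_argmin(a, limit):
    # Masked-array approach: mask everything below the limit
    return np.ma.argmin(np.ma.MaskedArray(a, a < limit))

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
a = rng.integers(0, 1000, 10000)
limit = 500

# Both approaches should agree before their speed is compared
assert where_argmin(a, limit) == ma_argmin(a, limit)

t_where = timeit.timeit(lambda: where_argmin(a, limit), number=200)
t_ma = timeit.timeit(lambda: ma_argmin(a, limit), number=200)
print(f"where + argmin: {t_where / 200 * 1e6:.1f} µs per call")
print(f"masked array:   {t_ma / 200 * 1e6:.1f} µs per call")
```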
Divakar
11

This can be accomplished simply with NumPy's MaskedArray:

import numpy as np

limit = 3
a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
b = np.ma.MaskedArray(a, a<limit)
np.ma.argmin(b)    # == 6
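
A note on the mask convention may help here: in a MaskedArray, True in the mask means "ignore this element", which is why the condition passed is a<limit rather than a>=limit. The sketch below spells that out and adds a guard for the case where every element is masked, using count() to get the number of unmasked elements. The helper name masked_argmin_or_none and the None return value are assumptions for illustration only.

```python
import numpy as np

def masked_argmin_or_none(a, limit):
    """Masked-array argmin that returns None when nothing meets the limit.

    Hypothetical helper: the name and the None return value are illustrative.
    """
    a = np.asarray(a)
    b = np.ma.MaskedArray(a, mask=a < limit)  # True in the mask = element is ignored
    if b.count() == 0:  # count() is the number of unmasked elements
        return None
    return int(np.ma.argmin(b))  # index refers to the original, unmasked array

a = np.array([1, 2, 4, 5, 2, 5, 3, 6, 7, 9, 10])
print(masked_argmin_or_none(a, 3))    # 6
print(masked_argmin_or_none(a, 100))  # None
```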
Guillem
MaxPowers