1

I have an array in which I want to find the indices of all the smallest elements. I have tried the following method:

distance = [2,3,2,5,4,7,6]

a = distance.index(min(distance))

This returns 0, which is the index of the first occurrence of the smallest distance. However, I want to find all such occurrences, here 0 and 2. How can I do this in Python?

Nathaniel
Nia.T

5 Answers

4

Use np.where to get all the indexes that match a given value:

import numpy as np

distance = np.array([2,3,2,5,4,7,6])

np.where(distance == np.min(distance))[0]

Out[1]: array([0, 2])
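
To see why this works, it may help to look at the intermediate boolean mask. The sketch below is an illustration rather than part of the original answer; `np.flatnonzero` is shown as an alternative that, for a 1-D array, should return the same indices as `np.where(...)[0]`.

```python
import numpy as np

distance = np.array([2, 3, 2, 5, 4, 7, 6])

# Element-wise comparison yields a boolean mask, one entry per element.
mask = distance == distance.min()
print(mask.tolist())  # [True, False, True, False, False, False, False]

# np.where(mask) returns a tuple with one index array per dimension;
# [0] selects the index array for the first (and only) axis.
idx_where = np.where(mask)[0]

# np.flatnonzero returns the indices of the True entries directly.
idx_flat = np.flatnonzero(mask)

print(idx_where.tolist(), idx_flat.tolist())  # [0, 2] [0, 2]
```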

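NumPy outperforms the other methods as the array grows: its timings stay nearly flat, while the pure-Python methods slow down roughly in proportion to the array length: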

Results of a timeit comparison test, adapted from Yannic Hamann's code below

                     Array length (multiples of the 7-element list)
Method               1       10      20      50     100    1000
Sorted Enumerate     2.47  16.291  33.643                      
List Comprehension  1.058   4.745   8.843  24.792              
Numpy               5.212   5.562   5.931    6.22  6.441  6.055
Defaultdict         2.376   9.061  16.116  39.299              

Plot of timing results
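
A comparison along these lines could be sketched as follows. This is not the exact script behind the table above; the multipliers, the `number=1000` repetition count, and the function names are assumptions chosen to keep the run short, so absolute numbers will differ.

```python
import timeit

import numpy as np


def list_comprehension(distance):
    min_value = min(distance)
    return [i for i, n in enumerate(distance) if n == min_value]


def numpy_where(np_distance):
    return np.where(np_distance == np_distance.min())[0]


base = [2, 3, 2, 5, 4, 7, 6]
timings = {}

for factor in (1, 10, 100):  # hypothetical multipliers of the 7-element list
    data = base * factor
    np_data = np.array(data)  # conversion happens outside the timed call

    # Both approaches should agree before we bother timing them.
    assert list_comprehension(data) == numpy_where(np_data).tolist()

    t_list = timeit.timeit(lambda: list_comprehension(data), number=1000)
    t_np = timeit.timeit(lambda: numpy_where(np_data), number=1000)
    timings[factor] = (t_list, t_np)
    print(f"x{factor:>4}: list {t_list:.4f}s  numpy {t_np:.4f}s")
```

Note that the array is converted with `np.array` before timing; if the data starts out as a Python list, the conversion cost would need to be counted as well.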

Community
Nathaniel
  • Interesting. I would not have expected that. I wonder why that is the case? – Nathaniel Mar 15 '19 at 20:38
  • When you compare distance, of the type ``numpy.ndarray``, with an integer, it always evaluates the full array. – Yannic Hamann Mar 15 '19 at 20:47
  • I think this brings up an important point: numpy shines in efficient computation with very large arrays. For small arrays, numpy may not be the most efficient, as you have clearly pointed out. But numpy is much more scalable than most other methods. [This discussion](https://stackoverflow.com/questions/993984/what-are-the-advantages-of-numpy-over-regular-python-lists) contains some relevant explanation. – Nathaniel Mar 15 '19 at 20:54
  • No problem. I made the plot above in Excel, because it was quick. – Nathaniel Mar 16 '19 at 06:46
  • 1
    I re-made the plot using Matplotlib. – Nathaniel Mar 16 '19 at 07:07
1

You may enumerate array elements and extract their indexes if the condition holds:

min_value = min(distance)
[i for i,n in enumerate(distance) if n==min_value]
# [0, 2]
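
If this is needed in more than one place, the same idea can be wrapped in a small function. The sketch below is an illustration only; the name `all_min_indices` is hypothetical, and the empty-input guard is an added assumption, since `min()` raises `ValueError` on an empty sequence.

```python
def all_min_indices(values):
    """Return the indices of every occurrence of the minimum of `values`."""
    if len(values) == 0:
        return []  # assumption: treat empty input as "no indices"
    min_value = min(values)  # computed once, outside the comprehension
    return [i for i, n in enumerate(values) if n == min_value]


print(all_min_indices([2, 3, 2, 5, 4, 7, 6]))  # [0, 2]
print(all_min_indices([]))  # []
```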
DYZ
  • 1
    Why would you calculate the minimum for every iteration? – G_M Mar 15 '19 at 19:49
  • @YannicHamann Nothing surprising at all. I ran these tests before posting my answer. NumPy is not a silver bullet. – DYZ Mar 15 '19 at 20:32
1

Surprisingly, the NumPy answer seems to be the slowest.

Update: this depends on the size of the input list.

import numpy as np
import timeit
from collections import defaultdict


def weird_function_so_bad_to_read(distance):
    se = sorted(enumerate(distance), key=lambda x: x[1])
    smallest_numb = se[0][1]  # raises IndexError if the list is empty
    return [idx for idx, num in se if num == smallest_numb]  # indices only, not (index, value) pairs
    # t1 = 1.8322973089525476


def pythonic_way(distance):
    min_value = min(distance)
    return [i for i, n in enumerate(distance) if n == min_value]
    # t2 = 0.8458914929069579


def fastest_dont_even_have_to_measure(np_distance):
    # np_distance = np.array([2, 3, 2, 5, 4, 7, 6])
    min_v = np.min(np_distance)
    return np.where(np_distance == min_v)[0]
    # t3 = 4.247801031917334


def dd_answer_was_my_first_guess_too(distance):
    d = defaultdict(list)  # a dictionary where every value is a list by default

    for idx, num in enumerate(distance):
        d[num].append(idx)  # for each number append the value of the index

    return d.get(min(distance))
    # t4 = 1.8876687170704827


def wrapper(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped


distance = [2, 3, 2, 5, 4, 7, 6]

t1 = wrapper(weird_function_so_bad_to_read, distance)
t2 = wrapper(pythonic_way, distance)
t3 = wrapper(fastest_dont_even_have_to_measure, np.array(distance))
t4 = wrapper(dd_answer_was_my_first_guess_too, distance)

print(timeit.timeit(t1))
print(timeit.timeit(t2))
print(timeit.timeit(t3))
print(timeit.timeit(t4))
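
As a variation on the measurement itself, a lambda can stand in for the `wrapper` helper, and `timeit.repeat` can be used to reduce noise. This is a minimal sketch, not a replacement for the benchmark above; the `number` and `repeat` values are arbitrary choices.

```python
import timeit


def pythonic_way(distance):
    min_value = min(distance)
    return [i for i, n in enumerate(distance) if n == min_value]


distance = [2, 3, 2, 5, 4, 7, 6]
calls = 10_000  # arbitrary; timeit.timeit defaults to 1,000,000 calls

# timeit.repeat returns one total time per repetition;
# the minimum is usually the least disturbed measurement.
runs = timeit.repeat(lambda: pythonic_way(distance), number=calls, repeat=5)
best = min(runs)
print(f"best of 5: {best / calls * 1e6:.3f} microseconds per call")
```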
Yannic Hamann
  • 1
    I ran some additional tests using your code which show how numpy performs well even as the array size increases dramatically. – Nathaniel Mar 15 '19 at 21:36
0

You can also use the following list comprehension:

distance = [2,3,2,5,4,7,6]
min_distance = min(distance)
[index for index, val in enumerate(distance) if val == min_distance]
# [0, 2]
Samuel Nde
  • How is this different from my previously posted answer? – DYZ Mar 15 '19 at 19:45
  • 1
    @DYZ I think we both posted the answer at the same time. Or do you have any reason to suggest that my answer came from yours? What if I turned around and asked you the same question? – Samuel Nde Mar 15 '19 at 19:49
  • 4
    Calculating the minimum for every iteration seems wasteful in both of your answers. – G_M Mar 15 '19 at 19:50
0

We can use an interim dict that maps each value in the list to the indices where it occurs, and then look up the minimum value of distance in it. We will also use a simple for-loop here so that you can understand what is happening step by step.

from collections import defaultdict

d = defaultdict(list) # a dictionary where every value is a list by default

for idx, num in enumerate(distance):
    d[num].append(idx) # for each number append the value of the index

d.get(min(distance)) # fetch the indices of the min number from our dict

[0, 2]
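
If only the indices of the minimum are needed, the dictionary can be avoided by tracking the running minimum in the same loop. This is a different technique from the one above, sketched here for comparison; the function name `min_indices_single_pass` is hypothetical.

```python
def min_indices_single_pass(values):
    """Collect the indices of the smallest value in one pass over `values`."""
    indices = []
    min_value = None
    for idx, num in enumerate(values):
        if not indices or num < min_value:
            # First element, or a new smaller value: restart the index list.
            min_value = num
            indices = [idx]
        elif num == min_value:
            # Another occurrence of the current minimum.
            indices.append(idx)
    return indices


print(min_indices_single_pass([2, 3, 2, 5, 4, 7, 6]))  # [0, 2]
```

Unlike the dictionary version, this keeps only one list of indices in memory and returns an empty list for empty input.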
gold_cy