3

I need the minimal distance between elements of an array.

I did:

numpy.min(numpy.ediff1d(numpy.sort(x)))

Is there a better / more efficient / more elegant / faster way of doing this?

Charles Brunet

3 Answers

4

If you are after sheer speed, here are some timings:

In [13]: a = np.random.rand(1000)

In [14]: %timeit np.sort(a)
10000 loops, best of 3: 31.9 us per loop

In [15]: %timeit np.ediff1d(a)
100000 loops, best of 3: 15.2 us per loop

In [16]: %timeit np.diff(a)
100000 loops, best of 3: 7.76 us per loop

In [17]: %timeit np.min(a)
100000 loops, best of 3: 3.19 us per loop

In [18]: %timeit np.unique(a)
10000 loops, best of 3: 53.8 us per loop

I timed unique in the hope that it would be about as fast as sort, so that you could return early, skipping the calls to diff and min, whenever the unique array is shorter than the original (which would mean the answer is 0). But unique is slower than sort by more than diff and min cost together, so the overhead outweighs any gain.

So it seems the only potential improvement I can offer is replacing ediff1d with diff:

In [19]: %timeit np.min(np.diff(np.sort(a)))
10000 loops, best of 3: 47.7 us per loop

In [20]: %timeit np.min(np.ediff1d(np.sort(a)))
10000 loops, best of 3: 57.1 us per loop
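
Put together, the faster variant could be wrapped in a small helper. This is only a sketch: the name `min_gap` and the decision to raise on fewer than two values are assumptions of mine, not something taken from the timings above.

```python
import numpy as np

def min_gap(values):
    """Smallest distance between any two elements (hypothetical helper)."""
    arr = np.asarray(values).ravel()
    if arr.size < 2:
        # Assumption: treat fewer than two values as an error.
        raise ValueError("need at least two values")
    # After sorting, the closest pair is adjacent,
    # so the consecutive differences are enough.
    return np.min(np.diff(np.sort(arr)))

print(min_gap([1, 12, 3, 8, 4, 9, 29]))  # prints 1
```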
Jaime
    Interesting. I was expecting `ediff1d` to be faster than `diff`, since it is for 1d arrays, but apparently `diff` is faster. – Charles Brunet Apr 11 '13 at 17:23
2

Your current approach is already a sound one. Once the array is sorted, the two closest values are always adjacent, so the smallest entry of the difference array returned by ediff1d is the minimal distance. Here's a suggestion:

Because the array is sorted in ascending order, every difference is non-negative, so we can implement ediff1d manually and break as soon as a difference is zero. That way, if you have the sorted array x:

[1, 1, 2, 3, 4, 5, 6, 7, ... , n]

Rather than going through all n elements, the custom ediff1d function breaks early, covers only the first two elements, and returns [0]. This also shrinks the difference array, reducing the number of iterations required by your min call.

Here is an example without the use of numpy:

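x = [1, 12, 3, 8, 4, 1, 4, 9, 1, 29, 210, 313, 12]

def ediff1d_custom(x):
    """Consecutive differences of a sorted sequence, stopping at the first zero."""
    darr = []

    for i in range(len(x) - 1):
        diff = x[i + 1] - x[i]
        darr.append(diff)

        # Sorted input has no negative differences, so 0 is already the minimum.
        if diff == 0:
            break

    return darr

print(min(ediff1d_custom(sorted(x))))  # prints 0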
Daniel Li
0
try:
    min(x[i + 1] - x[i] for i in range(len(x) - 1))
except ValueError:
    print('Array contains fewer than two values.')
Vladimir Chub
    `x` must still be sorted before you can do this, since this is basically just a non-numpy `min(ediff1d(x))`. – askewchan Apr 11 '13 at 17:09
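
Taking that comment into account, a numpy-free version could sort first and then compare neighbours. A minimal sketch, assuming Python 3; `min_adjacent_gap` is an illustrative name, not something from the answer.

```python
def min_adjacent_gap(values):
    """Minimal distance between elements, without numpy (hypothetical helper)."""
    ordered = sorted(values)
    if len(ordered) < 2:
        # Assumption: fewer than two values is reported as an error.
        raise ValueError("need at least two values")
    # Pair each element with its successor in sorted order.
    return min(b - a for a, b in zip(ordered, ordered[1:]))

x = [1, 12, 3, 8, 4, 1, 4, 9, 1, 29, 210, 313, 12]
print(min_adjacent_gap(x))           # prints 0
print(min_adjacent_gap([10, 3, 7]))  # prints 3
```

Pairing with `zip` avoids index arithmetic, and the explicit length check replaces the `try`/`except` around `min`.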