I need the minimal distance between elements of an array.
I did:
numpy.min(numpy.ediff1d(numpy.sort(x)))
Is there a better / more efficient / more elegant / faster way of doing this?
If you are after sheer speed, here are some timings:
In [13]: a = np.random.rand(1000)
In [14]: %timeit np.sort(a)
10000 loops, best of 3: 31.9 us per loop
In [15]: %timeit np.ediff1d(a)
100000 loops, best of 3: 15.2 us per loop
In [16]: %timeit np.diff(a)
100000 loops, best of 3: 7.76 us per loop
In [17]: %timeit np.min(a)
100000 loops, best of 3: 3.19 us per loop
In [18]: %timeit np.unique(a)
10000 loops, best of 3: 53.8 us per loop
The timing of unique was in hopes that it would be comparably fast to sort, and you could break out early without the calls to diff and min if the length of the unique array was shorter than the array itself (as that would mean your answer was 0). But the overhead of unique is more than any gain to be made.
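For reference, here is a minimal sketch of that early-exit idea. The function name min_gap_with_unique_check is made up for illustration. Because np.unique returns its values sorted, the sketch reuses that result instead of sorting a second time:

```python
import numpy as np

def min_gap_with_unique_check(a):
    """Smallest distance between any two elements of a 1-D array (hypothetical helper)."""
    a = np.asarray(a)
    u = np.unique(a)           # sorted values with duplicates removed
    if u.size < a.size:        # a duplicate exists, so the minimal distance is 0
        return 0
    return np.min(np.diff(u))  # u is already sorted, so adjacent differences suffice
```

Given the timings above, I would not expect this to beat the plain sort-based version on data without duplicates.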
So it seems the only potential improvement I can offer is replacing ediff1d with diff:
In [19]: %timeit np.min(np.diff(np.sort(a)))
10000 loops, best of 3: 47.7 us per loop
In [20]: %timeit np.min(np.ediff1d(np.sort(a)))
10000 loops, best of 3: 57.1 us per loop
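If you want to reuse the faster variant, it could be wrapped in a small helper. This is only a sketch: the name min_distance and the explicit check for short inputs are my own additions, not something the timings above depend on.

```python
import numpy as np

def min_distance(x):
    """Minimal distance between any two elements of x (hypothetical helper)."""
    x = np.asarray(x).ravel()
    if x.size < 2:
        # np.min would fail on an empty difference array, so fail early and clearly
        raise ValueError("need at least two elements")
    # After sorting, the closest pair is always adjacent
    return np.diff(np.sort(x)).min()
```

The explicit check just gives a clearer error message than the one np.min raises on an empty array.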
Your current approach is already about as good as it gets. After sorting, the two closest values are always adjacent, so the minimum of the difference array returned by ediff1d is the minimum over all pairs. Here's a suggestion:
Because the array is sorted in ascending order, every difference is non-negative, so 0 is the smallest possible result. We can therefore implement ediff1d manually and break as soon as a difference is zero. That way, if you have the sorted array x:
[1, 1, 2, 3, 4, 5, 6, 7, ... , n]
Rather than going through n elements, your ediff1d function breaks early and covers only the first two elements, returning [0]. This also shrinks the difference array, reducing the number of iterations required by your min call.
Here is an example without the use of numpy:
x = [1, 12, 3, 8, 4, 1, 4, 9, 1, 29, 210, 313, 12]

def ediff1d_custom(x):
    # Differences between consecutive elements; x must be sorted ascending.
    darr = []
    for i in range(len(x) - 1):
        diff = x[i + 1] - x[i]
        darr.append(diff)
        if diff == 0:
            # 0 is the smallest possible gap in a sorted list, so stop early.
            break
    return darr

print(min(ediff1d_custom(sorted(x))))  # prints 0
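The same early-exit idea can also be written without building the difference list at all, by keeping a running minimum. This is a rough sketch in Python 3; the name min_gap_early_exit is hypothetical, and whether it beats the numpy one-liner will depend on your data.

```python
def min_gap_early_exit(values):
    """Smallest gap between any two values, stopping as soon as a gap of 0 is seen."""
    xs = sorted(values)
    if len(xs) < 2:
        raise ValueError("need at least two values")
    best = xs[1] - xs[0]
    # Walk over adjacent pairs of the sorted list
    for prev, cur in zip(xs, xs[1:]):
        gap = cur - prev
        if gap < best:
            best = gap
        if best == 0:
            # Gaps in a sorted list are never negative, so 0 cannot be improved on
            break
    return best
```

For the list x above this returns 0 after inspecting only the first pair.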
try:
    # x is assumed to be sorted in ascending order
    smallest = min(x[i + 1] - x[i] for i in range(len(x) - 1))
except ValueError:
    print('Array contains fewer than two values.')