3

The code below finds the Euclidean distance between each scalar in the nested list a and each scalar in the nested list b.

from scipy.spatial import distance
a = [[1, 2, 3], [4, 5, 6]]
b = [[10, 20]]

Final_distance = []
for i in [j for sub in a for j in sub]:
    for k in [m for t in b for m in t]:
        dist = distance.euclidean(i, k)
        Final_distance.append(dist)
print(Final_distance)

The output is

[9.0, 19.0, 8.0, 18.0, 7.0, 17.0, 6.0, 16.0, 5.0, 15.0, 4.0, 14.0]

But for very large lists it takes a very long time. Is there a way to reduce the running time of the above code?

sacuL
An student

1 Answer

3

Since your Euclidean distances are computed on scalars, each one is just the absolute difference between the two values. So you can repeat your arrays in the appropriate order using np.repeat and np.tile, and subtract one from the other:

import numpy as np

a = [[1, 2, 3], [4, 5, 6]]
b = [[10, 20]]

a1 = np.array(a).flatten()
b1 = np.array(b).flatten()

Final_distance = np.abs(np.subtract(np.repeat(a1, len(b1)), np.tile(b1, len(a1))))

Which returns:

array([ 9, 19,  8, 18,  7, 17,  6, 16,  5, 15,  4, 14])
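
As a possible alternative (a sketch, not part of the original answer), NumPy broadcasting should give the same values without building the repeated and tiled copies explicitly. The names `pairwise` and `final_distance` are chosen here for illustration only:

```python
import numpy as np

a = [[1, 2, 3], [4, 5, 6]]
b = [[10, 20]]

# Flatten both nested lists into 1-D arrays
a1 = np.array(a).ravel()
b1 = np.array(b).ravel()

# a1[:, None] has shape (6, 1); subtracting b1 (shape (2,)) broadcasts
# to a (6, 2) matrix holding every a-value minus every b-value
pairwise = np.abs(a1[:, None] - b1)

# Row-major flattening gives the same order as the nested loops in the question
final_distance = pairwise.ravel()
print(final_distance)
```

The full result still holds len(a1) * len(b1) values, so this changes how the pairs are formed, not how much memory the output needs.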
sacuL
  • Thanks for your answer. This is much faster than my approach. But for very large a and b, e.g. if a contains 10000 lists and b contains 2000 lists, this approach might run into memory problems and raise a MemoryError. – An student Sep 04 '18 at 18:59
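
One way to approach the memory concern raised in the comment, offered only as a rough sketch, is to process a in blocks so that only one block of differences exists at a time. The function name `abs_diff_chunks` and the `chunk_size` parameter are hypothetical, introduced here for illustration:

```python
import numpy as np

def abs_diff_chunks(a, b, chunk_size=1000):
    """Yield |x - y| for every x in a and y in b, one block of a at a time.

    Hypothetical helper: each yielded array covers chunk_size values of a
    (fewer for the last block), in the same order as the nested loops.
    """
    a1 = np.asarray(a).ravel()
    b1 = np.asarray(b).ravel()
    for start in range(0, len(a1), chunk_size):
        block = a1[start:start + chunk_size]
        # (len(block), len(b1)) matrix of absolute differences, flattened
        yield np.abs(block[:, None] - b1).ravel()

a = [[1, 2, 3], [4, 5, 6]]
b = [[10, 20]]

# Small chunk_size only to show the blocks; concatenating them here is for
# checking the result and would defeat the purpose on large inputs
chunks = list(abs_diff_chunks(a, b, chunk_size=4))
result = np.concatenate(chunks)
print(result)
```

This only helps if each block is consumed as it is produced, for example reduced to a summary or written to disk. Collecting all blocks into one array needs as much memory as the original approach.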