
I have a hundred arrays of length 288 and need to compute a distance matrix over all of them. The code below works and takes approximately 10 seconds. Is there a more efficient way to do this? It eventually has to run on 50,000 arrays, and at that size it takes too long.

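import numpy as np

pf = np.array(purpose_fin)  # expected shape: (100, 288)
dist = np.zeros((100, 100))
for i in range(100):
    for j in range(100):
        # count the positions where the two arrays differ
        dist[i, j] = 288 - np.sum(np.equal(pf[i], pf[j]))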
  • Does this answer your question? [Python - How to generate the Pairwise Hamming Distance Matrix](https://stackoverflow.com/questions/42752610/python-how-to-generate-the-pairwise-hamming-distance-matrix) – jez Dec 11 '19 at 18:24

1 Answer


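Using the SciPy library (note that `distance_matrix` computes Minkowski distances, Euclidean by default, not Hamming distances):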

from scipy.spatial import distance_matrix
distance_matrix([[0,0],[0,1]], [[1,0],[1,1]])

scipy.spatial.distance_matrix

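Hamming distance (SciPy returns the fraction of positions that differ, so multiply by the array length to get a count):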

from scipy.spatial import distance
distance.hamming([1, 0, 0], [0, 1, 0])

Hamming Distance
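To connect the two snippets above to the question, here is a minimal sketch of computing the full pairwise matrix in one call with `scipy.spatial.distance.pdist`. The helper name `hamming_count_matrix` and the random stand-in for `purpose_fin` are assumptions made for illustration, not part of the original post. Since SciPy's Hamming metric returns a fraction, the result is scaled by the array length to match the counts produced by the loop in the question.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def hamming_count_matrix(arrays):
    """Return an (n, n) matrix counting the differing positions between rows."""
    pf = np.asarray(arrays)
    n_features = pf.shape[1]
    # pdist with metric="hamming" gives the fraction of differing positions
    # for each unordered pair of rows; scale by the row length to get counts.
    condensed = pdist(pf, metric="hamming") * n_features
    # squareform expands the condensed vector into a symmetric square matrix;
    # rint guards against floating-point error before casting to int.
    return np.rint(squareform(condensed)).astype(int)


# Hypothetical stand-in for purpose_fin: 100 arrays of length 288.
rng = np.random.default_rng(0)
purpose_fin = rng.integers(0, 5, size=(100, 288))
dist = hamming_count_matrix(purpose_fin)
```

For 50,000 arrays the square matrix has 2.5 billion entries, about 20 GB as 64-bit floats, so it may be necessary to keep only the condensed output of `pdist` or to process the rows in blocks with `scipy.spatial.distance.cdist`.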

4.Pi.n
  • NB: the OP says "distance" but, judging by the use of a counting strategy over `np.equal`, probably specifically wants Hamming distance. – jez Dec 11 '19 at 18:29