4

I have the following problem in Python I need to solve:

Given two coordinate matrices (NumPy ndarrays) A and B, find for all coordinate vectors a in A the corresponding coordinate vectors b in B, such that the Euclidean distance ||a-b|| is minimized. The coordinate matrices A and B can have different number of coordinate vectors (that is, different number of rows).

This method should return a matrix of coordinate vectors C where the ith vector c in C is the vector from B that minimizes the Euclidean distance with the ith coordinate vector a in A.

For example, lets say

A = np.array([[1,1], [3,4]]) and B = np.array([[1,2], [3,6], [8,1]])

The Euclidean distances between the vector [1,1] in A and the vectors in B are:

1, 5.385165, 7

So the first vector in C would be [1,2]

Similarly the distances for the vector [3,4] in A and the vectors in B are:

2.828427, 2, 5.830952  

So the second and last vector in C would be [3,6]

So C = [[1,2], [3,6]]

How to code this efficiently in Python?

Divakar
  • 218,885
  • 19
  • 262
  • 358
jjepsuomi
  • 4,223
  • 8
  • 46
  • 74

1 Answers1

5

You could use cdist from scipy.spatial.distance to efficiently get the euclidean distances and then use np.argmin to get the indices corresponding to minimum values and use those to index into B for the final output. Here's the implementation -

import numpy as np
from scipy.spatial.distance import cdist

C = B[np.argmin(cdist(A,B),1)] 

Sample run -

In [99]: A
Out[99]: 
array([[1, 1],
       [3, 4]])

In [100]: B
Out[100]: 
array([[1, 2],
       [3, 6],
       [8, 1]])

In [101]: B[np.argmin(cdist(A,B),1)]
Out[101]: 
array([[1, 2],
       [3, 6]])
Divakar
  • 218,885
  • 19
  • 262
  • 358
  • Excellent, thank you! =) I'm doing computationally heavy calculations so that's why efficiency is important in my problem :) – jjepsuomi Jun 12 '15 at 10:14