I have the following arrays:

a = np.random.uniform(low=0, high=1, size=(5,3))
b = np.random.uniform(low=0, high=1, size=(2,3))

How can I concisely compute the distance between each row of a and each row of b, such that the result has shape (5,2)? Letting the result be c, c[i,j] should be the Euclidean distance between a[i,:] and b[j,:].

I can get the analogous behavior (with plain subtraction in place of the norm) when there is only 1 column:

a = np.random.uniform(low=0, high=1, size=(5,1))
b = np.random.uniform(low=0, high=1, size=(2,1))
print((a-b.T).shape)
That Guy
  • No I don't think so :) I have now updated the OP with a case where I get the behavior I want. – That Guy Jul 28 '21 at 18:22
  • Sorry, cannot understand. `c[i,j]` is a scalar, isn't it? And `a[i,:] - b[j,:]` is a vector. What am I missing? – Itay Jul 28 '21 at 18:26
  • Yes, you are right of course. Sorry! Anyway, the case I have shown where there is only 1 column shows the behavior I want (just replacing 1 column with 3). – That Guy Jul 28 '21 at 18:27
  • @Itay Ahhhhh, yes I want the norm which is a scalar. Yikes :) I updated the question. – That Guy Jul 28 '21 at 18:28
  • That seems to be `cdist(a, b)` with `from scipy.spatial.distance import cdist`. – Mustafa Aydın Jul 28 '21 at 18:54
  • Your example output for the 1-column case is not a Euclidean distance, by the way. That's not a distance, as it can be negative. – Mustafa Aydın Jul 28 '21 at 18:56
  • @MustafaAydın I understand, but the behavior should be the same, just replacing subtraction with norm :) – That Guy Jul 28 '21 at 19:06
  • I think you could do it with a Kronecker product with a ones array of the correct shape to get two 3D arrays that you can directly subtract, and then apply the norm along the 3rd dimension. Not sure if that's the best way, and I don't know the details, but it will at least be "concise" and vectorized. Some straightforward for loops would probably be easier to understand. – jpkotta Jul 28 '21 at 19:15

1 Answer

Warning: I haven't checked the result too carefully, but this should at least give you an idea.

Something like this:

import numpy as np

a = np.random.uniform(low=0, high=1, size=(5,3))
b = np.random.uniform(low=0, high=1, size=(2,3))
# tile a along a new axis 1 and b along a new axis 0 so both become (5,2,3)
aa = np.kron(np.ones((1,2,1)), a.reshape((5,1,3)))
bb = np.kron(np.ones((5,1,1)), b.reshape((1,2,3)))
assert aa.shape == bb.shape == (5,2,3) # rows are now axis 2
c = np.linalg.norm(aa-bb, axis=2) # shape = (5,2)

Of course, this creates large intermediate arrays: aa and bb are each full (5,2,3) copies of the inputs. I think broadcasting might be better, but I'm not very familiar with it.
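For what it's worth, the broadcasting idea can be sketched roughly as follows. This is only a minimal, unbenchmarked sketch: the seeded `rng` and the names `diff` and `c_loop` are my own choices for illustration, and the explicit double loop is there purely as a sanity check.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
a = rng.uniform(low=0, high=1, size=(5, 3))
b = rng.uniform(low=0, high=1, size=(2, 3))

# a[:, None, :] has shape (5, 1, 3) and b[None, :, :] has shape (1, 2, 3);
# subtraction broadcasts them to (5, 2, 3) without building tiled copies.
diff = a[:, None, :] - b[None, :, :]
c = np.linalg.norm(diff, axis=2)  # shape (5, 2)

# Sanity check against a plain double loop over the rows of a and b.
c_loop = np.array([[np.linalg.norm(a[i] - b[j]) for j in range(b.shape[0])]
                   for i in range(a.shape[0])])
assert np.allclose(c, c_loop)
```

This still allocates the (5,2,3) difference array, but it skips the two tiled arrays that the np.kron version builds. If SciPy is available, the `cdist(a, b)` suggested in the comments should give the same (5,2) result.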

jpkotta