Vectorized spatial distance between values in multidimensional arrays

Question

Given an (2,2,3,3,3) array of 3D-cartesian coordinates along the last dimension, what is the syntax for computing the Euclidean between pairwise values in XA and XB using scipy.spatial.distance.cdist to yield an output array with shape (2, 3, 3)?

XA = np.random.normal(size=(2,2,3,3,3))
XB = np.random.normal(size=(2,2,3,3,3))
dist = cdist(XA[:, 0, ...], XB[:, 1, ...], 'seuclidean')

Returns ValueError: XA must be a 2-dimensional array. As such, alternative to loops what is the pythonic syntax for computing cdist(XA[:, 0], XB[:, 1])?

If the coordinates are in the last axis, how you get an output shape of (2,3,3) out? — DSM, Sep 11 '18 at 23:40
@DSM I am not sure if that is where my own misunderstanding is, but both `XA[:, 0]` and `XB[:, 1]` have shapes `(2,3,3,3)` and given the distance computation along the last axis I expect the resulting array shape `(2,3,3)`. — Kambiz, Sep 11 '18 at 23:58
Ah, I think I see my confusion -- I think cdist may compute more distances than you care about. It'll compute all the cross distances, so basically num_XA_points * num_XB_points. But you only want the num_XA = num_XB pairwise distances, right? — DSM, Sep 12 '18 at 00:04
Oh damn! Yes, I only want num_XA = num_XB pairwise distances. So I probably need to be using `pdist` here, yes? — Kambiz, Sep 12 '18 at 00:07
I don't think so; that returns all the _internal_ cross distances in what it gets. I'm actually not sure of the right way to do what you want, but at least now I know what it is. :-) — DSM, Sep 12 '18 at 00:12
There are `2*2*3*3 = 36` points in `XA` (and same in `XB`), so, if I understand correctly, you should have 36 distances too... why the output dim equal to `(2, 3, 3)`? — xdze2, Sep 12 '18 at 20:15
@xdze2: the OP selects the 0th member of axis=1 for the left-hand-side and the 1st member of axis=1 for the right-hand side, so the second 2 in your multiplication isn't there. — DSM, Sep 13 '18 at 17:25

score 1 · Accepted Answer · answered Sep 12 '18 at 20:11

1

Is this doing the job? if only pairwise distances are needed, and the coordinates are on the last dimension:

import numpy as np

XA = np.random.normal(size=(2,2,3,3,3))
XB = np.random.normal(size=(2,2,3,3,3))

distances = np.sqrt(np.sum((XA - XB)**2, axis=-1))

but here distances.shape is (2, 2, 3, 3)...

answered Sep 12 '18 at 20:11

xdze2

3,986
2
12
29

1

Yes, that works. I was totally sidetracked by `np.cdist` and `np.pdist`. Just needed basic operation modules. I was also given an equivalent solution by a colleague... `np.linalg.norm(XA - XB, axis=-1)` – Kambiz Sep 17 '18 at 18:03

Vectorized spatial distance between values in multidimensional arrays

1 Answers1