I have a question about using `pdist` and would be very grateful for any advice. `pdist(D)` combines all dimensions into a single distance for each pair of points, but I want the distance in each dimension separately. For example, I have a data set `S` which is a 10*2 matrix, and I am using `pdist(S(:,1))` and `pdist(S(:,2))` to get the distances separately, but this seems very inefficient when the data has many dimensions. Is there a more efficient way to achieve this? Thanks in advance!

2 Answers
Assuming you just want the absolute difference between the individual dimensions of the points, `pdist` is overkill. You can use the following simple function:
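function d = pdist_1d(S)
    % All unordered pairs of row indices, in the same order pdist uses
    idx = nchoosek(1:size(S,1),2);
    % One row per pair, one column per dimension
    d = abs(S(idx(:,1),:) - S(idx(:,2),:));
end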
which returns the absolute pairwise difference between all pairs of rows in `S`.
In this case
dist = pdist_1d(S)
gives the same result as
dist = cell2mat(arrayfun(@(dim)pdist(S(:,dim))',1:size(S,2),'UniformOutput',false));
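As a quick sanity check, here is a minimal sketch of the same two lines run on a tiny, made-up data set (the values in `S` are assumptions chosen only so the result is easy to verify by hand). It avoids `pdist` so it does not need the Statistics Toolbox.

```matlab
% Toy data (made up for illustration): 3 points in 2 dimensions
S = [1 5; 4 1; 6 9];

% Same two statements as the body of pdist_1d
idx = nchoosek(1:size(S,1), 2);           % row pairs: [1 2; 1 3; 2 3]
d = abs(S(idx(:,1),:) - S(idx(:,2),:));   % one row per pair, one column per dimension

disp(d)   % d equals [3 4; 5 4; 2 8]
```

Each row of `d` corresponds to one pair of points, and each column holds the absolute difference in one dimension.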

- Thank you @jodag, your approach is very useful. Once I have `dist`, I would like to compute `exp(sum(10.*dist.^2,2))`. Do you have any suggestions? – Zhida Deng Nov 19 '17 at 20:45
- I doubt you'll find something faster than what you've written. – jodag Nov 19 '17 at 20:56
- Or maybe `bsxfun` would be faster? – Zhida Deng Nov 19 '17 at 21:11
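For what it's worth, the weighted sum in that expression can also be written as a matrix-vector product. The sketch below is only an illustration: the values in `d` are made up, and `w` is a hypothetical weight vector that just repeats the factor 10 from the comment. Whether it is actually faster is not something this sketch measures.

```matlab
% Toy pairwise differences (made up), shaped like the output of pdist_1d
d = [0.1 0.2; 0.3 0.1; 0.2 0.4];
w = [10; 10];                      % hypothetical weights: 10 for every dimension

r1 = exp(sum(10 .* d.^2, 2));      % expression from the comment
r2 = exp((d.^2) * w);              % same quantity as a matrix-vector product

disp(max(abs(r1 - r2)))            % difference is at rounding-error level
```

The product form makes it easy to give each dimension its own weight by changing the entries of `w`.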
Another option, since you're simply taking the absolute difference of the coordinates, is to use `bsxfun`:
>> D = randi(20, 10, 2) % generate sample data
D =
17 12
14 10
8 4
7 11
19 13
2 18
11 14
5 19
19 12
20 8
From here, we permute the data so that the coordinates (columns) extend into the 3rd dimension and the rows are in the 1st dimension for the 1st argument, and the 2nd dimension for the 2nd argument:
>> dist = bsxfun(@(x,y)abs(x-y), permute(D, [1 3 2]), permute(D, [3 1 2]))
dist =
ans(:,:,1) =
0 3 9 10 2 15 6 12 2 3
3 0 6 7 5 12 3 9 5 6
9 6 0 1 11 6 3 3 11 12
10 7 1 0 12 5 4 2 12 13
2 5 11 12 0 17 8 14 0 1
15 12 6 5 17 0 9 3 17 18
6 3 3 4 8 9 0 6 8 9
12 9 3 2 14 3 6 0 14 15
2 5 11 12 0 17 8 14 0 1
3 6 12 13 1 18 9 15 1 0
ans(:,:,2) =
0 2 8 1 1 6 2 7 0 4
2 0 6 1 3 8 4 9 2 2
8 6 0 7 9 14 10 15 8 4
1 1 7 0 2 7 3 8 1 3
1 3 9 2 0 5 1 6 1 5
6 8 14 7 5 0 4 1 6 10
2 4 10 3 1 4 0 5 2 6
7 9 15 8 6 1 5 0 7 11
0 2 8 1 1 6 2 7 0 4
4 2 4 3 5 10 6 11 4 0
This results in a 3-d symmetric matrix where `dist(p, q, d)` gives you the distance between points `p` and `q` in dimension `d`, with `dist(p, q, d) == dist(q, p, d)`. If you want the distances between `p` and `q` in all (or multiple) dimensions, you should use `squeeze` to put them in a vector:
>> squeeze(dist(3, 5, :))
ans =
11
9
Note that if you're using MATLAB R2016b or later (or Octave), you can create the same distance matrix without `bsxfun`:
dist = abs(permute(D, [1 3 2]) - permute(D, [3 1 2]))
The downside to this approach is that it creates the full symmetric matrix, so you're generating each distance twice, which could become a memory issue for large data sets.
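If only one copy of each pair is needed afterwards, the duplicates can be dropped with a lower-triangular mask. The following is a minimal sketch on made-up data; the variable names (`full3d`, `mask`, `flat`, `pairs`) are introduced here for illustration and are not part of the answer above.

```matlab
D = [1 5; 4 1; 6 9];                  % toy data (made up): 3 points, 2 dimensions
n = size(D, 1);

% Full n-by-n-by-k matrix of absolute differences (bsxfun form, works on older MATLAB too)
full3d = abs(bsxfun(@minus, permute(D, [1 3 2]), permute(D, [3 1 2])));

% Keep only entries strictly below the diagonal: one copy of each pair
mask  = tril(true(n), -1);
flat  = reshape(full3d, n*n, []);     % one column per dimension
pairs = flat(mask(:), :);             % n*(n-1)/2 rows, ordered (2,1), (3,1), (3,2)

disp(pairs)   % pairs equals [3 4; 5 4; 2 8]
```

Note that this still builds the full array first, so it only reduces what you keep, not the peak memory. To avoid the full array altogether, an index-pair approach such as `nchoosek(1:n, 2)` computes each pair once.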

- Thank you @beaker. This is a very good idea. I am wondering whether there is a way to compute `exp(-10*dist(:,:,1).^2 - 5 *dist(:,:,2).^2)` efficiently. – Zhida Deng Nov 19 '17 at 23:16
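One possible way to write that expression for any number of dimensions is to put the weights along the third dimension and sum over it. This is a sketch under assumptions: the data in `D` is made up, and `w` simply holds the two coefficients (10 and 5) from the comment. Since the differences are squared, the `abs` step can be skipped here.

```matlab
D = [0.1 0.5; 0.4 0.1; 0.6 0.9];      % toy data (made up): 3 points, 2 dimensions
w = [10 5];                           % per-dimension weights taken from the comment

% Signed pairwise differences, n-by-n-by-k (abs is unnecessary before squaring)
diff3d = bsxfun(@minus, permute(D, [1 3 2]), permute(D, [3 1 2]));

% Weight each dimension's slice, then sum over the third dimension
w3 = reshape(w, 1, 1, []);
K  = exp(-sum(bsxfun(@times, diff3d.^2, w3), 3));

% Reference: the expression from the comment, written out slice by slice
K_ref = exp(-10*diff3d(:,:,1).^2 - 5*diff3d(:,:,2).^2);
disp(max(abs(K(:) - K_ref(:))))       % difference is at rounding-error level
```

On MATLAB R2016b or later (or Octave), both `bsxfun` calls can be replaced by plain `-` and `.*` thanks to implicit expansion.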