Please have a look at this code:
import numpy as np
from scipy.spatial import distance
#1
X = [[0,0], [0,1], [0,2], [0,3], [0,4], [0,5]]
c = [[0,0], [0,1], [0,3]]
#2
dists = distance.cdist(X, c)
print(dists)
#3
dmini = np.argmin(dists, axis=1)
print(dmini)
#4
mindists = dists[:, dmini]
print(mindists)
(#1) So I have my data X, some other points (centroids) c, then (#2) I compute the distance from each point in X to all the centroids c, and store the result in dists.
(#3) Then I select the index of the minimum distance in each row with argmin.
(#4) Now I only want to select the minimum distances themselves, using the indexes computed in step #3.
However, I get a strange output.
# dists
[[ 0. 1. 3.]
[ 1. 0. 2.]
[ 2. 1. 1.]
[ 3. 2. 0.]
[ 4. 3. 1.]
[ 5. 4. 2.]]
#dmini
[0 1 1 2 2 2]
#mindists
[[ 0. 1. 1. 3. 3. 3.]
[ 1. 0. 0. 2. 2. 2.]
[ 2. 1. 1. 1. 1. 1.]
[ 3. 2. 2. 0. 0. 0.]
[ 4. 3. 3. 1. 1. 1.]
[ 5. 4. 4. 2. 2. 2.]]
Reading here and there, it seems possible to select specific columns by giving a list of integers (indexes). In this case I want to use the dmini values to pick one column in each row.
I was expecting mindists to be (6,) in shape. What am I doing wrong?
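For reference, here is a minimal sketch of the shapes involved and of the result I was expecting. It assumes the same X and c as above. The names all_rows and rows are only introduced here for illustration, and pairing each row with its own column through an explicit row index is one possible way to get a (6,) result, not necessarily the only or best one.

```python
import numpy as np
from scipy.spatial import distance

X = [[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5]]
c = [[0, 0], [0, 1], [0, 3]]

dists = distance.cdist(X, c)        # shape (6, 3)
dmini = np.argmin(dists, axis=1)    # shape (6,)

# ":" keeps every row, and dmini then selects 6 columns for each of them,
# which is why the result is (6, 6) rather than (6,).
all_rows = dists[:, dmini]
print(all_rows.shape)               # (6, 6)

# Illustrative alternative: give an explicit row index so that row i is
# paired with column dmini[i].
rows = np.arange(dists.shape[0])
mindists = dists[rows, dmini]
print(mindists.shape)               # (6,)
print(mindists)                     # [0. 0. 1. 0. 1. 2.]
```

If only the values are needed and not the indexes, dists.min(axis=1) should give the same array directly.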