I want to use flexclust::distEuclidean, but I'm not sure how centers should be specified. There are no examples given in the documentation at ?distance.
So I checked the source of this function (which turns out to be short):
function (x, centers)
{
    if (ncol(x) != ncol(centers))
        stop(sQuote("x"), " and ", sQuote("centers"), " must have the same number of columns")
    z <- matrix(0, nrow = nrow(x), ncol = nrow(centers))
    for (k in 1:nrow(centers)) {
        z[, k] <- sqrt(colSums((t(x) - centers[k, ])^2))
    }
    z
}
<environment: namespace:flexclust>
If p is the number of features in my data and k is the number of clusters, should centers be a k x p matrix, i.e. with one centroid per row?
Actually, it has to be like that, as this function first checks that ncol(x) == ncol(centers). But then we have
z[, k] <- sqrt(colSums((t(x) - centers[k, ])^2))
How does t(x) - centers[k, ] even work? t(x) is a p x n matrix and centers[k, ] is a 1 x p vector, so the dimensions don't match...
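For reference, here is a small experiment with made-up data. It is only a sketch: dist_euclidean is a hypothetical local copy of the function body quoted above (so it runs without flexclust installed), and x and centers are invented values with n = 4, p = 3 and k = 2, with centers given as a k x p matrix.

```r
# Hypothetical local copy of the function body quoted above,
# so this sketch does not depend on flexclust being installed.
dist_euclidean <- function(x, centers) {
  if (ncol(x) != ncol(centers))
    stop(sQuote("x"), " and ", sQuote("centers"), " must have the same number of columns")
  z <- matrix(0, nrow = nrow(x), ncol = nrow(centers))
  for (k in 1:nrow(centers)) {
    z[, k] <- sqrt(colSums((t(x) - centers[k, ])^2))
  }
  z
}

# Made-up data: n = 4 observations, p = 3 features.
x <- matrix(c(0, 0, 0,
              1, 0, 0,
              0, 2, 0,
              1, 2, 2), ncol = 3, byrow = TRUE)

# k = 2 centroids, one per row (k x p).
centers <- matrix(c(0, 0, 0,
                    1, 2, 2), ncol = 3, byrow = TRUE)

d <- dist_euclidean(x, centers)
dim(d)                 # 4 2, i.e. n x k

# Selecting a single row drops the dim attribute.
dim(centers[2, ])      # NULL
length(centers[2, ])   # 3

# Subtracting a length-p vector from the p x n matrix keeps the matrix shape.
delta <- t(x) - centers[2, ]
dim(delta)             # 3 4
```

In this experiment centers[2, ] comes back without a dim attribute, so it seems to be handled as a plain vector of length p rather than as a 1 x p matrix, and the subtraction is elementwise with the vector recycled down each column of t(x).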