I'm trying to calculate the
1) Euclidean distance, and
2) Mahalanobis distance
for a set of matrices in R. I've been doing it as such:
v1 <- structure(c(0.508, 0.454, 0, 2.156, 0.468, 0.488, 0.682, 1, 1.832,
0.44, 0.928, 0.358, 1, 1.624, 0.484, 0.516, 0.378, 1, 1.512,
0.514, 0.492, 0.344, 0, 1.424, 0.508, 0.56, 0.36, 1, 1.384, 0.776,
1.888, 0.388, 0, 1.464, 0.952, 0.252, 0.498, 1, 1.484, 0.594,
0.256, 0.54, 2, 2.144, 0.402, 0.656, 2.202, 1, 1.696, 0.252),
.Dim = c(5L, 10L),
.Dimnames = list(NULL, c("KW_1", "KW_2", "KW_3", "KW_4", "KW_5", "KW_6", "KW_7", "KW_8", "KW_9", "KW_10")))
v2 <- structure(c(1.864, 1.864, 1.864, 1.864, 1.864, 1.6, 1.6, 1.6,
1.6, 1.6, 1.536, 1.536, 1.536, 1.536, 1.536, 1.384, 1.384, 1.384,
1.384, 1.384, 6.368, 6.368, 6.368, 6.368, 6.368, 2.792, 2.792,
2.792, 2.792, 2.792, 2.352, 2.352, 2.352, 2.352, 2.352, 2.624,
2.624, 2.624, 2.624, 2.624, 1.256, 1.256, 1.256, 1.256, 1.256,
1.224, 1.224, 1.224, 1.224, 1.224),
.Dim = c(5L, 10L),
.Dimnames = list(NULL, c("KW_1", "KW_2", "KW_3", "KW_4", "KW_5", "KW_6", "KW_7", "KW_8", "KW_9", "KW_10")))
L2 <- sqrt(rowSums((v1-v2)^2)) # Euclidean distance for each row
which provides:
[1] 7.132452 7.568359 7.536904 5.448696 7.163580
That's perfect! But I've heard you can also compute the Euclidean/L2 distance using the following form:
d(x, y) = sqrt((x - y)^T (x - y))
I'd like to calculate my distance this way because the Mahalanobis distance is simply this form with the inverse of the covariance matrix inserted between the two factors. See this.
I haven't figured out how to code this in R, however. I've tried:
sqrt(crossprod((t(v1)-t(v2))))
and
sqrt((v1-v2) %*% t(v1-v2))
But they just don't give me what I want. Suggestions?
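To illustrate what seems to go wrong, here is a minimal check on toy data. The names a, b, d, row_dist and full are made up for this illustration and are not the v1/v2 above. As far as I can tell, the matrix product returns every row-against-row combination, so the per-row distances only show up on its diagonal:

```r
# Toy data (hypothetical, not the v1/v2 above): 3 rows, 2 columns
a <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)
b <- matrix(0, nrow = 3, ncol = 2)
d <- a - b

# Row-wise form: one distance per row
row_dist <- sqrt(rowSums(d^2))

# Matrix-product form: an n x n matrix of all row-by-row cross products
full <- sqrt(d %*% t(d))

dim(full)                        # 3 x 3, not a length-3 vector
all.equal(diag(full), row_dist)  # the diagonal holds the per-row distances
```

With millions of rows the n x n intermediate would be far too large, which is why I'm hoping for a form that never builds the full matrix.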
Note -
I'm looking to do this as a single operation, not in a loop of any kind. It has to be very fast because I'm doing it over millions of rows multiple times. Maybe it's not possible. I'm open to changing the format of v1 and v2.