I'm helping to put together a spatial R lab for a third-year class, and one of the tasks will be to identify the site that is closest on average (i.e. has the lowest mean distance) to a set of other sites.
I have a distance matrix dist_m that I produced using gdistance::costDistance, which looks something like this:
# Sample data
m <- matrix(c(2, 1, 8, 5,
              7, 6, 3, 4,
              9, 3, 2, 8,
              1, 3, 7, 4),
            nrow = 4,
            ncol = 4,
            byrow = TRUE)
# Sample distance matrix
dist_m <- dist(m)
dist_m
When printed, it looks like this:
          1         2         3
2  8.717798
3  9.899495  5.477226
4  2.645751  7.810250 10.246951
Desired output: From this dist object I want to be able to identify the index value (1, 2, 3 or 4) that has the lowest average distance. In this example, it would be index 4, which has an average distance of 6.90. Ideally, I'd also like the mean distance returned too (6.90).
I can find the mean distance of an individual index by doing something like this:
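# Convert the dist object to a full matrix
m <- as.matrix(dist_m)
# Set the upper triangle and the diagonal to NA
# (using diag = TRUE avoids also blanking any genuine zero distances)
m[upper.tri(m, diag = TRUE)] <- NA
# Calculate the mean distance for index 4
mean(c(m[4, ], m[, 4]), na.rm = TRUE)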
However, I ideally want a solution that identifies the index with the minimum mean distance directly, rather than having to plug in index values manually (the actual dataset will be much larger than this).
As this is for a university class, I'd like to keep any solution as simple as possible: for-loops and apply functions are likely to be difficult to grasp for students with little experience in R.
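To make the target concrete, here is a minimal sketch of the kind of loop-free approach I have in mind. It assumes it is acceptable to work with the full symmetric matrix from as.matrix() rather than the lower triangle, and the names full_m, mean_dist and closest are purely illustrative. I have not checked whether this is the most suitable approach for a large cost-distance matrix.

```r
# Sample data from the question
m <- matrix(c(2, 1, 8, 5,
              7, 6, 3, 4,
              9, 3, 2, 8,
              1, 3, 7, 4),
            nrow = 4, ncol = 4, byrow = TRUE)
dist_m <- dist(m)

# Full symmetric matrix: row i holds the distances from site i to every site
full_m <- as.matrix(dist_m)

# Exclude each site's zero distance to itself
diag(full_m) <- NA

# Mean distance from each site to all the others (illustrative name)
mean_dist <- rowMeans(full_m, na.rm = TRUE)

# Index of the site with the lowest mean distance, and that mean distance
closest <- which.min(mean_dist)
closest
mean_dist[closest]
```

For the sample data this picks index 4 with a mean distance of about 6.90, matching the manual calculation. Because the full matrix is symmetric, each row already contains every distance for that site, so there is no need to combine a row and a column.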