My question may be worded confusingly, so I'll clarify. Let's say I have two datasets. The first one (DS1) is made up of 10 (x, y) coordinates. The second one (DS2) is made up of 20 (x, y) coordinates.
My goal is to find, for each point in DS1, which point in DS2 is closest to it. So in this example I would end up with 10 distances, one per DS1 point.
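To make the goal concrete, here is a minimal sketch on made-up data. The names `ds1`, `ds2`, and `nearest_distances` are only for illustration, and it assumes plain numeric matrices with two columns and Euclidean distance. It still computes all 10 x 20 pairwise distances, just without explicit loops.

```r
# Toy data: 10 points in DS1 and 20 points in DS2, one (x, y) pair per row.
set.seed(1)
ds1 <- matrix(runif(20), ncol = 2)
ds2 <- matrix(runif(40), ncol = 2)

# Hypothetical helper: for each row of `a`, find the nearest row of `b`.
nearest_distances <- function(a, b) {
  # Pairwise coordinate differences, each an nrow(a) x nrow(b) matrix.
  dx <- outer(a[, 1], b[, 1], "-")
  dy <- outer(a[, 2], b[, 2], "-")
  d <- sqrt(dx^2 + dy^2)
  # Column index of the smallest distance in each row of d.
  idx <- apply(d, 1, which.min)
  data.frame(nearest = idx, distance = d[cbind(seq_len(nrow(a)), idx)])
}

result <- nearest_distances(ds1, ds2)
nrow(result)  # one row per DS1 point
```

The `nearest` column holds the row index in DS2 of the closest point, and `distance` holds the corresponding distance.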
BTW, I already wrote a working function that does this, but it's slow. I used a brute-force method with two nested for loops. Is there an established algorithm or package that does this faster?
EDIT: People have asked to see my code. I apologize in advance to those of you who have taken a formal algorithms class.
Generate_distances_list <- function(standard_object, experimental_expression, dist.method = "Euclidean") {
  # Number of reference points (rows of expression_ref)
  nrow_df <- nrow(standard_object$expression_ref)
  # Number of experimental points
  nrow_experimental_expr <- nrow(experimental_expression)
  # Initialize distance_df with one row per experimental point:
  # column 1 = distance, column 2 = experimental row index
  ncol_df <- 2
  distance_df <- data.frame(matrix(ncol = ncol_df, nrow = nrow_experimental_expr))
  distance_df[, ncol_df] <- seq_len(nrow_experimental_expr)
  # Initialize list of distances: one data frame per reference point
  distance_df_list <- vector(mode = "list", length = nrow_df)
  for (i in seq_len(nrow_df)) {
    for (j in seq_len(nrow_experimental_expr)) {
      if (dist.method == "Euclidean") {
        distance_df[j, 1] <- dist(rbind(standard_object$expression_ref[i, ], experimental_expression[j, ]))[1]
      } else if (dist.method == "Manhattan") {
        # Sum of absolute coordinate differences
        distance_df[j, 1] <- sum(abs(standard_object$expression_ref[i, ] - experimental_expression[j, ]))
      }
    }
    distance_df_list[[i]] <- distance_df
  }
  distance_df_list
}