I have a dataframe with many rows, and each row contains a sample ID and two values, which I am treating as coordinates. I want to calculate the Euclidean distance between each set of coordinates (i.e., each row) to generate a distance matrix comparing every sample. I'm having trouble using dist because it seems like I should be subdividing my dataframe or comparing two separate ones, and I'm not looking for pairwise comparisons of x and y; I just want to know the distance between each sample in my dataframe.
Here is an example dataframe:
sample <- c("s1","s2","s3")
x <- c(12,10,5)
y <- c(8,6,15)
df <- data.frame(sample, x, y)
From this I would like to produce a 3x3 matrix of distances. This seems like it should be easy to do; I might just be missing a keyword.
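For reference, here is a minimal sketch of the kind of result I am after. It assumes that dist (from base R's stats package) is given only the numeric coordinate columns, with the sample IDs moved into the row names. The names coords and d are placeholders chosen for illustration.

```r
# Example data from above
sample <- c("s1", "s2", "s3")
x <- c(12, 10, 5)
y <- c(8, 6, 15)
df <- data.frame(sample, x, y)

# Keep only the coordinate columns; use the sample IDs as row labels
# (assumption: the ID column should not take part in the distance)
coords <- df[, c("x", "y")]
rownames(coords) <- as.character(df$sample)

# dist() computes distances between rows; as.matrix() expands the
# lower-triangle "dist" object into a full square matrix
d <- as.matrix(dist(coords, method = "euclidean"))
print(d)
```

With this layout, each row of coords is treated as one point, so d["s1", "s2"] would be the distance between samples s1 and s2 rather than any comparison of the x column against the y column.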