I want to calculate dissimilarity indices on a binary matrix and have found several functions in R, but I can't get them to agree. I use the Jaccard coefficient as an example in four functions: vegdist(), sim(), designdist(), and dist(). I'm going to use the result for a cluster analysis.
library(vegan)
library(simba)
# Create a random m x n binary matrix
function1 <- function(m, n) {
  matrix(sample(0:1, m * n, replace = TRUE), m, n)
}
test <- function1(30, 20)
# Calculate dissimilarity indices with the Jaccard coefficient
dist1 <- vegdist(test, method = "jaccard")
dist2 <- sim(test, method = "jaccard")
dist3 <- designdist(test, method = "a/(a+b+c)", abcd = TRUE)
dist4 <- dist(test, method = "binary")
Does anyone know why dist1 and dist4 are different from dist2 and dist3?