I have these 10 numeric vectors. For simplicity, each contains 5 elements:
a <- c(1,2,3,4,5)
b <- c(1,2,3,4,6)
c <- c(1,2,3,4,6)
d <- c(1,2,3,4,6)
e <- c(6,2,9,7,3)
f <- c(7,3,5,7,6)
g <- c(7,9,3,4,0)
h <- c(4,6,4,6,9)
i <- c(8,8,5,3,8)
j <- c(2,1,1,2,3)
I want to find the 3 most related/similar vectors. In this example, that must be b, c, and d.
Additionally, I am also hoping to get the other vector combinations besides the "most related" one (b, c, d). In this case, these could be (a, b, c), (a, b, d), and (a, c, d).
Each combination should have a relation/similarity score, so I can find the most similar, second most similar, etc.
The expected output looks more or less like this:
similarity_rank   vectors   similarity_score (example)
1                 b, c, d   0.99
2                 a, b, c   0.8
etc.
My attempt so far: I'm using pairwise correlation. It can measure the relation between vectors, but only for 2 vectors at a time. I want to get a "similarity score" for a group of 3 vectors (or, in general, n vectors).
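To make the attempt concrete, here is a minimal sketch of the pairwise approach. It assumes Pearson correlation via base R's `cor()`; the names `vecs` and `pair_cor` are only for illustration. It shows the limitation: every entry of the result compares exactly two vectors.

```r
# The 10 vectors from above, collected as named columns of a matrix
vecs <- cbind(
  a = c(1, 2, 3, 4, 5), b = c(1, 2, 3, 4, 6),
  c = c(1, 2, 3, 4, 6), d = c(1, 2, 3, 4, 6),
  e = c(6, 2, 9, 7, 3), f = c(7, 3, 5, 7, 6),
  g = c(7, 9, 3, 4, 0), h = c(4, 6, 4, 6, 9),
  i = c(8, 8, 5, 3, 8), j = c(2, 1, 1, 2, 3)
)

# cor() on a matrix returns the Pearson correlation of every pair of columns
pair_cor <- cor(vecs)

# Each cell is a score for two vectors only, e.g. b vs. c or a vs. b
pair_cor[c("a", "b", "c", "d"), c("a", "b", "c", "d")]
```

The matrix is 10 x 10, and there is no single cell that describes a group such as (b, c, d) as a whole.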
Rules:
- n: Number of desired vectors
- N: Number of all vectors
- N > n
- All vectors are numeric
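
These rules imply `choose(N, n)` candidate groups (120 for N = 10, n = 3). As a minimal sketch of the setup only, the following enumerates the groups with `combn()` and ranks them with a placeholder score: the mean pairwise Euclidean distance within each group, where lower means more similar. That score is an assumption chosen for illustration, not a claim about the best method, and it is a distance rather than a 0-to-1 similarity like the example scores above. The names `vecs`, `groups`, and `result` are hypothetical.

```r
# The 10 vectors from above, as named columns of a matrix
vecs <- cbind(
  a = c(1, 2, 3, 4, 5), b = c(1, 2, 3, 4, 6),
  c = c(1, 2, 3, 4, 6), d = c(1, 2, 3, 4, 6),
  e = c(6, 2, 9, 7, 3), f = c(7, 3, 5, 7, 6),
  g = c(7, 9, 3, 4, 0), h = c(4, 6, 4, 6, 9),
  i = c(8, 8, 5, 3, 8), j = c(2, 1, 1, 2, 3)
)
n <- 3  # number of desired vectors per group

# All groups of n vector names: an n x choose(N, n) character matrix
groups <- combn(colnames(vecs), n)

# Euclidean distance between every pair of vectors (dist() works on rows)
d <- as.matrix(dist(t(vecs)))

# Placeholder score (assumption): mean pairwise distance inside each group
score <- apply(groups, 2, function(g) {
  sub <- d[g, g]
  mean(sub[upper.tri(sub)])
})

result <- data.frame(
  vectors = apply(groups, 2, paste, collapse = ", "),
  score = score,
  stringsAsFactors = FALSE
)
result <- result[order(result$score), ]                  # most similar first
result$similarity_rank <- rank(result$score, ties.method = "min")
head(result)
```

With this placeholder, (b, c, d) comes first and the three groups that pair a with two of b, c, d tie for second, which matches the ordering I expect. Enumerating every group grows quickly with N and n, so this is only practical for small inputs.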
Question: What is the best method to do that? (R code would be amazing, an R package would be great, or just the method name is enough so I can learn about it.)