I have 3 different groups summarised in a data frame. The data frame looks like:

d <- data.frame(v1 = c("A","A","A","B","B","B","C","C","C"),
                v2 = c(1:9), stringsAsFactors = FALSE)

What I want is to compare the values of A against the values of B, then the values of A against the values of C, and finally the values of B against the values of C.

I constructed 2 for loops to iterate over v1 to extract the groups to compare. However, the for-loops give me all possible combinations like:

A vs. A

A vs. B

A vs. C

B vs. A

B vs. B

B vs. C

C vs. A and so on...

Here are my for-loops:

for(i in unique(d$v1)) {
    for(j in unique(d$v1)) {
        cat("i = ", i, "j = ", j, "\n")

        group1 <- d[which(d$v1 == i), ]
        group2 <- d[which(d$v1 == j), ]

        print(group1)
        print(group2)

        cat("---------------------\n\n")
    }
}

How can I iterate over data frame d so that in the first iteration group1 contains the values of A and group2 contains the values of B, in the second iteration group1 contains the values of A and group2 the values of C, and in the last iteration group1 contains the values of B and group2 the values of C?

I am somehow totally stuck with that problem and hoping to find an answer here.

Cheers!

user969113
  • What kind of comparisons do you want to perform? – Alan Jul 24 '12 at 10:10
  • Well, instead of having only v2, my original data frame has up to 10 columns. So basically group1 is a sub data frame of d with 11 columns (c1 = v1 and c2:11 = additional information) for v1 == A. Accordingly, group2 is the same subset but with the values for v1 == B. I hope that makes sense... – user969113 Jul 24 '12 at 10:17

2 Answers

Perhaps something like this would work for you. With some more work, the output can be tidied up a little bit too.

We'll use combn to find out the combinations, and lapply to subset our data based on the combinations:

temp = combn(unique(d$v1), 2)
temp
#     [,1] [,2] [,3]
# [1,] "A"  "A"  "B" 
# [2,] "B"  "C"  "C" 
lapply(seq_len(ncol(temp)), function(x) cbind(d[d$v1 == temp[1, x], ],
                                              d[d$v1 == temp[2, x], ]))
# [[1]]
#   v1 v2 v1 v2
# 1  A  1  B  4
# 2  A  2  B  5
# 3  A  3  B  6
# 
# [[2]]
#   v1 v2 v1 v2
# 1  A  1  C  7
# 2  A  2  C  8
# 3  A  3  C  9
# 
# [[3]]
#   v1 v2 v1 v2
# 4  B  4  C  7
# 5  B  5  C  8
# 6  B  6  C  9
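
The tidying up mentioned above could look something like the following sketch. It is only one possibility: it passes a function to combn directly (with simplify = FALSE so that a list comes back), keeps the two groups as separate data frames instead of cbind-ing them, and names each list element after its pair. The names pairs, group1 and group2 are purely illustrative.

```r
d <- data.frame(v1 = c("A","A","A","B","B","B","C","C","C"),
                v2 = 1:9, stringsAsFactors = FALSE)

# One list element per unordered pair of groups; simplify = FALSE keeps a list
pairs <- combn(unique(d$v1), 2, FUN = function(p) {
  list(group1 = d[d$v1 == p[1], ],
       group2 = d[d$v1 == p[2], ])
}, simplify = FALSE)

# Label each element after its pair, e.g. "A vs. B"
names(pairs) <- combn(unique(d$v1), 2, FUN = paste, collapse = " vs. ")

names(pairs)
pairs[["A vs. C"]]$group2
```

Keeping group1 and group2 separate avoids the duplicated v1/v2 column names that cbind produces, which may matter once the real data has more columns.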
A5C1D2H2I1M1N2O1R2T1

I personally like mrdwab's answer for its elegance, but if you still want to do it your way with loops, I'd fix it like this (bear in mind that this will clutter your code; it's better to keep it tidy):

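u <- unique(d$v1)
for (i in seq_along(u)) {
    # Skip the last group: no later group is left to pair it with
    if (i < length(u)) {
        # Pair u[i] only with the groups after it, so each pair appears once
        for (j in u[(i+1):length(u)]) {
            group1 <- d[which(d$v1 == u[i]), ]
            group2 <- d[which(d$v1 == j), ]
            cat("i = ", u[i], "j = ", j, "\n")
            print(group1)
            print(group2)
            cat("---------------------\n\n")
        }
    }
}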

This results in:

i =  A j =  B 
  v1 v2
1  A  1
2  A  2
3  A  3
  v1 v2
4  B  4
5  B  5
6  B  6
---------------------

i =  A j =  C 
  v1 v2
1  A  1
2  A  2
3  A  3
  v1 v2
7  C  7
8  C  8
9  C  9
---------------------

i =  B j =  C 
  v1 v2
4  B  4
5  B  5
6  B  6
  v1 v2
7  C  7
8  C  8
9  C  9
---------------------
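
If the nested index bookkeeping feels fragile, a single loop over the columns of combn's output should give the same three pairs. This is just a sketch combining the two answers; pair_names and diffs are made-up names, and the difference of the v2 means is only a placeholder (an assumption) for whatever comparison you actually need.

```r
d <- data.frame(v1 = c("A","A","A","B","B","B","C","C","C"),
                v2 = 1:9, stringsAsFactors = FALSE)

pair_names <- combn(unique(d$v1), 2)   # 2-row matrix: one column per pair
diffs <- numeric(ncol(pair_names))     # one placeholder result per pair

for (k in seq_len(ncol(pair_names))) {
  group1 <- d[d$v1 == pair_names[1, k], ]
  group2 <- d[d$v1 == pair_names[2, k], ]
  # Placeholder comparison (an assumption): difference of the v2 means
  diffs[k] <- mean(group1$v2) - mean(group2$v2)
  cat(pair_names[1, k], "vs.", pair_names[2, k], ":", diffs[k], "\n")
}
```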
julia