0

I am trying to exclude every combination generated by the combn function that contains both "var4" and "var5". Below is my code, which does not currently work:

mod_headers <- c("var1", "var2", "var3", "var4", "var5", "var6")

f <- function(){
  for(i in 1:length(mod_headers)){
    tab <- combn(mod_headers,i,function(mod_headers){
      if (combn(mod_headers,i) %in% c("var4","var5")) {return()}
    })
    for(j in 1:ncol(tab)){
      tab_new <- c(tab[,j])
      mod_tab_new <- c(tab_new, "newcol")
      print(mod_tab_new)
    }
  }
}

f()

Thanks for your help!

New2coding

5 Answers

1

I'm not sure how you want the result formatted, so I stopped at generating the combinations and dropping those in which the two values appear together. This relies on the fact that combn returns a matrix in which each column is one combination.

mod_headers <- c("var1", "var2", "var3", "var4", "var5", "var6")


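combn_with_exclusion <- function(x, n, exclude){
  full <- combn(x, n)
  # flag the columns (combinations) that contain all elements of `exclude`
  has_all <- apply(full, 2, function(y) all(exclude %in% y))
  # drop = FALSE keeps the result a matrix even if only one column remains
  full[, !has_all, drop = FALSE]
}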

combn_with_exclusion(mod_headers, 2, c("var4", "var5"))
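To make the column-wise structure concrete, here is a minimal sketch of the logical mask built for pairs. The names `pairs`, `has_both` and `kept` are illustrative only and are not part of the function above.

```r
mod_headers <- c("var1", "var2", "var3", "var4", "var5", "var6")

# combn() returns a matrix with one combination per column
pairs <- combn(mod_headers, 2)
dim(pairs)  # rows = elements per combination, columns = combinations

# TRUE for every column that contains both excluded values
has_both <- apply(pairs, 2, function(y) all(c("var4", "var5") %in% y))
sum(has_both)

# keep the remaining columns; drop = FALSE preserves the matrix shape
kept <- pairs[, !has_both, drop = FALSE]
ncol(kept)
```

Only the ("var4", "var5") pair is flagged, so one column out of the 15 pairs is removed.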
Benjamin
  • Thanks, your script gives me the desired outcome. However, I would rather not change the structure of my code, because it is part of a more extensive program (I removed all the parts that are not relevant here). Is there a way to apply FUN directly inside the combn function? Thanks – New2coding Aug 03 '17 at 11:12
  • I think Andrew Gustar's answer may be what you want to use in place of your `f()`. – Benjamin Aug 03 '17 at 11:18
  • @New2coding The `FUN` in the `combn` function is applied to each individual combination, and I can't think of a way of getting it to omit the combination entirely if it meets your criteria. An `ifelse` to replace it with `NULL`, for example, does not work. – Andrew Gustar Aug 03 '17 at 11:40
  • Ok, thanks for your comment. I am trying to make this piece of code work because it is a part of my other code. The thing I am trying to do is to exclude correlated variables from GLModel by excluding names of correlated headers (original post here: https://stackoverflow.com/questions/45466513/how-to-remove-correlated-variables-from-glm-in-r) – New2coding Aug 03 '17 at 12:32
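One possible workaround for the `FUN` question raised in the comments, offered as a sketch rather than an established idiom: let `FUN` return a zero-length vector for unwanted combinations, ask `combn` for a list via `simplify = FALSE`, and filter out the empty entries afterwards. The helper name `keep_or_empty` is hypothetical.

```r
mod_headers <- c("var1", "var2", "var3", "var4", "var5", "var6")

# hypothetical helper: returns the combination unchanged, or character(0)
# if it contains both "var4" and "var5"
keep_or_empty <- function(y) {
  if (all(c("var4", "var5") %in% y)) character(0) else y
}

# simplify = FALSE makes combn() return a list, so entries may differ in length
combos <- combn(mod_headers, 3, FUN = keep_or_empty, simplify = FALSE)

# drop the empty placeholders left by the excluded combinations
kept <- Filter(function(y) length(y) > 0, combos)
length(kept)
```

The filtering still happens after `combn` has run, so this does not avoid generating the unwanted combinations; it only keeps the exclusion logic inside `FUN`.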
1

Here is another way, generating a list of all combinations, then excluding those containing both var4 and var5...

lapply(
   lapply(1:length(mod_headers),
        function(i) combn(mod_headers, i)), 
   function(x) x[,apply(x, 2, function(y) !all(c("var4", "var5") %in% y))]) 

[[1]]
[1] "var1" "var2" "var3" "var4" "var5" "var6"

[[2]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]   [,10]  [,11]  [,12]  [,13]  [,14] 
[1,] "var1" "var1" "var1" "var1" "var1" "var2" "var2" "var2" "var2" "var3" "var3" "var3" "var4" "var5"
[2,] "var2" "var3" "var4" "var5" "var6" "var3" "var4" "var5" "var6" "var4" "var5" "var6" "var6" "var6"

[[3]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]   [,10]  [,11]  [,12]  [,13]  [,14]  [,15]  [,16] 
[1,] "var1" "var1" "var1" "var1" "var1" "var1" "var1" "var1" "var1" "var2" "var2" "var2" "var2" "var2" "var3" "var3"
[2,] "var2" "var2" "var2" "var2" "var3" "var3" "var3" "var4" "var5" "var3" "var3" "var3" "var4" "var5" "var4" "var5"
[3,] "var3" "var4" "var5" "var6" "var4" "var5" "var6" "var6" "var6" "var4" "var5" "var6" "var6" "var6" "var6" "var6"

...etc
Andrew Gustar
0

I've only tried this on TIO, so I have no benchmarks, but I'd wager that this version will be quicker for large sets, if that matters.

m <- c("var2", "var3", "var4", "var5", "var6")
comb <- combn(m, 3)
csums <- colSums((comb == "var4") + (comb == "var5"))
comb[, csums < 2]
#      [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]  
# [1,] "var2" "var2" "var2" "var2" "var2" "var3" "var3"
# [2,] "var3" "var3" "var3" "var4" "var5" "var4" "var5"
# [3,] "var4" "var5" "var6" "var6" "var6" "var6" "var6"

Or equivalent to OP's f():

f2 <- function(m=mod_headers) {
    lapply(1:length(m), function(x) {
      comb <- combn(m, x)
      csums <- colSums((comb == "var4") + (comb == "var5"))
      comb[, csums < 2]
    })
}
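The speed claim can be checked with a rough timing sketch like the one below. The 12-element input and the wrapper names `by_apply` and `by_colsums` are assumptions made for illustration; actual timings depend on the machine and the input size.

```r
m <- paste0("var", 1:12)  # assumed larger input, for illustration only
comb <- combn(m, 6)

# apply()-based filter, in the style of the earlier answers
by_apply <- function(comb) {
  has_both <- apply(comb, 2, function(y) all(c("var4", "var5") %in% y))
  comb[, !has_both, drop = FALSE]
}

# vectorised colSums()-based filter, in the style of this answer
by_colsums <- function(comb) {
  csums <- colSums((comb == "var4") + (comb == "var5"))
  comb[, csums < 2, drop = FALSE]
}

# both filters should select exactly the same columns
identical(by_apply(comb), by_colsums(comb))

# rough timings for a single run of each filter
system.time(by_apply(comb))
system.time(by_colsums(comb))
```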
AkselA
  • This is my solution: `f <- function(){ for(i in 1:length(mod_headers)){ tab <- combn(mod_headers,i); for(j in 1:ncol(tab)){ tab_new <- c(tab[,j]); mod_tab_new <- c(tab_new, "newcol"); if (all(c("var4","var5") %in% mod_tab_new)) next; print(mod_tab_new) } } }; f()` – New2coding Aug 03 '17 at 15:51
0

This is my solution:

f <- function(){
  for(i in 1:length(mod_headers)){
      tab <- combn(mod_headers,i)
      for(j in 1:ncol(tab)){
        tab_new <- c(tab[,j])
        mod_tab_new <- c(tab_new, "newcol")
        if (all(c("var4","var5") %in% mod_tab_new)) next
        print(mod_tab_new)
    }
  }
}

f()
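If the combinations are needed for later steps rather than only printed, the same loop can collect them in a list. This is a sketch under that assumption; `f_collect`, `headers` and `out` are hypothetical names.

```r
mod_headers <- c("var1", "var2", "var3", "var4", "var5", "var6")

f_collect <- function(headers = mod_headers) {
  out <- list()
  for (i in seq_along(headers)) {
    tab <- combn(headers, i)
    for (j in seq_len(ncol(tab))) {
      # skip combinations that contain both "var4" and "var5"
      if (all(c("var4", "var5") %in% tab[, j])) next
      # store the combination with the extra column name appended
      out[[length(out) + 1]] <- c(tab[, j], "newcol")
    }
  }
  out
}

res <- f_collect()
length(res)
```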
New2coding
0

I used this page to reduce a list of N-way combinations, given another set of N-way combinations to exclude. Here is a slight modification of Benjamin's code.

mod_headers <- c("var1", "var2", "var3", "var4", "var5", "var6")

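combn_NWayExclusion <- function(x, n, exclude){
   full <- combn(x, n)
   EXC <- combn(exclude, n)
   # for each excluded combination, TRUE where a column of `full` does not contain it
   UU <- lapply(1:ncol(EXC), function(i) !apply(full, 2, function(y) all(EXC[,i] %in% y)))
   # keep only the columns that avoid every excluded combination
   full[, apply(do.call(rbind, UU), 2, all)]
   }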

combn_NWayExclusion(mod_headers, 2, c("var4", "var5"))
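As a usage sketch, passing more than `n` names in `exclude` removes every n-way combination drawn from them. The call below assumes that `var6` is excluded as well, which is not part of the original question; the function is repeated here (with `drop = FALSE` added) so that the example runs on its own.

```r
mod_headers <- c("var1", "var2", "var3", "var4", "var5", "var6")

combn_NWayExclusion <- function(x, n, exclude) {
  full <- combn(x, n)
  EXC <- combn(exclude, n)
  # one logical vector per excluded combination:
  # TRUE where a column of `full` does not contain it
  UU <- lapply(seq_len(ncol(EXC)), function(i) {
    !apply(full, 2, function(y) all(EXC[, i] %in% y))
  })
  # keep the columns that avoid every excluded combination
  full[, apply(do.call(rbind, UU), 2, all), drop = FALSE]
}

# assumed example: drop every pair drawn from var4, var5 and var6
res <- combn_NWayExclusion(mod_headers, 2, c("var4", "var5", "var6"))
ncol(res)
```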
emart86