Split and drop grouping variable

Question

I am trying to split a data frame into a list. This question was helpful, but I also want to drop the column used for grouping, since it will mess up future steps. The drop argument for split only applies to unused levels. The data frame is as follows:

structure(list(Var1 = c(-1L, -1L, -1L, -1L, -1L, -1L, -1L, -1L, 1L,
                         1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
                         1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, -1L, -1L, 
                         -1L, -1L, -1L, -1L, -1L), 
               Var2 = c(-1L, -1L, -1L, -1L, 0L, -1L, -1L, -1L, 0L, 
                        0L, 0L, -1L, -1L, -1L, -1L, -1L, -1L, -1L, 
                        -1L, -1L, -1L, -1L, -1L, -1L, -1L, -1L, -1L,
                        -1L, -1L, -1L, -1L, -1L, -1L, -1L, -1L),
               Var3 = c(1L, -1L, -1L, -1L, -1L, -1L, -1L, -1L, 0L, 0L, 
                        0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 
                        1L, 1L, 1L, 1L, 1L, 1L, -1L, -1L, -1L, -1L, 
                        -1L, -1L, -1L, -1L), 
               Var4 = c(1L, -1L, -1L, 2L, -1L, -1L, -1L, 1L, 1L, 1L, 
                        1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
                        0L, 0L, 0L, 0L, 0L, 1L, -1L, -1L, -1L, -1L, 
                        -1L, -1L, -1L, -1L), 
               Var5 = c(1L, -1L, -1L, 2L, -1L, -1L, -1L, 2L, 1L, 1L, 
                        1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 
                        0L, 0L, 1L, 1L, 1L, 1L, -1L, -1L, -1L, -1L, 
                        -1L, -1L, -1L, -1L), 
               Bin = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 6L, 6L, 
                       7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 11L, 11L, 
                       12L, 12L, 13L, 13L, 14L, 14L, 15L, 15L, 16L, 
                       16L, 17L, 17L, 18L, 18L)), 
           .Names = c("Var1", "Var2", "Var3", "Var4", "Var5", "Bin"), 
            class = "data.frame", row.names = c(NA, -35L))

How can I split this by "Bin" while dropping "Bin"?

If you're interested in using a package, there's `library(data.table); setDT(DF); split(DF, by="Bin", keep=FALSE)` — Frank, Mar 26 '18 at 16:10
A key concept to remember is that you cannot use the "-" (minus sign) in front of character values. It only works with numeric vectors (which can be constructed with `grep` or `which`), and you can only use "!" with logical vectors, which you can construct with `grepl` or `%in%`. — IRTFM, Mar 26 '18 at 16:29

Julius Vainora · Accepted Answer · 2018-03-26T16:28:04.430

Depending on what you know about this column, you may use

split(df[, -ncol(df)], df$Bin)

if you know that it's the last one, and

split(df[, !names(df) == "Bin"], df$Bin)

if you only know its name. The following also work:

split(df[, -which(names(df) == "Bin")], df$Bin)

and

split(df[, -match("Bin", names(df))], df$Bin)
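To see the effect on something small, here is a minimal sketch using a made-up three-column data frame (the values are hypothetical, not the data from the question). It uses the name-based variant with `drop = FALSE`, which is my addition so that the result stays a data frame even if only one column remains, and shows an alternative that splits first and removes the column from each piece afterwards.

```r
# Small hypothetical data frame standing in for the one in the question
df <- data.frame(Var1 = c(-1L, -1L, 1L, 1L),
                 Var2 = c(0L, -1L, 0L, -1L),
                 Bin  = c(1L, 1L, 2L, 2L))

# Drop the grouping column by name, then split on it.
# drop = FALSE keeps a data frame even when a single column is left.
res <- split(df[, names(df) != "Bin", drop = FALSE], df$Bin)

names(res)        # one list element per distinct value of Bin
names(res[[1]])   # Bin is no longer among the columns

# Alternative: split the full data frame, then drop Bin from each piece
res2 <- lapply(split(df, df$Bin), function(d) d[names(d) != "Bin"])
names(res2[[1]])
```

Both approaches should give one data frame per bin without the `Bin` column; which one reads better is mostly a matter of taste.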

edited Mar 26 '18 at 16:28

answered Mar 26 '18 at 16:10

Julius Vainora


also `!grepl("Bin", names(df))` or `-grep("Bin", names(df))` – IRTFM Mar 26 '18 at 16:31
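The indexing rules from the comments can be checked on a plain character vector. This is only a sketch: `nms` and the extra name `Bin2` are hypothetical, added to show that `grep`/`grepl` match the pattern anywhere in the name unless it is anchored, and that `-which(...)` behaves unexpectedly when nothing matches.

```r
nms <- c("Var1", "Var2", "Bin", "Bin2")   # hypothetical column names

# Negative indexing needs integer positions
pos_exact   <- -which(nms == "Bin")    # position of the exact name, negated
pos_pattern <- -grep("Bin", nms)       # pattern also matches "Bin2"
pos_anchor  <- -grep("^Bin$", nms)     # anchors restrict it to the exact name

# Logical indexing uses "!" on a TRUE/FALSE vector
keep_grepl <- !grepl("^Bin$", nms)
keep_in    <- !(nms %in% "Bin")

# Pitfall: when nothing matches, which() returns integer(0), and
# indexing with -integer(0) selects nothing rather than everything
none_kept <- nms[-which(nms == "Missing")]
all_kept  <- nms[nms != "Missing"]
```

Because of that last case, the logical forms (`!=`, `!grepl`, `!%in%`) are the safer choice when the column might be absent.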
