I have a data.frame like the one below:

Name Feature
A    1
B    2
C    4 
D    1
E    7 
F    5
G    2
H    2

From this data I need to create a series of data.frames of three rows each, one for every combination of rows. In other words, I need to obtain

subsetted_data.frame_1

Name Feature
A    1
B    2
C    4 

subsetted_data.frame_2

Name Feature
D    1
G    2
H    2

subsetted_data.frame_3

Name Feature
F    5
G    2
H    2

And so on, until all possible combinations have been created. I tried to use the split function (from the data.table package), but it doesn't work. What is the easiest way to obtain this?

anba
  • Possible duplicate of [Subset data.table based on all possible combinations of two or more variables](https://stackoverflow.com/questions/48508342/subset-data-table-based-on-all-possible-combinations-of-two-or-more-variables) – Saurabh Chauhan Jul 20 '18 at 11:25
  • what pattern are you using to make your subsetted data frames? #1 and #3 make sense but #2 confuses me – Nate Jul 20 '18 at 21:38

1 Answer

You can use combn to get a matrix of row indexes, one combination per column, and then subset dat by each column inside an anonymous function passed to lapply.

cmb <- combn(nrow(dat), 3)

sub_data <- lapply(seq_len(ncol(cmb)), function(i) dat[cmb[, i], ])
names(sub_data) <- sprintf("subsetted_data.frame_%02d", seq_along(sub_data))
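
As a quick sanity check, here is a minimal self-contained sketch of the same approach. It assumes the data from the question, rebuilt here with data.frame() for brevity rather than with the dput output, so Name may be a character column instead of a factor depending on the R version.

```r
# Illustrative copy of the question's data (assumption: built with
# data.frame() instead of the dput structure).
dat <- data.frame(Name = LETTERS[1:8],
                  Feature = c(1L, 2L, 4L, 1L, 7L, 5L, 2L, 2L))

# Each column of cmb holds one set of three row indexes.
cmb <- combn(nrow(dat), 3)
dim(cmb)
# [1]  3 56

# One data.frame per column of cmb.
sub_data <- lapply(seq_len(ncol(cmb)), function(i) dat[cmb[, i], ])
names(sub_data) <- sprintf("subsetted_data.frame_%02d", seq_along(sub_data))

# The list has one element per combination: choose(8, 3) = 56.
length(sub_data) == choose(nrow(dat), 3)
# [1] TRUE

# Individual subsets can be pulled out by name.
sub_data[["subsetted_data.frame_01"]]
#   Name Feature
# 1    A       1
# 2    B       2
# 3    C       4
```

Keeping the results in one named list, rather than creating 56 separate objects in the workspace, makes it easy to loop over them later with lapply or sapply.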

EDIT.

Following @AkselA's comment, I tried his code. If it is run before the names are set as in the code above, the two results are the same in the sense of identical(), that is, exactly the same object.

sub_data2 <- apply(cmb, 2, function(x) dat[x,])
identical(sub_data, sub_data2)
#[1] TRUE

DATA in dput format.

dat <-
structure(list(Name = structure(1:8, .Label = c("A", "B", "C", 
"D", "E", "F", "G", "H"), class = "factor"), Feature = c(1L, 
2L, 4L, 1L, 7L, 5L, 2L, 2L)), .Names = c("Name", "Feature"), class = "data.frame", row.names = c(NA, 
-8L))
Rui Barradas
  • Not quite worth a separate answer, but a slightly more concise alternative to your `lapply()` line is: `sub_data <- apply(cmb, 2, function(x) dat[x,])`. – AkselA Jul 20 '18 at 11:53
  • @AkselA Thanks, I have edited the answer with your code. – Rui Barradas Jul 20 '18 at 14:47