This is my first time asking a question here, so please let me know if I need to change the way I am doing this. I have been looking for a while and haven't been able to find what I need.

I have a list of 3 dataframes. They have the same structure (variables) but not the same number of observations. I would like to get several subsets for each dataframe in my list, according to several conditions stored in a vector.

So with 5 conditions, I need 5 subsets for each of the 3 dataframes in my list, 15 in total.

For instance:

df1 <- data.frame(replicate(3, sample(0:10, 10, replace = TRUE)))
df2 <- data.frame(replicate(3, sample(0:10, 7, replace = TRUE)))
df3 <- data.frame(replicate(3, sample(0:10, 8, replace = TRUE)))

my_list <- list(df1, df2, df3)

conditions <- c(2, 5, 7, 4, 6)

I know how to subset for one of the conditions using lapply:

list_subset <- lapply(my_list, function(x) x[which(x$X1 == conditions[1]), ])

But I would like to do that for all the values in the vector conditions. I hope it makes sense.

The Governor

2 Answers

Just lapply again, this time over the conditions:

df1 <- data.frame(replicate(3, sample(0:10, 10, replace = TRUE)))
df2 <- data.frame(replicate(3, sample(0:10, 7, replace = TRUE)))
df3 <- data.frame(replicate(3, sample(0:10, 8, replace = TRUE)))

my_list <- list(df1, df2, df3)

conditions <- c(2, 5, 7, 4, 6)

list_subset <- lapply(my_list, function(x) x[which(x$X1 == conditions[1]), ])

# One way: the outer list is indexed by condition, the inner list by data frame
list.of.list_subsets <- lapply(conditions, function(y) {
  lapply(my_list, function(x) x[which(x$X1 == y), ])
})

# The other way around: the outer list is indexed by data frame, the inner list by condition
list.of.list_subsets2 <- lapply(my_list, function(x) {
  lapply(conditions, function(y) x[which(x$X1 == y), ])
})
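
To make the 15 subsets easier to look up, the nested result can be named and, if needed, flattened. The following is only a sketch: the list names (df1, df2, df3), the "X1_" name prefix, and the fixed seed are assumptions added for illustration and are not part of the question.

```r
set.seed(42)  # assumption: fixed seed so the example is reproducible

df1 <- data.frame(replicate(3, sample(0:10, 10, replace = TRUE)))
df2 <- data.frame(replicate(3, sample(0:10, 7, replace = TRUE)))
df3 <- data.frame(replicate(3, sample(0:10, 8, replace = TRUE)))

# Hypothetical names for the list elements, added for readability
my_list <- list(df1 = df1, df2 = df2, df3 = df3)
conditions <- c(2, 5, 7, 4, 6)

# Outer list: one element per data frame; inner list: one element per condition
subsets <- lapply(my_list, function(x) {
  inner <- lapply(conditions, function(y) x[which(x$X1 == y), ])
  setNames(inner, paste0("X1_", conditions))
})

# Look up a single subset by name: rows of df2 where X1 == 5
subsets$df2$X1_5

# Flatten one level to get a single list of 15 data frames
flat <- unlist(subsets, recursive = FALSE)
length(flat)
```

With recursive = FALSE, unlist() removes only the outer level, so each element of flat is still a data frame, named like "df1.X1_2". A subset with no matching rows is kept as a zero-row data frame.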
Julian_Hn
An option would be to filter with %in% and then split based on the 'X1' column:

lapply(my_list, function(x) {
  # Keep only rows whose X1 is one of the conditions, then split by X1
  x1 <- subset(x, X1 %in% conditions)
  split(x1, x1$X1)
})
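
One thing to be aware of: splitting on the raw X1 values returns a group only for values that actually occur in a data frame, and the groups come out sorted by value rather than in the order of conditions. If exactly one subset per condition is wanted for every data frame, a possible variant is to split on a factor whose levels are the conditions. This is a sketch; split_by_condition is a hypothetical helper name and the seed is an assumption.

```r
set.seed(42)  # assumption: fixed seed so the example is reproducible

df1 <- data.frame(replicate(3, sample(0:10, 10, replace = TRUE)))
df2 <- data.frame(replicate(3, sample(0:10, 7, replace = TRUE)))
df3 <- data.frame(replicate(3, sample(0:10, 8, replace = TRUE)))

my_list <- list(df1, df2, df3)
conditions <- c(2, 5, 7, 4, 6)

# Hypothetical helper: filter on X1, then split on a factor with fixed levels
split_by_condition <- function(x, conditions) {
  x1 <- x[x$X1 %in% conditions, , drop = FALSE]
  # Levels fix both the set and the order of the groups;
  # levels with no rows yield zero-row data frames
  split(x1, factor(x1$X1, levels = conditions))
}

result <- lapply(my_list, split_by_condition, conditions = conditions)

lengths(result)      # one entry per data frame, each equal to length(conditions)
names(result[[1]])   # group names follow the order of conditions
```

Because split() keeps empty factor levels by default, every data frame yields the same five named elements, which makes the results line up across the list.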
akrun