I have a list of matrices (mat_list). I want to create a new list with a selected subset of columns from each matrix. I have another list of numerics (col_list) which indicates column numbers to keep. Example dataset:

> mat_list <- list(structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L), .Dim = c(4L, 3L), .Dimnames = list(NULL, c("V1", "V2", "V3"))),structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L), .Dim = c(4L, 3L), .Dimnames = list(NULL, c("V1", "V2", "V3")))) ; names(mat_list) <- c("mat1","mat2")
> mat_list
$mat1
     V1 V2 V3
[1,]  1  5  9
[2,]  2  6 10
[3,]  3  7 11
[4,]  4  8 12

$mat2
     V1 V2 V3
[1,]  1  5  9
[2,]  2  6 10
[3,]  3  7 11
[4,]  4  8 12

> col_list <- list(structure(c(1,3)),structure(c(2,3))) ; names(col_list) <- c("var1","var2")
> col_list
$var1
[1] 1 3

$var2
[1] 2 3

I would like the following output:

> my_list
$mat1
     V1 V3
[1,]  1  9
[2,]  2 10
[3,]  3 11
[4,]  4 12

$mat2
     V2 V3
[1,]  5  9
[2,]  6 10
[3,]  7 11
[4,]  8 12

I've tried to use lapply to subset these columns across all the matrices. The closest I've gotten is to do

> lapply(mat_list,function(x) x[,col_list$var1])
$mat1
     V1 V3
[1,]  1  9
[2,]  2 10
[3,]  3 11
[4,]  4 12

$mat2
     V1 V3
[1,]  1  9
[2,]  2 10
[3,]  3 11
[4,]  4 12

This applies the columns in col_list$var1 to every matrix in mat_list. But I haven't managed to apply each element of col_list to its corresponding matrix, e.g. by also calling lapply on col_list, something along the lines of

lapply(mat_list,function(x) x[,lapply(col_list)])

I'm grateful for any input.

Neuroguy

1 Answer

Rather than lapply, in this case mapply fits perfectly:

mapply(function(x, y) x[, y], mat_list, col_list, SIMPLIFY = FALSE)

which is also equivalent to

Map(function(x, y) x[, y], mat_list, col_list)

Both approaches apply the specified function by taking corresponding arguments from mat_list and col_list at the same time.
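To see the pairing on the question's data, here is a minimal sketch. The `drop = FALSE` argument is an addition that is not in the code above; it is included on the assumption that you want a matrix back even when an element of col_list selects only one column.

```r
# Rebuild the example data from the question
m <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("V1", "V2", "V3")))
mat_list <- list(mat1 = m, mat2 = m)
col_list <- list(var1 = c(1, 3), var2 = c(2, 3))

# Map pairs mat_list[[i]] with col_list[[i]]; the result takes its
# names from the first list (mat_list)
my_list <- Map(function(x, y) x[, y, drop = FALSE], mat_list, col_list)

colnames(my_list$mat1)  # "V1" "V3"
colnames(my_list$mat2)  # "V2" "V3"
```

Without `drop = FALSE`, a single-column selection such as `x[, 2]` would be returned as a plain vector rather than a one-column matrix.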

The reason lapply doesn't work is that it iterates over only a single list, as you noticed. To use lapply, one would instead need to iterate over the indices and look up both lists inside the function:

lapply(seq_along(mat_list), function(i) mat_list[[i]][, col_list[[i]]])
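One difference worth noting with the index-based version: because lapply is iterating over bare integers, the result is an unnamed list. A small sketch of restoring the names, assuming you want the same names as mat_list (the variable name `by_index` is only illustrative):

```r
# Example data from the question
m <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("V1", "V2", "V3")))
mat_list <- list(mat1 = m, mat2 = m)
col_list <- list(var1 = c(1, 3), var2 = c(2, 3))

by_index <- lapply(seq_along(mat_list), function(i) mat_list[[i]][, col_list[[i]]])
names(by_index)  # NULL: seq_along() yields integers, so the names are lost

# Copy the names over from mat_list
by_index <- setNames(by_index, names(mat_list))
```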

Bonus: if mat_list contained data frames rather than matrices, one could be even more concise with

mapply(`[`, mat_list, col_list, SIMPLIFY = FALSE)
# or
Map(`[`, mat_list, col_list)
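The shortcut works for data frames because `[` with a single index selects columns of a data frame, whereas on a matrix it selects individual elements. A rough illustration, where `df_list` is a hypothetical data-frame version of the question's data:

```r
# Hypothetical data-frame counterpart of mat_list
df <- data.frame(V1 = 1:4, V2 = 5:8, V3 = 9:12)
df_list <- list(df1 = df, df2 = df)
col_list <- list(var1 = c(1, 3), var2 = c(2, 3))

# `[`(df, cols) is df[cols], which selects columns of a data frame
df_sub <- Map(`[`, df_list, col_list)
names(df_sub$df2)  # "V2" "V3"

# On a matrix, the same single-index call picks elements, not columns
m <- as.matrix(df)
m[c(1, 3)]  # the 1st and 3rd elements in column-major order
```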
Julius Vainora