
Say I have a matrix with 1000 columns. I want to create a new matrix from every other block of n columns of the original matrix, starting from column i.

So, let's say that n = 3 and i = 5; then the columns I need from the old matrix are 5, 6, 7, 11, 12, 13, 17, 18, 19, and so on.

lmo
G.N.

4 Answers


Use two seq() calls to create the start and stop bounds of each block, then mapply() over those to build the actual column index ranges. After that, normal bracket notation extracts the columns from your matrix.

set.seed(1)
# using 67342343's test case
M <- matrix(runif(100^2), ncol = 100)
n <- 3
i <- 5

starts <- seq(i, ncol(M), n*2)
stops <- seq(i+(n-1), ncol(M), n*2)
col_index <- c(mapply(seq, starts, stops)) # thanks Jaap and Sotos

col_index
[1]  5  6  7 11 12 13 17 18 19 23 24 25 29 30 31 35 36 37 41 42 43 47 48 49 53 54 55 59 60 61 65 66 67 71 72 73 77 78
[39] 79 83 84 85 89 90 91 95 96 97

M[, col_index]
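
The two seq() calls above return vectors of equal length only when the last block fits entirely inside the matrix. As a minimal sketch of a variant that clips the final block instead, assuming a partial last block should be kept and that i is not larger than the number of columns (the helper name block_cols is made up for illustration):

```r
# Hypothetical helper: indices of every other block of n columns, starting at i.
# Assumes i <= ncols.
block_cols <- function(ncols, n, i) {
  starts <- seq(i, ncols, by = 2 * n)
  # Clip each stop at the last column so a partial final block is kept
  stops <- pmin(starts + n - 1, ncols)
  # Map() always returns a list, so unlist() works even if blocks differ in length
  unlist(Map(seq, starts, stops))
}

block_cols(100, n = 3, i = 5)[1:9]
# [1]  5  6  7 11 12 13 17 18 19
tail(block_cols(96, n = 3, i = 5), 2)   # last block is cut short at column 96
# [1] 95 96
```

Using Map() with unlist() rather than c(mapply(...)) is a design choice here: mapply() only simplifies to a matrix when all blocks have the same length.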
Nate
  • You can drop the `as.numeric` imo and wrap the `mapply`-call in `c()`. On a sidenote: it is better to use `TRUE` instead of `T`. – Jaap Sep 03 '17 at 11:26
  • What is the logic between T vs TRUE, I thought T was a reserved character? – Nate Sep 03 '17 at 11:29
  • @NateDay see: [*Is there anything wrong with using T and F instead of TRUE and FALSE?*](https://stackoverflow.com/questions/18256639/is-there-anything-wrong-with-using-t-and-f-instead-of-true-and-false) – Jaap Sep 03 '17 at 11:36
  • Of course wrapping it with `c()` makes the `SIMPLIFY` argument redundant – Sotos Sep 03 '17 at 11:39

Another solution relies on the fact that R recycles a logical index vector along the vector being subset:

i <- 5; n <- 3
M <- matrix(runif(100^2), ncol = 100)
id <- seq(i, ncol(M), by = 1)[rep(c(TRUE, FALSE), each = n)]
M_sub <- M[, id]
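
To see the recycling at work, here is a small sketch on a 20-column case (the values are chosen only for illustration). The logical mask has length 2 * n and is reused along the candidate columns, even when their count is not a multiple of 2 * n:

```r
i <- 5; n <- 3
cols <- seq(i, 20)                       # candidate columns 5, 6, ..., 20
mask <- rep(c(TRUE, FALSE), each = n)    # TRUE TRUE TRUE FALSE FALSE FALSE
cols[mask]                               # mask is recycled along cols
# [1]  5  6  7 11 12 13 17 18 19
```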
67342343

I would write a function that determines the indices of the columns you want, and then call that function as needed.

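col_indexes <- function(mat, start = 1, by = 1){
    n <- ncol(mat)
    # first column of each selected block
    inx <- seq(start, n, by = 2*by)
    # expand each block start into `by` consecutive columns
    inx <- c(sapply(inx, function(i) i:(i + by - 1)))
    # drop indices that run past the last column
    inx[inx <= n]
}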

m <- matrix(0, nrow = 1, ncol = 20)
icol <- col_indexes(m, 5, 3)
icol
[1]  5  6  7 11 12 13 17 18 19
Rui Barradas

Here is a method using outer.

c(outer(5:7, seq(0L, 95L, 6L), "+"))
[1]  5  6  7 11 12 13 17 18 19 23 24 25 29 30 31 35 36 37 41 42 43 47 48 49 53 
[26] 54 55 59 60 61 65 66 67 71 72 73 77 78 79 83 84 85 89 90 91 95 96 97

To generalize this, you could do

idx <- c(outer(seq(i, i + n - 1), seq(0L, ncol(M) - i, 2 * n), "+"))

The idea is to construct the initial set of columns (5:7 or seq(i, i + n - 1)), calculate the offset to every subsequent set (seq(0L, 95L, 6L) or seq(0L, ncol(M) - i, 2 * n)), then use outer to calculate the sum of every combination of these two vectors.

You can then subset the matrix using [, as in M[, idx].
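
The generalization can also be wrapped in a small function. This is only a sketch: the name outer_cols is hypothetical, it assumes blocks of exactly n columns, and the final filter is an added assumption that indices past the last column should be dropped when the final block does not fit.

```r
# Hypothetical wrapper around the outer() approach.
outer_cols <- function(ncols, n, i) {
  block   <- seq(i, i + n - 1)                # first block of n columns
  offsets <- seq(0L, ncols - i, by = 2 * n)   # shift for each later block
  idx <- c(outer(block, offsets, "+"))        # column-major, so block by block
  idx[idx <= ncols]                           # drop indices beyond the matrix
}

M <- matrix(runif(100^2), ncol = 100)
idx <- outer_cols(ncol(M), n = 3, i = 5)
dim(M[, idx])
# [1] 100  48
```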

lmo