Clumsy title, but let me explain what's in it:

My initial matrix looks like this:

kitty <- matrix(  
  c(1, 2, 4, 0, 0, 0, 0, 0, 0, 3, 1, 2), 
  nrow=3, 
  ncol=4)

returning:

X1 X2 X3 X4
1  0  0  3
2  0  0  1
4  0  0  2

I delete all columns of zeros

kitty<- kitty[, colSums(kitty != 0) > 0]

as I have to run a particular set of econometric routines over the matrix. That yields the following (the numbers here are arbitrary; what matters is that columns 2 and 3 are no longer in it, and that my methodology does not allow me to name columns either):

kitty2 <- matrix(  
  c(2, 3, 4, 1, 3, 8), 
  nrow=3, 
  ncol=2)

X1 X4
2  1
3  3
4  8

What is an efficient way (I have hundreds of these matrices) to put the columns back in their initial positions, filling the missing columns with NAs or 0s?

X1 X2 X3 X4
2  NA NA  1
3  NA NA  3
4  NA NA  8
Luks
  • `kitty` is not created with column names. Are there steps you're skipping in your sample code? – r2evans Jan 19 '18 at 18:04
  • I just used column names to make clear to the reader which columns are kept. In my code, no column names are allowed. – Luks Jan 20 '18 at 11:58

2 Answers

1

So frankly I am not so sure that this is particularly efficient, but it is the best way I can think of to do it:

Grab the indices of the nonzero columns of kitty (do this before you drop them):

indices <- which(colSums(kitty != 0) > 0)

And then once you have kitty2, refill kitty with the values of the columns you changed.

kitty[,indices] <- kitty2
kitty
         [,1] [,2] [,3] [,4]
   [1,]    2    0    0    1
   [2,]    3    0    0    3
   [3,]    4    0    0    8

And then you could leave the columns as zeroes or change them to NA.
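
If you want NA rather than zeroes in the dropped columns, and you need to repeat this for many matrices, the same idea can be wrapped in a small function. This is only a sketch: `restore_columns`, `kept`, `originals` and `results` are names made up for illustration, and it assumes you still have each original matrix (or at least its kept column indices and column count) when the reduced result comes back.

```r
# Hypothetical helper (not part of the original answer): place the columns of
# `reduced` at positions `kept` in a matrix with `ncol_full` columns.
restore_columns <- function(reduced, kept, ncol_full, fill = NA) {
  reduced <- as.matrix(reduced)  # also handles a single kept column
  stopifnot(ncol(reduced) == length(kept))
  # Start from a matrix holding only the fill value ...
  full <- matrix(fill, nrow = nrow(reduced), ncol = ncol_full)
  # ... then overwrite the kept positions in one vectorized assignment.
  full[, kept] <- reduced
  full
}

kitty <- matrix(c(1, 2, 4, 0, 0, 0, 0, 0, 0, 3, 1, 2), nrow = 3, ncol = 4)
kept  <- which(colSums(kitty != 0) > 0)  # columns 1 and 4

# Stand-in for the output of the analysis on the reduced matrix.
kitty2 <- matrix(c(2, 3, 4, 1, 3, 8), nrow = 3, ncol = 2)

restored <- restore_columns(kitty2, kept, ncol(kitty))

# For many matrices, map the helper over parallel lists of originals/results.
originals <- list(kitty, kitty)
results   <- list(kitty2, kitty2)
restored_all <- Map(
  function(orig, res) {
    restore_columns(res, which(colSums(orig != 0) > 0), ncol(orig))
  },
  originals, results
)
```

Passing `fill = 0` gives zeroes instead of NA. When only one column survives, subsetting with `kitty[, kept, drop = FALSE]` keeps the reduced object a matrix rather than a plain vector.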

Walker in the City
  • `for` loops are not evil in themselves, but consider thinking in a "vectorized" mode: the column-index within `[,]` supports *1 or more* columns, so you can simplify this considerably with `kitty[,indices]`. (This actually applies to any dimension in an `array`, where a `matrix` is just a 2-dim `array`.) – r2evans Jan 19 '18 at 18:12
  • @r2evans That is a way cleaner way to do it! Thanks, I am going to edit my answer to reflect – Walker in the City Jan 19 '18 at 18:15
1

Instead of trying to recreate the removed columns, can you just assign NA from the outset?

kitty <- matrix(  
  c(1, 2, 4, 0, 0, 0, 0, 0, 0, 3, 1, 2), 
  nrow=3, 
  ncol=4)
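# Test for all-zero columns the same way the question does; colSums(kitty) > 0
# would misclassify columns whose entries are negative or cancel to zero.
kitty[, colSums(kitty != 0) == 0] <- NA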
kitty
#      [,1] [,2] [,3] [,4]
# [1,]    1   NA   NA    3
# [2,]    2   NA   NA    1
# [3,]    4   NA   NA    2
r2evans
  • Thanks, but not possible: I am running prcomp over the matrix, not allowing for any NA in the array – Luks Jan 20 '18 at 11:59