6

I face the following problem: I need many subsets of a big matrix. Actually I just need views as input for another function f(), so I don't need to change the values. However it seems, that R is terribly slow for this task, or I'm doing something wrong (which seems more likely). The toy example illustrates how much time it takes just to select the columns, and then use them in another function (in this toy example the primitive function sum()). As 'benchmark' I also test the calculation time against summing up the whole matrix, which is surprisingly faster. I also experimented with the package ref, however could not achieve any performance gain. So the key question is how to subset the matrix without copying it? I appreciate any help, Thanks!

library(microbenchmark)
library(ref)

m0 <- matrix(rnorm(10^6), 10^3, 10^3)
r0 <- refdata(m0)
microbenchmark(m0[, 1:900], sum(m0[, 1:900]), sum(r0[,1:900]), sum(m0))
Unit: milliseconds
             expr       min        lq      mean    median        uq
      m0[, 1:900] 10.087403 12.350751 16.697078 18.307475 19.054157
 sum(m0[, 1:900]) 11.067583 13.341860 17.286514 19.123748 19.990661
 sum(r0[, 1:900]) 11.066164 13.194244 16.869551 19.204434 20.004034
          sum(m0)  1.015247  1.040574  1.059872  1.049513  1.067142
       max neval
 58.238217   100
 25.664729   100
 23.505308   100
  1.233617   100

The benchmark task of summing the whole matrix takes 1.059872 milliseconds and is about 16 times faster than the other functions.

maxatSOflow
  • 269
  • 3
  • 10
  • I get an error when I try your example: `Error in r0[, 1:900] : incorrect number of dimensions` – 5th Sep 02 '17 at 17:27
  • `library('ref')` fixes it – dvantwisk Sep 02 '17 at 17:31
  • OK so what's the criteria here for a solution? Does it just have to be faster than what you're doing now or would it have to be faster than summing up the whole matrix? – Hack-R Sep 02 '17 at 17:44
  • 1
    Using `Rcpp` or `compiler` would probably give you the performance gains you need – Hack-R Sep 02 '17 at 18:06
  • @Hack-R The compiler is often not that useful. – F. Privé Sep 02 '17 at 18:25
  • @F.Privé Idk, it still gave a 50% performance boost, but of course your approaches were definitely better. Compiler did even better in this very similar thread: https://stackoverflow.com/questions/30806693/make-cumulative-sum-faster Well, anyway compiler is *easier* that Rcpp but nothing is better than the first solution you put (`sum(colSums(m0)[1:900])`). – Hack-R Sep 02 '17 at 18:26

2 Answers2

5

The problem with your solution is that the subsetting is allocating another matrix, which takes times.

You have two solutions:

If the time taken with sum on the whole matrix is okay with you, you could use colSums on the whole matrix and subset the result:

sum(colSums(m0)[1:900])

Or you could use Rcpp to compute the sum with subsetting without copying the matrix.

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double sumSub(const NumericMatrix& x,
              const IntegerVector& colInd) {

  double sum = 0;

  for (IntegerVector::const_iterator it = colInd.begin(); it != colInd.end(); ++it) {
    int j = *it - 1;
    for (int i = 0; i < x.nrow(); i++) {
      sum += x(i, j);
    }
  }

  return sum;
}

    microbenchmark(m0[, 1:900], sum(m0[, 1:900]), sum(r0[,1:900]), sum(m0),
                   sum(colSums(m0)[1:900]),
                   sumSub(m0, 1:900))
Unit: milliseconds
                    expr      min       lq     mean   median       uq       max neval
             m0[, 1:900] 4.831616 5.447749 5.641096 5.675774 5.861052  6.418266   100
        sum(m0[, 1:900]) 6.103985 6.475921 7.052001 6.723035 6.999226 37.085345   100
        sum(r0[, 1:900]) 6.224850 6.449210 6.728681 6.705366 6.943689  7.565842   100
                 sum(m0) 1.110073 1.145906 1.175224 1.168696 1.197889  1.269589   100
 sum(colSums(m0)[1:900]) 1.113834 1.141411 1.178913 1.168312 1.201827  1.408785   100
       sumSub(m0, 1:900) 1.337188 1.368383 1.404744 1.390846 1.415434  2.459361   100

You could use unrolling optimization to further optimize the Rcpp version.

F. Privé
  • 11,423
  • 2
  • 27
  • 78
  • Yes, this is the answer. +1. I think my comment calling it out before we implemented it deserves a +1 too tho ;) Really though, if `sum(colSums(m0)[1:900])` does that well we don't even need the more cumbersome `Rcpp`. – Hack-R Sep 02 '17 at 18:23
  • Thanks for the response. Yes I think you are abslutely right. The main issue is the allocation of memory for the copy of the matrix. Is it possible to do this in R without copying the matrix? I need the matrix subset as input for another selfwritten function, the sum() function just deals as example here. – maxatSOflow Sep 02 '17 at 18:49
  • Basically, I make all my functions so that they can opper on a subset of a matrix. If you don't want to make a copy, you'll need only functions like this. – F. Privé Sep 02 '17 at 20:45
0

Using compiler I wrote a function that gets the result about 2x as fast as your other methods (8x that of sum(m0) instead of 16x):

require(compiler)

compiler_sum <- cmpfun({function(x) {
     tmp <- 0
     for (i in 1:900)
         tmp <- tmp+sum(x[,i])
     tmp
}})

microbenchmark( 
               sum(m0),
               compiler_sum(m0)
               )
Unit: milliseconds
             expr      min       lq     mean   median      uq       max
          sum(m0) 1.016532 1.056030 1.107263 1.084503 1.11173  1.634391
 compiler_sum(m0) 7.655251 7.854135 8.000521 8.021107 8.29850 16.760058
 neval
   100
   100
Hack-R
  • 22,422
  • 14
  • 75
  • 131