
I have written a function in both R and Rcpp. It takes a data set x, a scale parameter gamma, and a parameter vector beta, and returns the fitted probabilities. Below is my code in R:

tp_predict <- function(x, gamma, beta){
  n <- nrow(x)
  x <- as.matrix(cbind(1, x))   # prepend the intercept column
  vec <- x %*% beta             # linear predictor
  In <- vec < 0
  Ip <- !In
  vecN <- vec[In]
  vecP <- vec[Ip]
  fitted <- numeric(n)
  t <- tanh(gamma)
  b <- t + 1
  a <- 1 - t
  fitted[In] <- b/(exp(-vecN/b) + 1)       # negative branch
  fitted[Ip] <- a/(exp(-vecP/a) + 1) + t   # non-negative branch
  fitted <- as.vector(fitted)
  return(fitted)
}

In Rcpp:

// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace arma;

// [[Rcpp::export]]
arma::vec tp_predict_c(arma::mat x, double gamma, arma::vec beta){
  int n = x.n_rows;
  arma::vec one(n, fill::ones);
  arma::mat xx = join_rows(one, x);   // prepend the intercept column
  arma::vec v = xx * beta;            // linear predictor
  arma::uvec In = find(v < 0);
  arma::uvec Ip = find(v >= 0);
  arma::vec vecN = v.elem(In);
  arma::vec vecP = v.elem(Ip);
  double t = std::tanh(gamma);
  double a = 1 - t;
  double b = 1 + t;
  arma::vec fitted(n);
  fitted.elem(In) = b/(exp(-vecN/b) + 1);
  fitted.elem(Ip) = a/(exp(-vecP/a) + 1) + t;
  return fitted;
}

The test data x were generated from a multivariate normal distribution using rmvnorm(). In low dimensions (10 columns) and for small data sizes (1000 rows), the Rcpp code is faster than the R code:

Unit: microseconds
                           expr  min    lq    mean median    uq     max neval
   tp_predict(x = X, 0.3, Beta) 62.1 63.85 173.640   65.5 66.85 10868.9   100
 tp_predict_c(x = X, 0.3, Beta) 30.7 31.15  40.714   31.9 32.75   781.6   100

but if I increase the size to 100000 (the dimension is still 10), the Rcpp code becomes slower than the R code:

Unit: milliseconds
                           expr    min      lq     mean  median     uq      max neval
   tp_predict(x = X, 0.3, Beta) 5.8967 6.15060 8.477195 6.21560 6.7766 131.9571   100
 tp_predict_c(x = X, 0.3, Beta) 8.6547 8.82975 9.102348 9.01935 9.2005  12.6010   100

Also, if I increase the dimension to 100 while the size remains 1000, the Rcpp function does not perform as well as it does in low dimensions:

Unit: microseconds
                           expr   min    lq    mean median    uq   max neval
   tp_predict(x = X, 0.3, Beta) 188.7 197.0 204.745 201.85 206.5 280.5   100
 tp_predict_c(x = X, 0.3, Beta) 166.1 171.1 189.729 181.20 189.9 511.0   100

It looks like the CPU time of the Rcpp function scales more rapidly than that of the R function as the dimension and size increase. I am new to Rcpp and simply translated my R code directly into Rcpp. I am not sure whether I did something silly, and I have no idea why this is happening. Is there anything I can do to address this?

  • Are you doing anything in the C++ version that R isn't already optimised for? R is pretty efficient at doing all those vectorised operations. – pseudospin Aug 16 '20 at 22:15
  • Change the definition of your function to `arma::vec tp_predict_c(const arma::mat& x, double gamma, const arma::vec& beta)`. Note the `&` character, which is very important. More details in the answer below. – mtall Aug 17 '20 at 01:39

1 Answer


Your code copies the given vector and matrix on entry to the tp_predict_c() function, which is inefficient and not required.

By default, C++ passes objects "by value", which means "copy this object". You need to tell the compiler to avoid copies by passing each object "by reference".
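As a minimal plain-Armadillo sketch of the difference between the two calling conventions (the function names sum_by_value and sum_by_ref are made up purely for illustration):

#include <armadillo>

// Pass by value: the caller's matrix is duplicated when the function is entered.
double sum_by_value(arma::mat x) { return arma::accu(x); }

// Pass by const reference: the function reads the caller's matrix in place; no copy is made.
double sum_by_ref(const arma::mat& x) { return arma::accu(x); }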

To pass by reference, simply change the definition of the function to:

arma::vec tp_predict_c(const arma::mat& x, double gamma, const arma::vec& beta)

Note the const keyword and the & character in const arma::mat& and const arma::vec&.

The const tells the compiler this object won't be modified, and the & tells the compiler to pass by reference. Both of these allow extra optimizations to take place.
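Applied to the function from the question, the revised definition might look like this (a sketch that only changes how the arguments are passed; the body is the same as in the question):

// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
arma::vec tp_predict_c(const arma::mat& x, double gamma, const arma::vec& beta){
  int n = x.n_rows;
  arma::vec one(n, arma::fill::ones);
  arma::mat xx = arma::join_rows(one, x);   // prepend the intercept column
  arma::vec v = xx * beta;                  // linear predictor
  arma::uvec In = arma::find(v < 0);
  arma::uvec Ip = arma::find(v >= 0);
  arma::vec vecN = v.elem(In);
  arma::vec vecP = v.elem(Ip);
  double t = std::tanh(gamma);
  double a = 1 - t;
  double b = 1 + t;
  arma::vec fitted(n);
  fitted.elem(In) = b / (arma::exp(-vecN / b) + 1);
  fitted.elem(Ip) = a / (arma::exp(-vecP / a) + 1) + t;
  return fitted;
}

With sourceCpp(), the // [[Rcpp::export]] attribute keeps the function callable from R exactly as before.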

mtall