0

I'm rewriting some old code to use a new type of parameter representation. The old version was using matrices to represent parameters, while the new is using a List and an arma::fcube. I got a performance loss that's only observable when the function is called multiple times from within another function:

The old function ConditionalProbs is almost 2 times slower than the new function called conditional_probabilities. On the other hand, multiple_times_old(which calls ConditionalProbs multiple times) is around 4 times faster than multiple_times_new, even when the only difference between this functions is the functions called multiple times.

I apologize for the long code: Here is my old code (note that I was using NumericMatrix but I did improve the speed by changing it IntegerMatrix on the new code)

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
NumericVector ConditionalProbs(NumericMatrix X, IntegerVector position,  int C, NumericMatrix cMat, NumericMatrix vMat, NumericVector V) {

  int n = X.nrow(); int m = X.ncol();
  int n_neighbors = cMat.nrow();
  int x = position[0] -1; int y = position[1] -1;
  int neix, neiy;
  NumericVector p(C + 1);
  IntegerVector vals = seq_len(C+1) - 1;
  double U;
  int dif;

  for(int value = 0; value <= C; value++){
    U = V[value];
    for(int ne=0; ne < n_neighbors; ne++){
      neix = x + cMat(ne,0); neiy = y + cMat(ne,1);
      if(neix < n && neix >=0 && neiy < m && neiy>=0){
        dif = X(neix,neiy) - vals[value] ;
        U = U + vMat(ne,dif+C);
      }
    }
    p[value] = exp(U);
  }
  p = p/sum(p);
  return(p);
}


// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
NumericMatrix multiple_times_old(NumericMatrix X, NumericMatrix cMat, NumericMatrix vMat, NumericVector V, int C, int n_times){
  NumericVector probability;
  int N = X.nrow(); int M = X.ncol();
  int x,y;
  IntegerVector position(2);
  for(int i = 0; i < n_times; i++){
    x = 2;
    y = 2;
    position[0] = x; position[1] = y;
    probability = ConditionalProbs(X, position, C, cMat, vMat, V);
  }
  return(X);
}

Now the new version is:

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
NumericVector conditional_probabilities(IntegerMatrix X, IntegerVector position, List R, arma::fcube theta){
  int n_neighbors = theta.n_rows;
  int C = theta.n_cols - 1;
  int x = position[0] - 1; int y = position[1] - 1;
  int N = X.nrow(); int M = X.ncol();

  IntegerVector this_pos;
  NumericVector probs(C+1);
  float this_prob;
  int dx, dy;

  for(int value = 0; value <= C; value++){
    this_prob = 0;
    for(int i = 0; i < n_neighbors; i++){
      this_pos = as<IntegerVector>(R[i]);
      dx = this_pos[0]; dy = this_pos[1];
      if(0 <= x+dx && x+dx < N && 0 <= y+dy && y+dy < M){
        this_prob = this_prob + theta(i, value, X(x+dx, y+dy));}
      //if(0 <= x-dx && x-dx < N && 0 <= y-dy && y-dy < M){
      // this_prob = this_prob + theta(i, X(x-dx, y-dy), value);}
    }
    probs[value] = exp(this_prob);
  }
  return(probs/sum(probs));
}

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
IntegerMatrix multiple_times_new(IntegerMatrix X, List R, arma::fcube theta, int n_times){
  NumericVector probability;
  int N = X.nrow(); int M = X.ncol();
  int x,y;
  IntegerVector position(2);
  for(int i = 0; i < n_times; i++){
    x = 2;
    y = 2;
    position[0] = x; position[1] = y;
    probability = conditional_probabilities(X, position, R, theta);
  }
  return(X);
}

I wrote benchmarks to test the functions:

library(Rcpp)
sourceCpp("bench.cpp") #All the c++ functions are on this file

# parameters for old functions
cMat <- matrix(c(1,0,0,1), nrow = 2, byrow = TRUE)
vMat <- matrix(c(-1,-1,-1,-1,1,-1,-1,-1,-1,
                 -1,-1,-1,-1,1,-1,-1,-1,-1), nrow = 2, byrow = TRUE)
vMat <- rbind(vMat, vMat) + 0.0
cMat <- rbind(cMat, -cMat)
V <- rep(0.0,5)

# parameters for new functions
R <- list(c(1L,0L), c(0L,1L), c(-1L,0L), c(0L, -1L))
theta <- array(0, c(4,5,5))
theta[1,,] <- theta[2,,] <- theta[3,,] <- theta[4,,] <- diag(rep(2,5)) - 1.0

X <- matrix(sample(0:4,64*64, replace = TRUE), nrow = 64)

library(rbenchmark)

benchmark(ConditionalProbs = ConditionalProbs(X, c(30,30), 4, cMat, vMat, V),
          conditional_probabilities = conditional_probabilities(X, c(30,30), R, theta), 
          replications = 50000)[ ,c("test", "relative","elapsed", "replications")]

                       test relative elapsed replications
2 conditional_probabilities    1.000   0.314        50000
1          ConditionalProbs    1.538   0.483        50000

benchmark(multiple_times_old = multiple_times_old(X, cMat2, vMat2, V, 4, 50000),
          multiple_times_new = multiple_times_new(X, R, theta, 50000),
          replications = 10)[ ,c("test", "relative", "elapsed", "replications")]

                test relative elapsed replications
2 multiple_times_new    4.359   0.959           10
1 multiple_times_old    1.000   0.220           10

Why is multiple_times_new slower than multiple_times_old if the function called by the first is faster than the one called by the second?

VFreguglia
  • 2,129
  • 4
  • 14
  • 35
  • 2
    Can you add the explicit function calls in R you used for your benchmarks? – Ralf Stubner Apr 07 '19 at 08:58
  • I've included them to the question, I think everything is reproducible now. They were originally in different packages, but using them via `sourceCpp` as in the question kept the results nearly unchanged. – VFreguglia Apr 07 '19 at 13:43
  • What's the definition of `init` used in the second benchmark? – Ralf Stubner Apr 07 '19 at 15:34
  • It's all `X` actually, I just messed up copying those lines from the wrong file while testing. Edited now, now it reflects the actual code used in the benchmarks. – VFreguglia Apr 07 '19 at 15:40
  • I have the impression that garbage collection is kicking in a lot when `multiple_times_new` runs. – Ralf Stubner Apr 07 '19 at 17:48
  • 1
    I'm not familiar with the concept of garbage collection, but I went reading something about and got the impression that it has to do with the use of lists and the type conversions needed with their use. Using an `IntegerMatrix` instead of `List` made them equally fast when called multiple times. When calling functions one time there is still a 2x relative difference. What a mystery. – VFreguglia Apr 07 '19 at 18:40

0 Answers0