2

I am probably being greedy for performance, but I've observed significant performance gains when combining Rcpp and OpenMP in possibly illict ways. I understand that "Calling any of the R API from threaded code is ‘for experts only’ and strongly discouraged." but I'm don't fully understand when C++ code may be implicitly doing this on Rcpp vectors. I understand RcppParallel has the RVector class but I understand this may involve taking a copy of the vector, which may wash away any performance gains.

I note the Rcpp gallery has (https://gallery.rcpp.org/articles/gerber-statistic/) which appears to access a NumericMatrix HIST_RETURN_RAW inside a parallel loop, so it seems "some" access is allowed, but I know/believe that some wrappers (like Rcpp::List) would not be permitted. Is atomicity the distinguishing characteristic?

Concretely, are any of the following acceptable uses of OpenMP? (i.e. are they all threadsafe, compliant with R's memory management, and free from undefined behaviour?)

#include <Rcpp.h>
#ifdef _OPENMP
#include <omp.h>
#endif
using namespace Rcpp;


// 1. No 'outside' R vector, but IntegerVector created outside omp region

// [[Rcpp::export]]
IntegerVector fastInitalize(int n, int nThread = 1) {
  IntegerVector out = no_init(n);
#pragma omp parallel for num_threads(nThread)
  for (int i = 0; i < n; ++i) {
    out[i] = 0;
  }
  return out;
}


// 2. Simple access

// [[Rcpp::export]]
IntegerVector AddOMP(IntegerVector x, IntegerVector y, int nThread = 1) {
  R_xlen_t N = x.length();
  if (N != y.length()) {
    stop("Lengths differ");
  }
  IntegerVector out = no_init(N);

#pragma omp parallel for num_threads(nThread)
  for (R_xlen_t i = 0; i < N; ++i) {
    out[i] = x[i] + y[i];
  }
  return out;
}


// 3. Access, copy inside Rcpp

// [[Rcpp::export]]
IntegerVector pmax0xy(IntegerVector x, IntegerVector y, int nThread = 1) {
  R_xlen_t N = x.length();
  if (N != y.length()) {
    stop("Lengths differ");
  }
  IntegerVector out = clone(y);

#pragma omp parallel for num_threads(nThread)
  for (R_xlen_t i = 0; i < N; ++i) {
    if (x[i] > 0) {
      out[i] = 0;
    }
  }
  return out;
}


// 4. Access with omp + reduction

// [[Rcpp::export]]
int firstNonzero(IntegerVector x, int nThread = 1) {
  R_xlen_t N = x.length();
  int out = N;
#pragma omp parallel for num_threads(nThread) reduction(min : out)
  for (R_xlen_t i = 0; i < N; ++i) {
    if (x[i] != 0) {
      out = (i < out) ? i : out;
    }
  }
  return out;
}


// 5. Access with omp array reduction

// [[Rcpp::export]]
IntegerVector count_one_to_ten(IntegerVector x, int nThread = 1) {
  R_xlen_t N = x.length();
  if (N >= INT_MAX) {
    stop("Possibly too many numbers.");
  }
  const int TEN = 10;
  int numbers[TEN] = {}; // what if 10 was large?
#if defined _OPENMP && _OPENMP >= 201511
#pragma omp parallel for num_threads(nThread) reduction(+:numbers[:TEN])
#endif
  for (R_xlen_t i = 0; i < N; ++i) {
    int xi = x[i];
    if (xi < 1 || xi > 10) {
      continue;
    }
    numbers[xi - 1] += 1;
  }
  IntegerVector out(TEN);
  for (int n = 0; n < TEN; ++n) {
    out[n] = numbers[n];
  }
  return out;
}



// You can include R code blocks in C++ files processed with sourceCpp
// (useful for testing and development). The R code will be automatically
// run after the compilation.
//

/*** R
x <- sample(1:1200, size = 1e6, replace = TRUE)
y <- sample(1:1200, size = 1e6, replace = TRUE)

fastInitalize(1e6)[1]

head(AddOMP(x, y))
head(AddOMP(x, y, 2))

head(pmax0xy(x, y))
head(pmax0xy(x, y, 2))

firstNonzero(x)
firstNonzero(x, 2)

count_one_to_ten(x, 2)
*/

Hugh
  • 15,521
  • 12
  • 57
  • 100
  • 4
    The rule (as stated explicitly, if I recall, in the RcppParallel documenation, and probably more implicitly in R's own Writing R Extensions) is simple: no calls whatsoever into R's API from multithreaded code. So to be safe, no Rcpp types as they use R's memory management. You can try to play games and attempt read only access without any temporary objects etc pp but remember that the compiler may create some for you too. – Dirk Eddelbuettel May 07 '20 at 11:45
  • Usually, read-only accesses are okay. At least, I've never had any problem with them. – F. Privé May 11 '20 at 07:55

0 Answers0