14

I am trying to convert RcppArmadillo vector (e.g. arma::colvec) to a Rcpp vector (NumericVector). I know I can first convert arma::colvec to SEXP and then convert SEXP to NumericVector (e.g. as<NumericVector>(wrap(temp)), assuming temp is an arma::colvec object). But what is a good way to do that?

I want to do that simply because I am unsure if it is okay to pass arma::colvec object as a parameter to an Rcpp::Function object.

Artem Klevtsov
  • 9,193
  • 6
  • 52
  • 57
Raymond Wong
  • 149
  • 1
  • 4
  • 3
    what happened when you tried it ? Trying something is a good way of testing if if works and when it does not work, often the compiler tells you why. – Romain Francois Jan 10 '13 at 11:13
  • 1
    And if and when it works, you could even time it and compare different ways of doing it... – Dirk Eddelbuettel Jan 10 '13 at 18:19
  • 'as(wrap(temp)' has been the way that I used. It is compiled without any error and returns correct answer. But recently, when I put my function to a unix cluster for running big simulation. I encounter some errors like: – Raymond Wong Jan 10 '13 at 18:41
  • Sorry I mistyped an enter and passed the 5 mins for editing... My comments continue as: I encounter some errors like:'Rcpp::eval_error in eval(expr, envir, enclos): unused argument(s) (error = function (e)', 'Rcpp::eval_error in eval(expr, envir, enclos): promise already under evaluation: recursive default argument reference or earlier problems?', ... My entire program performs deterministic job, but when I reran the exact same codes without the above error. So it confuses me. 'as(wrap(temp)' is the part that I am curious if I am right to do that. – Raymond Wong Jan 10 '13 at 18:57
  • What versions of R, Rcpp and RcppArmadillo are installed on the cluster? – Dirk Eddelbuettel Jan 11 '13 at 17:30
  • R (2.15.2), Rcpp (0.10.2) and RcppArmadillo (0.3.6.1). I also found that the problem seems to only occur for those R sessions started in the slave node through slurm. – Raymond Wong Jan 12 '13 at 06:58
  • Almost surely due to slurm invoking different versions. – Dirk Eddelbuettel Jan 12 '13 at 20:07

3 Answers3

9

I was trying to Evaluate a Rcpp::Function with argument arma::vec, it seems that it takes the argument in four forms without compilation errors. That is, if f is a Rcpp::Function and a is a arma::vec, then

  1. f(a)
  2. f(wrap(a))
  3. f(as<NumericVector>(wrap(a)))
  4. f(NumericVector(a.begin(),a.end()))

produce no compilation and runtime errors, at least apparently.

For this reason, I have conducted a little test for the four versions of arguments. Since I suspect that somethings will go wrong in garbage collection, I test them again gctorture.

gctorture(on=FALSE)
Rcpp::sourceCpp(code = '
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

using namespace Rcpp;

// [[Rcpp::export]]
double foo1(arma::vec a, arma::vec b, Function f){
    double sum = 0.0;
    for(int i=0;i<100;i++){
        sum += as<double>(f(a, b));
    }
    return sum;
}

// [[Rcpp::export]]
double foo2(arma::vec a, arma::vec b, Function f){
    double sum = 0.0;
    for(int i=0;i<100;i++){
        sum += as<double>(f(wrap(a),wrap(b)));
    }
    return sum;
}

// [[Rcpp::export]]
double foo3(arma::vec a, arma::vec b, Function f){
    double sum = 0.0;
    for(int i=0;i<100;i++){
        sum += as<double>(f(as<NumericVector>(wrap(a)),as<NumericVector>(wrap(b))));
    }
    return sum;
}

// [[Rcpp::export]]
double foo4(arma::vec a, arma::vec b, Function f){
    double sum = 0.0;
    for(int i=0;i<100;i++){
        sum += as<double>(f(NumericVector(a.begin(),a.end()),NumericVector(b.begin(),b.end())));
    }
    return sum;
}
')
# note that when gctorture is on, the program will be very slow as it
# tries to perfrom GC for every allocation.
# gctorture(on=TRUE)
f = function(x,y) {
    mean(x) + mean(y)
}
# all three functions should return 700
foo1(c(1,2,3), c(4,5,6), f) # error
foo2(c(1,2,3), c(4,5,6), f) # wrong answer (occasionally)!
foo3(c(1,2,3), c(4,5,6), f) # correct answer
foo4(c(1,2,3), c(4,5,6), f) # correct answer

As a result, the first method produces an error, the second method produces a wrong answer and only the third and the fourth method return the correct answer.

> # they should return 700
> foo1(c(1,2,3), c(4,5,6), f) # error
Error: invalid multibyte string at '<80><a1><e2>'
> foo2(c(1,2,3), c(4,5,6), f) # wrong answer (occasionally)!
[1] 712
> foo3(c(1,2,3), c(4,5,6), f) # correct answer
[1] 700
> foo4(c(1,2,3), c(4,5,6), f) # correct answer
[1] 700

Note that, if gctorture is set FALSE, then all functions will return a correct result.

> foo1(c(1,2,3), c(4,5,6), f) # error
[1] 700
> foo2(c(1,2,3), c(4,5,6), f) # wrong answer (occasionally)!
[1] 700
> foo3(c(1,2,3), c(4,5,6), f) # correct answer
[1] 700
> foo4(c(1,2,3), c(4,5,6), f) # correct answer
[1] 700

It means that method 1 and method 2 are subjected to break when garbage is collected during runtime and we don't know when it happens. Thus, it is dangerous to not wrap the parameter properly.

Edit: as of 2017-12-05, all four conversions produce the correct result.

  1. f(a)
  2. f(wrap(a))
  3. f(as<NumericVector>(wrap(a)))
  4. f(NumericVector(a.begin(),a.end()))

and this is the benchmark

> microbenchmark(foo1(c(1,2,3), c(4,5,6), f), foo2(c(1,2,3), c(4,5,6), f), foo
3(c(1,2,3), c(4,5,6), f), foo4(c(1,2,3), c(4,5,6), f))
Unit: milliseconds
                            expr      min       lq     mean   median       uq
 foo1(c(1, 2, 3), c(4, 5, 6), f) 2.575459 2.694297 2.905398 2.734009 2.921552
 foo2(c(1, 2, 3), c(4, 5, 6), f) 2.574565 2.677380 2.880511 2.731615 2.847573
 foo3(c(1, 2, 3), c(4, 5, 6), f) 2.582574 2.701779 2.862598 2.753256 2.875745
 foo4(c(1, 2, 3), c(4, 5, 6), f) 2.378309 2.469361 2.675188 2.538140 2.695720
      max neval
 4.186352   100
 5.336418   100
 4.611379   100
 3.734019   100

And f(NumericVector(a.begin(),a.end())) is marginally faster than other methods.

Randy Lai
  • 3,084
  • 2
  • 22
  • 23
  • Let's just think this through for a second. You are using `eval()` to let R evaluate an object managed by Armadillo data structures, and you are relying on implicit conversion. I would at a minimum explicitly pass from Armadillo data structure to Rcpp data structures. Besides, `eval()` is a very last resort which is almost more or less foregoing the gains of operating in C++ in the first place. – Dirk Eddelbuettel Mar 12 '14 at 11:24
  • It is true that writing a R function and passing it to Rcpp performs much worse than writing a c function. But it is sometimes necessary to allow user defined functions, an example would be approximating first derivative of a given function. – Randy Lai Mar 12 '14 at 14:00
  • 2
    I appreciate that you agree that it is a bad idea that ought to be avoided. Now on to my second point: if you really think you must call R from C(++), do it with an R data structure. – Dirk Eddelbuettel Mar 12 '14 at 14:11
6

This should works with arma::vec, arma::rowvec and arma::colvec:

template <typename T>
Rcpp::NumericVector arma2vec(const T& x) {
    return Rcpp::NumericVector(x.begin(), x.end());
}
Artem Klevtsov
  • 9,193
  • 6
  • 52
  • 57
1

I had the same question. I used wrap to do the conversion at the core of several layers of for loops and it was very slow. I think the wrap function is to blame for dragging the speed down so I wish to know if there is an elegant way to do this.

As for Raymond's question, you might want to try including the namespace like: Rcpp::as<Rcpp::NumericVector>(wrap(A)) instead or include a line using namespace Rcpp; at the beginning of your code.

Clavichord
  • 75
  • 5