While exploring Rcpp, I realized that the following swap function

// swap.cpp
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void swap(NumericVector x) {
  double tmp = x[0];
  x[0] = x[1];
  x[1] = tmp;
}

does not perform the swap when passed an integer vector. For example,

x <- 1:2
str(x)
# int [1:2] 1 2
swap(x)
x
# [1] 1 2

However,

y <- c(1,2)
str(y)
# num [1:2] 1 2
swap(y)
y
# [1] 2 1

works fine. My suspicion is that when swap is passed an integer vector x it is forced to make a copy of x that is converted to a NumericVector. Then anything performed on the copy of x does not effect the original variable that was passed. Is this reasoning correct? If so, why does the conversion have to result in a copy? Is there a way to write a more robust swap function in which we wouldn't have to worry about accidentally passing an integer vector when we should be passing a numeric vector?

I apologize if this question has been asked before, but I could not find a suitable answer.

EDIT:

The code below does indeed show that a copy of the object is made when an integer vector is passed to swap instead of a numeric vector.

// [[Rcpp::export]]
void where(SEXP x) {
  Rcout << x << std::endl;
}

// [[Rcpp::export]]
void swap(NumericVector x) {
  double tmp = x[0];
  x[0] = x[1];
  x[1] = tmp;
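  // Cast to SEXP so the address of the underlying object is printed, not its contents.
  Rcout << "During swap function: " << static_cast<SEXP>(x) << std::endl;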
}

/*** R
test_swap <- function(x) {
  cat("Before the swap function: ") 
  cat(where(x))
  swap(x)
  cat("After the swap function: ") 
  cat(where(x))
}

y <- c(1, 2) # type num
x <- 1:2 # type int

test_swap(y) # swap works because type matches function
#> Before the swap function: 0x116017bf8
#> During swap function: 0x116017bf8
#> After the swap function: 0x116017bf8

test_swap(x) # swap does not work because type does not match function
#> Before the swap function: 0x10d88e468
#> During swap function: 0x116015708
#> After the swap function: 0x10d88e468
*/
petew
  • You're right, the conversion is what produces a copy, which keeps the call free of side effects. One strength (of many) of Rcpp is that it does not *mandate* a fresh copy of the data in the function. If you want to force a copy, you can use `y = clone(x)` and operate on `y`. You can read [these previous SO questions](http://stackoverflow.com/search?q=[rcpp]+clone) about the subject. – r2evans Jun 10 '15 at 16:17
  • Thanks @r2evans, but I am still unclear on how to write a more robust swap function in this case where we wouldn't run into this problem. – petew Jun 10 '15 at 16:34
  • As long as you are using `NumericVector x`, it will always either (a) be copied for you, or (b) not be copied for you, all before your code gets to see anything. You need to make the function generic, using templates or `SEXP`, both of which are shown [here](http://gallery.rcpp.org/articles/fast-factor-generation/). – r2evans Jun 10 '15 at 16:40
  • [This previous SO question](http://stackoverflow.com/questions/19823915/how-can-i-handle-vectors-without-knowing-the-type-in-rcpp) may also help. – r2evans Jun 10 '15 at 16:41
  • @r2evans I would accept that as the correct answer – petew Jun 10 '15 at 17:06
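
The distinction the comments draw between sharing the data and forcing a copy with `clone` can be sketched in plain C++. This is an analogy under stated assumptions, not Rcpp code: `Handle`, `deep_copy`, and `swap_via_handle` are hypothetical names. A `std::shared_ptr` loosely plays the role of an Rcpp vector object wrapping an R object, and `deep_copy` plays the role of `clone`.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical handle type: copying it shares the underlying buffer,
// loosely like an Rcpp vector object wrapping the same R object.
using Handle = std::shared_ptr<std::vector<double>>;

// Hypothetical deep copy, playing the role clone() plays in the comment:
// the result owns a separate buffer with the same contents.
Handle deep_copy(const Handle& h) {
  return std::make_shared<std::vector<double>>(*h);
}

// The handle is passed by value, yet the swap is visible to every other
// handle that shares the same buffer.
void swap_via_handle(Handle h) {
  assert(h && h->size() >= 2);  // precondition: non-null, two elements
  std::swap((*h)[0], (*h)[1]);
}
```

Calling `swap_via_handle` on a plain copy of a handle changes the shared data, while calling it on the result of `deep_copy` leaves the original untouched.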

1 Answer

Building on @r2evans' comments, here's a minimal implementation:

#include <Rcpp.h>

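// RTYPE is the SEXP type code (INTSXP, REALSXP, ...), so the vector is
// wrapped as its own type and no converting copy is made.
template <int RTYPE>
void swap_templ(Rcpp::Vector<RTYPE> x) {
  // Use the element type that matches RTYPE (int or double) for the temporary.
  typename Rcpp::traits::storage_type<RTYPE>::type tmp = x[0];
  x[0] = x[1];
  x[1] = tmp;
}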
// [[Rcpp::export]]
void swap(SEXP x) {
  switch (TYPEOF(x)) {
  case INTSXP: 
    swap_templ<INTSXP>(x);
    break;
  case REALSXP:
    swap_templ<REALSXP>(x);
    break;
  default:
    Rcpp::Rcout <<
      "\nInput vector must be numeric or integer type" <<
      std::endl;
    break;
  }
}

/*** R
iv <- 1L:3L
dv <- 1:3 + 0.5

class(iv)
# [1] "integer"

class(dv)
# [1] "numeric"

swap(iv); iv
# [1] 2 1 3

swap(dv); dv
# [1] 2.5 1.5 3.5

class(iv)
# [1] "integer"

class(dv)
# [1] "numeric"
*/
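
The dispatch pattern above can also be sketched without Rcpp, which may make the mechanics easier to see. Everything here is a hypothetical stand-in rather than R or Rcpp API: `Kind` plays the role of the type code returned by `TYPEOF`, `TaggedBuffer` plays the role of a `SEXP`, and `swap_first_two_as` plays the role of `swap_templ`. The point is that the type tag is inspected at run time, the matching template instantiation is called, and the swap happens in the caller's own storage with no conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical run-time type tag, analogous to INTSXP / REALSXP.
enum class Kind { Integer, Real, Other };

// Hypothetical untyped buffer: a tag, a pointer to the data, and a length.
struct TaggedBuffer {
  Kind kind;
  void* data;
  std::size_t size;
};

// Views the buffer as elements of type T and swaps the first two in place.
// The caller must pick the T that matches the buffer's tag.
template <typename T>
void swap_first_two_as(TaggedBuffer& b) {
  assert(b.size >= 2);  // precondition: at least two elements
  T* p = static_cast<T*>(b.data);
  std::swap(p[0], p[1]);
}

// Run-time dispatch on the tag; returns false for unsupported types
// instead of touching the data.
bool swap_any(TaggedBuffer& b) {
  switch (b.kind) {
  case Kind::Integer:
    swap_first_two_as<int>(b);
    return true;
  case Kind::Real:
    swap_first_two_as<double>(b);
    return true;
  default:
    return false;
  }
}
```

Because each branch works on the existing buffer through a pointer of the matching type, both integer and double inputs are modified in place, which is the behaviour the R session above shows for `iv` and `dv`.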
nrussell