
I want to replicate the following R function in Rcpp:

fR = function(x) x[1:2]

fR(c(1,2,3))
#[1] 1 2
fR(c('a','b','c'))
#[1] "a" "b"

I can do it for a fixed output type like so:

library(inline)
library(Rcpp)

fint = cxxfunction(signature(x = "SEXP"), '
          List xin(x);
          IntegerVector xout;

          for (int i = 0; i < 2; ++i) xout.push_back(xin[i]);

          return xout;', plugin = "Rcpp")

But this only works for integers. If I replace the type of xout with List (or GenericVector, which is the same type), it works with any input type, but I get back a list instead of a vector.

What's the correct Rcpp way of doing this?

eddi

2 Answers


Don't use push_back on Rcpp types. The way Rcpp vectors are currently implemented this requires copying all of the data each time. This is a very expensive operation.
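
To make the cost concrete, here is a minimal stand-alone C++ sketch. It does not use Rcpp: `FixedVec` is a hypothetical stand-in for a vector with no spare capacity, and the `copies` counter is an assumption added purely for illustration. It is a model of the behaviour described above, not Rcpp's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical model of a vector without spare capacity: every push_back
// allocates a buffer one element larger and copies the old contents over.
struct FixedVec {
    std::unique_ptr<int[]> data;
    std::size_t size = 0;
    std::size_t copies = 0;  // element copies caused by growing

    FixedVec() = default;
    explicit FixedVec(std::size_t n) : data(new int[n]()), size(n) {}

    void push_back(int value) {
        std::unique_ptr<int[]> grown(new int[size + 1]);
        for (std::size_t i = 0; i < size; ++i) {
            grown[i] = data[i];
            ++copies;
        }
        grown[size] = value;
        data = std::move(grown);
        ++size;
    }
};

// Fill by repeated push_back; returns how many element copies were needed.
std::size_t copies_with_push_back(std::size_t n) {
    FixedVec v;
    for (std::size_t i = 0; i < n; ++i) v.push_back(static_cast<int>(i));
    return v.copies;
}

// Allocate at the final size, then assign; no growing takes place.
std::size_t copies_with_preallocation(std::size_t n) {
    FixedVec v(n);
    assert(v.size == n);
    for (std::size_t i = 0; i < n; ++i) v.data[i] = static_cast<int>(i);
    return v.copies;
}
```

In this model, filling n elements with push_back costs 0 + 1 + ... + (n - 1) = n(n - 1)/2 element copies, while allocating the vector at its final size costs none.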

We have RCPP_RETURN_VECTOR for dispatching. It requires that you write a template function taking a Vector as input.

#include <Rcpp.h>
using namespace Rcpp ;

template <int RTYPE>
Vector<RTYPE> first_two_impl( Vector<RTYPE> xin){
    Vector<RTYPE> xout(2) ;
    for( int i=0; i<2; i++ ){
        xout[i] = xin[i] ;    
    }
    return xout ;
}

// [[Rcpp::export]]
SEXP first_two( SEXP xin ){
  RCPP_RETURN_VECTOR(first_two_impl, xin) ;
}

/*** R
    first_two( 1:3 )
    first_two( letters )
*/

Just sourceCpp this file; this also runs the R code chunk, which calls the function twice. Actually, the template could be simpler. This would work too:

template <typename T>
T first_two_impl( T xin){
    T xout(2) ;
    for( int i=0; i<2; i++ ){
        xout[i] = xin[i] ;    
    }
    return xout ;
}

The template parameter T only needs:

  • A constructor taking an int
  • An operator[](int)
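
As a quick check of those two requirements, the sketch below runs the same template on standard containers instead of Rcpp vectors. The `demo` function and the choice of `std::vector` are assumptions for illustration only; the caller is assumed to pass at least two elements.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same shape as the simpler template: T only needs a constructor taking
// an int and an operator[](int).
template <typename T>
T first_two_impl(T xin) {
    T xout(2);
    for (int i = 0; i < 2; i++) {
        xout[i] = xin[i];
    }
    return xout;
}

// Hypothetical usage: standard containers standing in for Rcpp vectors.
void demo() {
    std::vector<double> nums = {1.0, 2.0, 3.0};
    std::vector<std::string> strs = {"a", "b", "c"};
    assert(first_two_impl(nums) == (std::vector<double>{1.0, 2.0}));
    assert(first_two_impl(strs) == (std::vector<std::string>{"a", "b"}));
}
```

The element type never appears in the template body, which is why one definition serves every vector type that the dispatch step selects.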

Alternatively, this might be a job for dplyr vector visitors.

#include <dplyr.h>
// [[Rcpp::depends(dplyr,BH)]]

using namespace dplyr ;
using namespace Rcpp ;

// [[Rcpp::export]]
SEXP first_two( SEXP data ){
    VectorVisitor* v = visitor(data) ;
    IntegerVector idx = seq( 0, 1 ) ;
    Shield<SEXP> out( v->subset(idx) ) ;
    delete v ;
    return out ;
}

Visitors let you perform a set of operations on a vector regardless of the type of data it holds.

> first_two(letters)
[1] "a" "b"

> first_two(1:10)
[1] 1 2

> first_two(rnorm(10))
[1] 0.4647190 0.9790888
Romain Francois
  • Thanks Romain. This certainly answers the OP, but more generally I was expecting Rcpp to have a `Vector` class that would take an arbitrary vector-type SEXP pointer in the constructor and then would do appropriate operations, like subset or assignment and whatnot on that, doing the appropriate dispatches internally. It doesn't need me to tell it what the class of the SEXP pointer is by explicitly writing via the templates. I do get that when you specify class explicitly via the templates you avoid all the if/else's, but sometimes the user just doesn't know the class at compile time. – eddi Nov 07 '13 at 19:56
  • Sorry, it can't be done this way. Having a type system like Rcpp instead of a catch all and therefore useless type like SEXP is a good thing and a design decision. – Romain Francois Nov 07 '13 at 20:58
  • I don't think any of what I said disagreed with that. Had R internally done the same I'd be very happy, but since R doesn't do it and Rcpp is supposed to make life easier, imo it would make sense to have a universal vector type in Rcpp. – eddi Nov 07 '13 at 21:00
  • I think you still misunderstand. R is written in C. C has static types. A SEXP is simply an old-school union type, and you always have to branch anyway. Romain is simply hiding that here; look at what RCPP_RETURN_VECTOR does. And no, you can't do it differently given the system you are working in. – Dirk Eddelbuettel Nov 07 '13 at 22:25
  • @RomainFrancois That is a comment which ignores the historical context: When SEXP were defined, C++ did not exist in the form you enjoy today. So no, it couldn't really be done differently. But anyway, Rcpp allows us all to hide the tediousness behind a nicer access layer. – Dirk Eddelbuettel Nov 07 '13 at 22:28
  • @eddi I've added an alternative using dplyr's vector visitor. Might be what you want. – Romain Francois Nov 08 '13 at 06:11
  • Thanks Romain. I'll take a look at dplyr, but independently of that I suggest adding arbitrary number of arguments to your `RCPP_RETURN_VECTOR` macro (which I had to do for my real use case). – eddi Nov 08 '13 at 13:37
  • Please submit a pull request here: https://github.com/RcppCore/Rcpp or send us a patch or start a discussion in our issue tracker: https://github.com/RcppCore/Rcpp/issues – Romain Francois Nov 08 '13 at 14:47
  • @Romain: That is nice. But why does it need `delete` when there is no `new`? Does the ctor not do RAII? That looks different from the standard paradigms. – Dirk Eddelbuettel Nov 08 '13 at 15:14
  • Oh, found it at the bottom of `VectorVisitorImpl.h`. Hm. Could one not wrap something scoped around it? Requiring users to add delete feels a little old fashioned. But I presume you thought that through... – Dirk Eddelbuettel Nov 08 '13 at 15:24
  • Sure. In places I'm using `boost::scoped_ptr`. – Romain Francois Nov 08 '13 at 16:08
  • FR added, and I took a quick look at `dplyr` and at first sight that looks much better than the default Rcpp options - I'll play around more with that – eddi Nov 08 '13 at 16:15
  • @eddi How did you add the capability to take an arbitrary number of arguments to `RCPP_RETURN_VECTOR`? I'm very interested in doing this myself but have VERY little c++ experience. – stanekam Sep 22 '14 at 03:57

You need to pick a type (i.e. do not use signature(x = "SEXP"); you should look into Rcpp Attributes anyway).

Or you keep the SEXP type, and dispatch internally. See for example this post on the Rcpp Gallery.

Edit: And C is of course statically typed. These very switches depending on the type are all over the R sources too. No free lunch here.
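
The kind of switch meant here can be sketched in plain C++ without R or Rcpp. Everything in the block is a hypothetical model: `Tag` and `AnyVector` stand in for a SEXP and its type code (in real R code you would branch on `TYPEOF(x)` with cases such as `INTSXP`, `REALSXP` and `STRSXP`), and `first_two` is an illustrative name, not a library function.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a SEXP: a runtime tag plus storage per type.
// A real SEXP is a pointer to an R object; this is only a model.
enum class Tag { Integer, Real, String };

struct AnyVector {
    Tag tag;
    std::vector<int> ints;
    std::vector<double> reals;
    std::vector<std::string> strings;
};

// Typed implementation, written once as a template.
template <typename T>
std::vector<T> first_two_impl(const std::vector<T>& xin) {
    assert(xin.size() >= 2);  // the caller must supply at least two elements
    return std::vector<T>(xin.begin(), xin.begin() + 2);
}

// Untyped entry point: branch on the runtime tag, then call the typed code.
AnyVector first_two(const AnyVector& x) {
    AnyVector out;
    out.tag = x.tag;
    switch (x.tag) {
    case Tag::Integer: out.ints = first_two_impl(x.ints); break;
    case Tag::Real:    out.reals = first_two_impl(x.reals); break;
    case Tag::String:  out.strings = first_two_impl(x.strings); break;
    }
    return out;
}
```

The branch cannot be avoided, only moved: the template keeps the typed logic in one place, and the switch is the single spot where the runtime type is inspected.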

Dirk Eddelbuettel
  • I don't know the type before hand. I'm learning to use Rcpp and tbh find it surprising that I'd need to do the whole if/else construction for something as simple as the above - my main reason for attempting to use Rcpp is to *not* do that stuff :-\ – eddi Nov 06 '13 at 22:36
  • You seem to overlook the fact that C++ is a strongly typed language. – Dirk Eddelbuettel Nov 06 '13 at 22:36
  • I just noticed Rcpp sugar mentions having a `head` function, so perhaps this *is* doable without if/else's? – eddi Nov 06 '13 at 22:39
  • That uses template programming which is a lot more involved. – Dirk Eddelbuettel Nov 06 '13 at 22:43
  • I like involved... ? – eddi Nov 06 '13 at 22:48
  • A similar post that uses a template and a wrapper function, rather than in-line dispatch: http://gallery.rcpp.org/articles/fast-factor-generation/. – Kevin Ushey Nov 07 '13 at 00:36