Here's one possible approach:
#include <Rcpp.h>
Rcpp::LogicalVector logical_index(Rcpp::IntegerVector idx, R_xlen_t n) {
bool invert = false;
Rcpp::LogicalVector result(n, false);
for (R_xlen_t i = 0; i < idx.size(); i++) {
if (!invert && idx[i] < 0) invert = true;
result[std::abs(idx[i])] = true;
}
if (!invert) return result;
return !result;
}
// [[Rcpp::export]]
Rcpp::NumericVector
Subset(Rcpp::NumericVector x, Rcpp::IntegerVector idx) {
return x[logical_index(idx, x.size())];
}
x <- seq(2, 10, 2)
x[c(2, 4)]
#[1] 4 8
Subset(x, c(1, 3))
#[1] 4 8
x[-c(2, 4)]
#[1] 2 6 10
Subset(x, -c(1, 3))
#[1] 2 6 10
Note that the indices for the Rcpp function are 0-based, as they are processed in C++.
I abstracted the subsetting logic into its own function, logical_index
, which converts an IntegerVector
to a LogicalVector
in order to be able to "decide" whether to drop or keep the specified elements (e.g. by inverting the result). I suppose this could be done with integer-based subsetting as well, but it should not matter either way.
Like vector subsetting in R, a vector of all negative indices means to drop the corresponding elements; whereas a vector of all positive indices indicates the elements to keep. I did not check for mixed cases, which should probably throw an exception, as R will do.
Regarding my last comment, it would probably be more sensible to rely on Rcpp's native overloads for ordinary subsetting, and have a dedicated function for negated subsetting (R's x[-c(...)]
construct), rather than mixing functionality as above. There are pre-existing sugar expressions for creating such a function, e.g.
#include <Rcpp.h>
template <int RTYPE>
inline Rcpp::Vector<RTYPE>
anti_subset(const Rcpp::Vector<RTYPE>& x, Rcpp::IntegerVector idx) {
Rcpp::IntegerVector xi = Rcpp::seq(0, x.size() - 1);
return x[Rcpp::setdiff(xi, idx)];
}
// [[Rcpp::export]]
Rcpp::NumericVector
AntiSubset(Rcpp::NumericVector x, Rcpp::IntegerVector idx) {
return anti_subset(x, idx);
}
/*** R
x <- seq(2, 10, 2)
x[-c(2, 4)]
#[1] 2 6 10
AntiSubset(x, c(1, 3))
#[1] 2 6 10
*/