14

Suppose I have a vector, vec, which is long (starting at 1E8 entries) and would like to bound it to the range [a,b]. I can certainly code vec[vec < a] = a and vec[vec > b] = b, but this requires two passes over the data and a large RAM allocation for the temporary indicator vector (~800MB, twice). The two passes burn time because we can do better if we copy data from main memory to the local cache just once (calls to main memory are bad, as are cache misses). And who knows how much this could be improved with multiple threads, but let's not get greedy. :)

Is there a nice implementation in base R or some package that I'm overlooking, or is this a job for Rcpp (or my old friend data.table)?

Iterator
  • 20,250
  • 12
  • 75
  • 111

2 Answers2

13

A naive C solution is

library(inline)

fun4 <-
    cfunction(c(x="numeric", a="numeric", b="numeric"), body4,
              language="C")
body4 <- "
    R_len_t len = Rf_length(x);
    SEXP result = Rf_allocVector(REALSXP, len);
    const double aa = REAL(a)[0], bb = REAL(b)[0], *xp = REAL(x);
    double *rp = REAL(result);

    for (int i = 0; i < len; ++i)
        if (xp[i] < aa)
            rp[i] = aa;
        else if (xp[i] > bb)
            rp[i] = bb;
        else
            rp[i] = xp[i];

    return result;
"
fun4 <-
    cfunction(c(x="numeric", a="numeric", b="numeric"), body4,
              language="C")

With a simple parallel version (as Dirk points out, this is with CFLAGS = -fopenmp in ~/.R/Makevars, and on a platform / compiler supporting openmp)

body5 <- "
    R_len_t len = Rf_length(x);
    const double aa = REAL(a)[0], bb = REAL(b)[0], *xp = REAL(x);
    SEXP result = Rf_allocVector(REALSXP, len);
    double *rp = REAL(result);

#pragma omp parallel for
    for (int i = 0; i < len; ++i)
        if (xp[i] < aa)
            rp[i] = aa;
        else if (xp[i] > bb)
            rp[i] = bb;
        else
            rp[i] = xp[i];

    return result;
"
fun5 <-
    cfunction(c(x="numeric", a="numeric", b="numeric"), body5,
              language="C")

And benchmarks

> z <- runif(1e7)
> benchmark(fun1(z,0.25,0.75), fun4(z, .25, .75), fun5(z, .25, .75),
+           replications=10)
                 test replications elapsed  relative user.self sys.self
1 fun1(z, 0.25, 0.75)           10   9.087 14.609325     8.335    0.739
2 fun4(z, 0.25, 0.75)           10   1.505  2.419614     1.305    0.198
3 fun5(z, 0.25, 0.75)           10   0.622  1.000000     2.156    0.320
  user.child sys.child
1          0         0
2          0         0
3          0         0
> identical(res1 <- fun1(z,0.25,0.75), fun4(z,0.25,0.75))
[1] TRUE
> identical(res1, fun5(z, 0.25, 0.75))
[1] TRUE

on my quad-core laptop. Assumes numeric input, no error checking, NA handling, etc.

Martin Morgan
  • 45,935
  • 7
  • 84
  • 112
  • +1 I'd like this function in core R, called something like `clamp(x, low, high)`... Well one can always wish, right ;-) – Tommy May 07 '12 at 07:19
  • +1 for OpenMP, but I think you need to modify PKG_CFLAGS etc to get `-fopenmp`. Or did you do so somewhere else, eg in `~/.R/Makevars` ? – Dirk Eddelbuettel May 07 '12 at 12:00
  • @DirkEddelbuettel R's configure.ac detects OpenMP; `-fopenmp` is set in R_HOME/etc/Makeconf. – Martin Morgan May 07 '12 at 12:40
  • Not on my machine, I get `warning: ignoring #pragma omp parallel [-Wunknown-pragmas]` from your example. Even though I have `-fopenmp` in `/etc/R/Makeconf` (which is a symlink to the location below `R_HOME`). – Dirk Eddelbuettel May 07 '12 at 12:54
  • @DirkEddelbuettel yes you're right, ~/.R/Makevars contains `CFLAGS = -fopenmp` – Martin Morgan May 07 '12 at 14:00
3

Just to start things off: not much difference between your solution and the pmin/pmax solution (trying things out with n=1e7 rather than n=1e8 because I'm impatient) -- pmin/pmax is actually marginally slower.

fun1 <- function(x,a,b) {x[x<a] <- a; x[x>b] <- b; x}
fun2 <- function(x,a,b) pmin(pmax(x,a),b)
library(rbenchmark)
z <- runif(1e7)

benchmark(fun1(z,0.25,0.75),fun2(z,0.25,0.75),rep=50)

                 test replications elapsed relative user.self sys.self
1 fun1(z, 0.25, 0.75)           10  21.607  1.00000     6.556   15.001
2 fun2(z, 0.25, 0.75)           10  23.336  1.08002     5.656   17.605
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • Interesting. I was hopeful that that would be faster, but it seems no such luck. – Iterator May 06 '12 at 23:17
  • `fun2` is about 20% faster for me on R version 2.15.0 Patched (2012-05-01 r59304) Platform: x86_64-unknown-linux-gnu (64-bit) compiled with CFLAGS=-O0; the hack `.Internal(pmin(FALSE, x, a))` etc is about 30% faster than `fun1`. – Martin Morgan May 07 '12 at 02:29