While R's ifelse is incredibly handy, it does have a particular shortcoming: in the call ifelse(test, yes, no) all elements of yes and no are evaluated, even those that will be thrown away.

This is rather wasteful if you're using it in the middle of a complicated numerical exercise, say in a function that will be fed to integrate, uniroot, optim or whatever. For example, one might have

ifelse(test, f(x, y, z), g(a, b, c))

where f and g are arbitrarily complex or slow functions, possibly involving more nested ifelse's.
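A small sketch of the behaviour described above. The function `f` here is a hypothetical stand-in that counts how many elements it is asked to compute; the counter `n_evaluated` is an illustration device, not part of the original problem.

```r
# Hypothetical "expensive" function: records how many elements it computes.
n_evaluated <- 0
f <- function(x) {
  n_evaluated <<- n_evaluated + length(x)
  sqrt(x)
}

x <- c(1, 4, -9, 16)

# sqrt(-9) warns and yields NaN even though that element is discarded,
# so the warning is suppressed here to keep the demonstration quiet.
res <- suppressWarnings(ifelse(x >= 0, f(x), NA_real_))

n_evaluated  # f was handed all 4 elements, although only 3 results are kept
```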

Has anyone written a replacement for ifelse that only evaluates the elements of yes/no that will be kept? Basically, something along the lines of

out <- test
for(i in seq_along(out))
{
    if(test[i]) out[i] <- f(x[i], y[i], z[i])
    else out[i] <- g(a[i], b[i], c[i])
}

but without the clumsiness/inefficiency of an explicit loop. Is this even possible without getting into the innards of R?

Hong Ooi
  • I guess I *could* use the explicit loop, but then I feel that I'd be following the old saying that a determined programmer can write Fortran in any language.... – Hong Ooi Jan 12 '12 at 07:00
  • How about using apply? Then the ifelse() is evaluated for each element, and likewise the 'yes' and 'no' are only evaluated for the relevant elements, unless my understanding of apply is wrong, but I'm sure that's worked for me in the past. – nzcoops Jan 12 '12 at 07:10
  • I tried using `mapply` with if()-else, and that was 10 times slower than vectorised `ifelse`! – Hong Ooi Jan 12 '12 at 12:20
  • How about returning the `call` and evaluating it later? – James Jan 12 '12 at 12:55
  • If all the values of the test are TRUE, then the FALSE leg is not evaluated, and if all the values of the test are FALSE, then the TRUE leg is not evaluated at all, so the statement in the post is not strictly true. – G. Grothendieck Jan 12 '12 at 18:19

1 Answer

I don't think the problem is ifelse itself: f and g are each called only once in your expression, on the full vectors. I think your problem is that f and g are slow when given those full vectors.

You could change the calls to f and g so that they only get evaluated on a subset of the vector.

out <- numeric(length(test))  # or whatever the output type is
out[test] <- f(x[test], y[test], z[test])      # f sees only the TRUE elements
out[!test] <- g(a[!test], b[!test], c[!test])  # g sees only the FALSE elements

You'll need to adjust this if any elements of test are NA.
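One possible way to handle NA and hide the subsetting is to wrap it in a function. The following is only a sketch: the name `subset_ifelse` and its argument layout are made up for illustration, and it assumes that f and g are vectorised, return one numeric value per input element, and that every argument vector has the same length as test (no recycling).

```r
# Hypothetical helper: call f only where test is TRUE and g only where it is FALSE.
# f_args and g_args are lists of argument vectors, each as long as test.
subset_ifelse <- function(test, f, f_args, g, g_args) {
  out <- rep(NA_real_, length(test))  # positions where test is NA stay NA
  yes <- which(test)                  # which() drops NA, unlike logical indexing
  no  <- which(!test)
  if (length(yes) > 0) out[yes] <- do.call(f, lapply(f_args, `[`, yes))
  if (length(no) > 0)  out[no]  <- do.call(g, lapply(g_args, `[`, no))
  out
}

x <- c(4, -9, 16, NA)
test <- x >= 0  # TRUE FALSE TRUE NA
res <- subset_ifelse(test,
                     sqrt, list(x),                      # used where test is TRUE
                     function(v) -sqrt(-v), list(x))     # used where test is FALSE
```

Using integer positions from which() rather than the logical vector itself means an NA in test never reaches the subscripted assignment, and neither function is called when its subset is empty.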

Richie Cotton
  • If `f` and `g` are at all vectorised, then this is your best option. If not then any kind of loop will do. And if you think that it's ugly code (which is fair enough) then you can always wrap this in a function and hide it in a file that you don't look at very often. – Richie Cotton Jan 12 '12 at 13:23