I have a custom function that looks like:

f <- function(case, data, variance, limit) {
    data <- subset(data, pid != case$pid)

    # do some stuff

    out <- data.frame(
        pid=case$pid,
        comp=data$pid
    )

    return(data.frame(out[with(out, order(--distances)), ][1:limit,], rank=0:(limit-1)))
    # distances being a matrix
}

It's applied over the rows of a data frame with plyr, like this:

adply(subset(data, pid == 123), 1, f, data=data, variance=var(data.frame(data$foo)), limit=5)

I can't figure out how to migrate this to dplyr using (I guess) mutate(). How can I pass the current row of the data frame to f? I tried this:

subset(data, pid==123) %>% mutate(foo=f(data=data, variance=var(data.frame(data$foo)), limit=5))

But this errors because case is never passed. I'm totally confused.

Wells
  • Two things: (i) inside a `dplyr` chain, you should be using `data=.` (that's a single dot), not `data=data` since that is using the unmodified/original data; (ii) `dplyr` defaults to expecting the first argument to be `data`, so if you redefined your function with that first then you don't need to include it in your call within `mutate`. – r2evans May 21 '15 at 22:31
  • ... and if you're planning on doing this on each row (I'm inferring), then look into using `rowwise()`, which gives you an action similar to `ddply`. – r2evans May 21 '15 at 22:32
  • Any quick example of `rowwise()` with `mutate()` and a custom function that accepts the current row as its first parameter? Or is `do()` the better option? – Wells May 21 '15 at 23:05
  • First, a change: strike (ii) above. Second, sure: define `func <- function(x) x+sample(10,size=1)` and then compare the results of `mtcars %>% mutate(foo=func(carb))` with `mtcars %>% rowwise() %>% mutate(foo=func(carb))`. Besides `data.frame` versus `tbl_df`, you'll see that in the former every `foo` is exactly `carb` plus the same constant, whereas with `rowwise()` the offset changes from row to row. – r2evans May 21 '15 at 23:27
  • BTW: if you want a "working" answer, it would help to see representative data, since your `data` is yet undefined here. – r2evans May 22 '15 at 00:04
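
The comments above can be turned into a rough, runnable sketch. Since the question's `data` is undefined, this uses `mtcars` instead. `func` is the helper from the comment; `nearest` and `car_data` are hypothetical stand-ins for `f` and `data`, with a made-up distance (absolute difference in `wt`) in place of the elided "do some stuff". The second part assumes the key column is unique per row, so `group_by()` on it makes `.` inside `do()` a one-row data frame, much like the `case` argument that `adply(..., 1, f)` supplies.

```r
library(dplyr)

set.seed(1)

# Helper from the comment thread: adds ONE random offset per call.
func <- function(x) x + sample(10, size = 1)

# Without rowwise(): func() is called once on the whole column,
# so every row receives the same offset.
whole <- mtcars %>% mutate(foo = func(carb))

# With rowwise(): func() is called once per row, so the offset varies.
per_row <- mtcars %>% rowwise() %>% mutate(foo = func(carb)) %>% ungroup()

# Hypothetical stand-in for f(): `case` is the current row, `data` is the
# full table. Returns the `limit` other rows closest to `case` by wt.
nearest <- function(case, data, limit) {
  others <- data[data$id != case$id, ]
  others$dist <- abs(others$wt - case$wt)
  others <- others[order(others$dist), ][seq_len(limit), ]
  data.frame(id = case$id, comp = others$id, rank = seq_len(limit) - 1,
             stringsAsFactors = FALSE)
}

# Hypothetical stand-in for the question's `data`, keyed by a unique id.
car_data <- data.frame(id = rownames(mtcars), wt = mtcars$wt,
                       stringsAsFactors = FALSE)

# One group per row (id is unique), so `.` is the current one-row data frame.
# The full table is passed explicitly as `data = car_data`.
result <- car_data %>%
  filter(id %in% c("Fiat 128", "Valiant")) %>%
  group_by(id) %>%
  do(nearest(., data = car_data, limit = 3)) %>%
  ungroup()
```

Note that `mutate()` expects each call to return one value per row, while `f` returns several rows per case, which is why `do()` is used here rather than `mutate()`.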
