11

I have a list of functions

funs <- list(fn1 = function(x) x^2,
             fn2 = function(x) x^3,               
             fn3 = function(x) sin(x),
             fn4 = function(x) x+1)
#in reality these are all f = splinefun()

And I have a dataframe:

mydata <- data.frame(x1 = c(1, 2, 3, 2),
                     x2 = c(3, 2, 1, 0),
                     x3 = c(1, 2, 2, 3),
                     x4 = c(1, 2, 1, 2))
#actually a 500x15 dataframe of 500 samples from 15 parameters

For each row i, I would like to evaluate function j on column j and sum the results across the columns:

attach(funs)    # exposes fn1..fn4 by name; a bare unlist(funs) has no effect
attach(mydata)
a <- rep(NA,4)
for (i in 1:4) {
     a[i] <- sum(fn1(x1[i]), fn2(x2[i]), fn3(x3[i]), fn4(x4[i]))
}

How can I do this efficiently? Is this an appropriate occasion to use plyr functions? If so, how?

Bonus question: why is a[4] NA?

Abe
  • @abe for the third code snippet, you need to either `unlist(funs)` and `attach(mydata)` or use `funs$fn1` and `mydata$x1` – David LeBauer Jan 21 '11 at 23:53
  • @David thanks for the correction, I have changed the code to reflect this- but this is exactly the messiness that I would like to avoid. – Abe Jan 22 '11 at 00:00
  • Well, for the bonus point, the answer is that there is no 4th element in mydata$x4 or any of the columns of that dataframe. A further comment .. simply typing unlist(funs) does nothing unless you assign the result to something. Welcome to functional programming. – IRTFM Jan 22 '11 at 00:28
  • Note that `x1[i]` is a data frame, not a vector. You want `x1[[i]]` or `x1[, 1]` – hadley Jan 22 '11 at 01:29
  • @hadley; No, x1[1] is part of an attached data.frame and it is a numeric vector of length 1. `str(x1[1])` returns num 1 – IRTFM Jan 22 '11 at 03:30
  • Oh ooops. This is why I hate attach! – hadley Jan 22 '11 at 03:36
  • You have a typo in `data.frame` definition. Try using `dput` on dummy object you'd like to provide in a post. – aL3xa Jan 22 '11 at 18:51
  • @aL3xa ooops again, fixed - my iphone misplaced the comma while making the last edit. – Abe Jan 22 '11 at 19:11
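
On the bonus question, a minimal sketch of the behaviour IRTFM describes above, assuming (hypothetically) that each column held only three values when the loop was run. The vector below is a stand-in for `mydata$x1`, not the actual data:

```r
# Hypothetical three-element column, standing in for mydata$x1
x1 <- c(1, 2, 3)

# Indexing past the end of a vector returns NA rather than raising an error
out_of_range <- x1[4]

# sum() propagates NA unless na.rm = TRUE, so the whole total becomes NA
total <- sum(out_of_range^2, 5)
is.na(total)
```

With four values per column, as in the data frame shown in the question, the fourth iteration would yield a number instead.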

3 Answers

9

Ignoring your code snippet and sticking to your initial specification that you want to apply function j to column j and then "sum the results", you can do:

mapply( do.call, funs, lapply( mydata, list))
#      [,1] [,2]      [,3] [,4]
# [1,]    1   27 0.8414710    2
# [2,]    4    8 0.9092974    3
# [3,]    9    1 0.9092974    3

I wasn't sure in which direction you wanted to add the results (row-wise or column-wise), so you can apply either rowSums or colSums to this matrix. For example:

colSums( mapply( do.call, funs,  lapply( mydata, list)) )
# [1] 14.000000 36.000000  2.660066  8.000000
MichaelChirico
Prasad Chalasani
  • thanks for this help; I'll use rowSums but this is the concept that I was looking for. – Abe Jan 22 '11 at 06:36
  • I don't understand what the last list does, isn't the second argument to do.call a list of arguments to the function? – Abe Jan 22 '11 at 06:48
  • I edited the second expression above slightly (you don't need to do `as.list` ). You do need to do the `lapply( mydata, list)` to turn `mydata` into a list of lists. Then the `mapply` causes `do.call` to take each function in `funs`, and takes the corresponding list-member of the `lapply(mydata,list)`, which itself is a list. – Prasad Chalasani Jan 22 '11 at 15:54
  • i just had a chance to implement this and the system.time()$elapsed was 0.02 second, down from 2.5 s when implemented as a for loop! Thanks for your help! – Abe Jan 24 '11 at 23:04
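
A sketch of what the `mapply` call in this answer does, assuming the `funs` and `mydata` objects from the question. `lapply(mydata, list)` wraps each column in a one-element list, so `do.call` receives the whole column as the single argument of the matching function. The anonymous-function variant (`m2`) is an equivalent formulation added here for comparison; it is not part of the original answer.

```r
funs <- list(fn1 = function(x) x^2,
             fn2 = function(x) x^3,
             fn3 = function(x) sin(x),
             fn4 = function(x) x + 1)
mydata <- data.frame(x1 = c(1, 2, 3, 2),
                     x2 = c(3, 2, 1, 0),
                     x3 = c(1, 2, 2, 3),
                     x4 = c(1, 2, 1, 2))

# Form used in the answer: do.call(funs[[j]], list(mydata[[j]])) for each j
m1 <- mapply(do.call, funs, lapply(mydata, list))

# Equivalent form: pair function j with column j directly
m2 <- mapply(function(f, x) f(x), funs, mydata)

# One total per row (sample), which is what the question asks for
a <- rowSums(m2)
```

Both forms return a matrix with one row per sample and one column per function, so `rowSums` gives the per-sample totals.
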
4

Why not just write one function for all four and apply it to the data frame? All your functions are vectorized, and so are the functions returned by splinefun, so this will work:

fun <-  function(df)
    cbind(df[, 1]^2, df[, 2]^3, sin(df[, 3]), df[, 4] + 1)

rowSums(fun(mydata))

This is considerably more efficient than "foring" or "applying" over the rows.
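
To illustrate the vectorization point with spline functions, here is a minimal sketch. The knots, the fitted curves, and the names `spline_funs` and `samples` are made up for illustration and are not taken from the question.

```r
# Hypothetical stand-ins for the real spline functions: each one interpolates
# a few known points of a simple curve
knots <- seq(0, 4, by = 0.5)
spline_funs <- list(s1 = splinefun(knots, knots^2),
                    s2 = splinefun(knots, sin(knots)))

# Hypothetical samples: one column per parameter, one row per sample
samples <- data.frame(p1 = c(1, 2, 3),
                      p2 = c(0.5, 1.5, 2.5))

# Each spline function accepts a whole column at once,
# so no loop over the rows is needed
vals <- mapply(function(f, x) f(x), spline_funs, samples)
totals <- rowSums(vals)
```

Because the sample values here coincide with knots, each spline reproduces the fitted values at those points, which makes the result easy to check by hand.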

VitoshKa
0

I tried using plyr::each:

library(plyr)
sapply(mydata, each(min, max))
#     x1 x2 x3 x4
# min  1  0  1  1
# max  3  3  3  2

and it works fine, but when I pass custom functions I get:

sapply(mydata, each(fn1, fn2))
# Error in proto[[i]] <- fs[[i]](x, ...) :
#   more elements supplied than there are to replace

`each` has very brief documentation, and I don't quite see what the problem is.
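
One plausible reading of the error, offered as an assumption rather than a statement about plyr's internals: the message points at `proto[[i]] <- fs[[i]](x, ...)`, which suggests that each function is expected to return a single value per call. `min` and `max` do; `fn1` and `fn2` return one value per element of the column. The same kind of failure can be reproduced in base R (the names `slots`, `col`, and `msg` are hypothetical):

```r
fn1 <- function(x) x^2
col <- c(1, 2, 3, 2)

slots <- numeric(2)       # one slot per function
slots[[1]] <- min(col)    # a single value fits into one slot

# A length-4 result does not fit into a single slot, so the assignment fails
msg <- tryCatch({
  slots[[1]] <- fn1(col)
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

If this reading is right, functions that reduce each column to one number (for example by summing the transformed values) should not trigger the error.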

aL3xa