I am trying to optimise code I have written that uses apply() and similar functions (e.g. lapply()). Unfortunately I do not see much improvement, so while searching I came across the post "apply() is slow - how to make it faster or what are my alternatives?", where one suggestion is to use with() instead of apply(), which is certainly much faster.
What I want to do is apply a user-defined function to every row of a matrix. The function takes the data from the row as input, performs some calculations and returns a vector with the results. Here is a toy example where I use apply(), with() and a vectorised version:
#Generate a matrix 10x3
prbl1=matrix(runif(30),nrow=10)
prbl2=data.frame(prbl1)
prbl3=prbl2
#function for the apply()
fn1=function(row){
x=row[1]
y=row[2]
z=row[3]
k1=2*x+3*y+4*z
k2=2*x*3*y*4*z
k3=2*x*y+3*x*z
return(c(k1,k2,k3))
}
#function for the with()
fn2=function(x,y,z){
k1=2*x+3*y+4*z
k2=2*x*3*y*4*z
k3=2*x*y+3*x*z
return(c(k1,k2,k3))
}
#Vectorise fn2
fn3=Vectorize(fn2)
#apply the functions:
rslt1=t(apply(prbl1,1,fn1))
rslt2=t(with(prbl2,fn2(X1,X2,X3)))
rslt2=cbind(rslt2[1:10],rslt2[11:20],rslt2[21:30])
rslt3=t(with(prbl3,fn3(X1,X2,X3)))
All three produce the same output, a 10x3 matrix, which is what I want. Note, however, that for rslt2 I need to bind the results afterwards, because the output of with() is a vector of length 30 rather than a matrix. I suspect this is because the function is not vectorised (if I understand this correctly). For rslt3 I use a vectorised version of fn2, which generates the output in the expected form.
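To illustrate where the length-30 vector comes from, here is a minimal sketch. It assumes the flat output arises because c() simply concatenates the three length-10 results k1, k2 and k3 when fn2 receives whole columns. The names m, df, flat, reshaped and by_row, and the fixed seed, are illustrative choices and not part of the code above.

```r
set.seed(1)                          # assumption: fixed seed so the run is repeatable
m  <- matrix(runif(30), nrow = 10)   # same shape as prbl1
df <- data.frame(m)                  # columns get the default names X1, X2, X3

# Row-wise version, as used with apply()
fn1 <- function(row) {
  x <- row[1]; y <- row[2]; z <- row[3]
  c(2*x + 3*y + 4*z, 2*x * 3*y * 4*z, 2*x*y + 3*x*z)
}

# Column-wise version, as used with with()
fn2 <- function(x, y, z) {
  k1 <- 2*x + 3*y + 4*z
  k2 <- 2*x * 3*y * 4*z
  k3 <- 2*x*y + 3*x*z
  c(k1, k2, k3)                      # concatenates three length-10 vectors
}

flat     <- with(df, fn2(X1, X2, X3))
reshaped <- matrix(flat, ncol = 3)   # k1, k2, k3 become columns 1, 2, 3
by_row   <- t(apply(m, 1, fn1))

length(flat)                         # number of rows times 3
isTRUE(all.equal(reshaped, by_row))  # both routes should agree
```

Since the flat vector is laid out as k1, then k2, then k3, matrix(flat, ncol = 3) gives the same result as binding the three slices with cbind().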
When I compare the performance of the three, I get:
library(rbenchmark)
benchmark(rslt1=t(apply(prbl1,1,fn1)),
rslt2=with(prbl2,fn2(X1,X2,X3)),
rslt3=with(prbl3,fn3(X1,X2,X3)),
replications=1000000)
   test replications elapsed relative user.self sys.self user.child sys.child
1 rslt1      1000000  103.51    7.129    102.63     0.02         NA        NA
2 rslt2      1000000   14.52    1.000     14.41     0.01         NA        NA
3 rslt3      1000000  123.44    8.501    122.41     0.05         NA        NA
Here with() without Vectorize() is clearly the fastest: roughly 7 times faster than apply() and 8.5 times faster than the Vectorize() version.
My question: since rslt2 is the most efficient approach, is there a way to use it correctly without having to bind the results afterwards? It does the job, but I feel it is not efficient coding.
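One direction that might avoid the extra binding step is sketched below. It assumes the function body may be changed so that it returns a matrix itself, using cbind() in place of c(). The name fn2_mat and the variables df and res are hypothetical, and I have not benchmarked this variant.

```r
# Hypothetical variant of fn2 that returns one column per result
fn2_mat <- function(x, y, z) {
  cbind(k1 = 2*x + 3*y + 4*z,
        k2 = 2*x * 3*y * 4*z,
        k3 = 2*x*y + 3*x*z)
}

df  <- data.frame(matrix(runif(30), nrow = 10))  # columns X1, X2, X3
res <- with(df, fn2_mat(X1, X2, X3))             # matrix, no reshaping needed

dim(res)        # rows of df by 3 result columns
colnames(res)   # "k1" "k2" "k3"
```

Whether this keeps the speed advantage of rslt2 would need to be checked with the same benchmark() call.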