I maintain a package on CRAN and I'm currently trying to optimize its speed.
The main problem I've found is that the "base" (actually stats) methods for time series are quite slow, especially when both series share the same tsp.
set.seed(1)
a <- ts(rnorm(480), start = 2010, frequency = 12)
b <- ts(rnorm(480), start = 2010, frequency = 12)
library(microbenchmark)
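ts_fastop <- function(x, y, FUN) {
  FUN <- match.fun(FUN)
  tspx <- tsp(x)
  # Fast path only: both series must share the same tsp (start, end, frequency)
  if (any(abs(tspx - tsp(y)) > getOption("ts.eps"))) stop("This method is only made for similar tsp", call. = FALSE)
  # Operate on the bare numeric vectors, then rebuild the ts object once
  ts(FUN(as.numeric(x), as.numeric(y)), start = tspx[1L], frequency = tspx[3L])
}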
identical(ts_fastop(a,b,`+`),a+b)
# [1] TRUE
microbenchmark(ts_fastop(a,b,`+`),a+b,times=1000L)
# Unit: microseconds
# expr min lq mean median uq max neval
# ts_fastop(a, b, `+`) 13.7 15.3 24.1260 17.4 18.9 6666.4 1000
# a + b 364.5 372.5 385.7744 375.6 380.4 7218.4 1000
I think that 380 microseconds for a simple + on two series of 480 values is way too much.
However, since I am bypassing these methods, I wonder what the best practice is:
- if everyone bypasses the base functions, I guess it makes it harder for the R core team to manage upgrades
- the readability of the source is better if it is written a+b than ts_fastop(a,b,`+`)
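On the readability point, here is a minimal sketch of one possible compromise, assuming custom infix operators are acceptable in the package. The operator names `%+%` and `%-%` are hypothetical (they are not part of stats and may clash with operators exported by other packages); `ts_fastop` is repeated from above so that the block runs on its own.

```r
# Fast path from the question, repeated so the sketch is self-contained.
ts_fastop <- function(x, y, FUN) {
  FUN <- match.fun(FUN)
  tspx <- tsp(x)
  if (any(abs(tspx - tsp(y)) > getOption("ts.eps"))) {
    stop("This method is only made for similar tsp", call. = FALSE)
  }
  ts(FUN(as.numeric(x), as.numeric(y)), start = tspx[1L], frequency = tspx[3L])
}

# Hypothetical infix wrappers: the names are illustrative assumptions only.
`%+%` <- function(x, y) ts_fastop(x, y, `+`)
`%-%` <- function(x, y) ts_fastop(x, y, `-`)

set.seed(1)
a <- ts(rnorm(480), start = 2010, frequency = 12)
b <- ts(rnorm(480), start = 2010, frequency = 12)

# The call site now reads almost like a + b.
res_sum <- a %+% b
res_diff <- a %-% b

# Compare with the stats method to check that the shortcut changes nothing.
same_as_base <- identical(res_sum, a + b)
```

One caveat with this design: in R, `%any%` operators bind more tightly than `*` and `/`, so `a %+% b * 2` is parsed as `(a %+% b) * 2`, unlike `a + b * 2`. Mixed expressions would need explicit parentheses.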
So is there any recommended practice for this?
Thanks