I have thousands of lists, and each list contains multiple time series. I would like to fit a forecasting model to each series in each list. This has become an intractable problem in terms of computing resources. I don't have a background in parallel computing or advanced R programming. Any help would be greatly appreciated.
I have created a dummy list. dat.list has the same structure as the data I'm working with.
library("snow")
library("plyr")
library("forecast")
## Create dummy data: two list elements, each holding a 100 x 3 monthly series and a lambda
z <- ts(matrix(rnorm(300, 10, 10), 100, 3), start = c(1961, 1), frequency = 12)
lam <- 0.8
ap <- list(z = z, lam = lam)
z <- ts(matrix(rnorm(300, 10, 10), 100, 3), start = c(1971, 1), frequency = 12)
lam <- 0.5
zp <- list(z = z, lam = lam)
dat.list <- list(ap = ap, zp = zp)
## Forecast using nested lapply, timing the run
xa <- proc.time()
tt <- lapply(dat.list, function(x) lapply(x$z, function(y) forecast::ets(y)))
xb <- proc.time()
The above code gives me what I need. I would like to apply parallel processing to both lapply calls in the code above, so I have attempted to use the snow package, following an example shown on this site.
## Parallel Processing
clus <- makeCluster(3)
custom.function <- function(x) lapply(x$z,function(y) (forecast::ets(y)))
clusterExport(clus,"custom.function")
x1 <- proc.time()
tm <- parLapply(clus,dat.list,custom.function)
x2 <- proc.time()
stopCluster(clus)
Below are my questions:
- For some reason, the output of tm is different from the non-parallel version: the forecast function ets is applied to every single data point rather than to each series in the list.
Non parallel:
summary(tt)
   Length Class  Mode
ap 3      -none- list
zp 3      -none- list
Parallel Version:
summary(tm)
   Length Class  Mode
ap 300    -none- list
zp 300    -none- list
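One thing that may be worth checking here, offered as an assumption rather than a confirmed diagnosis: the snow workers start as fresh R sessions without forecast attached, so lapply on a worker may treat the multivariate series as 300 individual values instead of 3 columns. The sketch below tries two hedges at once: it loads forecast on every worker with clusterEvalQ, and it indexes the columns explicitly so the result does not depend on how the series is converted to a list. The helper names make_item and fit.columns are made up for illustration.

```r
library(snow)
library(forecast)

# Dummy data with the same shape as dat.list above (make_item is a hypothetical helper)
set.seed(1)
make_item <- function(start_year, lam) {
  z <- ts(matrix(rnorm(300, 10, 10), 100, 3),
          start = c(start_year, 1), frequency = 12)
  list(z = z, lam = lam)
}
dat.list <- list(ap = make_item(1961, 0.8), zp = make_item(1971, 0.5))

# Fit one ets model per column, indexing the columns explicitly instead of
# relying on lapply() to split the multivariate series
fit.columns <- function(x) {
  lapply(seq_len(ncol(x$z)), function(j) forecast::ets(x$z[, j]))
}

clus <- makeCluster(3)
# Assumption: attaching forecast on each worker makes the worker sessions
# behave like the master session
clusterEvalQ(clus, library(forecast))
tm <- parLapply(clus, dat.list, fit.columns)
stopCluster(clus)

# Number of fitted models per list element
sapply(tm, length)
```

Either change on its own may be enough; the explicit column index is the more defensive of the two because it does not rely on which packages happen to be loaded on a worker.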
My second question is how I should parallelize the inner lapply in the custom function as well, essentially a nested parLapply:
custom.function <- function(x) parLapply(clus,x$z,function(y) (forecast::ets(y))) ## Not working
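As far as I understand, the cluster object clus cannot be used from inside a worker, which would explain why the nested call fails. A possible workaround, sketched here under that assumption, is to flatten both levels into a single list of tasks (one per list element and column), run one parLapply over it, and regroup the results afterwards. The names make_item, tasks, fits, item.of.task and tf are made up for illustration.

```r
library(snow)
library(forecast)

# Dummy data with the same shape as dat.list above (make_item is a hypothetical helper)
set.seed(1)
make_item <- function(start_year, lam) {
  z <- ts(matrix(rnorm(300, 10, 10), 100, 3),
          start = c(start_year, 1), frequency = 12)
  list(z = z, lam = lam)
}
dat.list <- list(ap = make_item(1961, 0.8), zp = make_item(1971, 0.5))

# Build one task per (list element, column) pair so that a single
# parLapply call covers both levels of the original nested lapply
tasks <- do.call(c, lapply(names(dat.list), function(nm) {
  z <- dat.list[[nm]]$z
  lapply(seq_len(ncol(z)), function(j) list(item = nm, y = z[, j]))
}))

clus <- makeCluster(3)
fits <- parLapply(clus, tasks, function(task) forecast::ets(task$y))
stopCluster(clus)

# Regroup the flat results by the list element each series came from
item.of.task <- sapply(tasks, function(task) task$item)
tf <- split(fits, factor(item.of.task, levels = names(dat.list)))
```

With thousands of list elements, a flat task list also gives the workers many small jobs to share, which may balance the load better than one job per list element.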
Many thanks for your help.