I am asking for help with this example. It might not be related to the pbapply
package but rather to mclapply()
. However, maybe someone has an idea whats going on here.
Problem
The only difference is the use of pblapply()
/mclapply()
instead of lapply()
(I'm on macOS). In this example, the pblapply()
approach returns NULL
while the lapply()
attempt works just fine.
The interesting point is that it only behaves like in this example in which I use lda()
. Other examples using glm()
or other modelling methods work fine.
It seems that the parallel apply call does not even start here (based on how quickly it finishes).
sperrorest()
is a generic framework (and therefore should not be the problem here) and I am confused why the example works using the sequential approach (lapply()
) but not with the parallel one (pblapply()
).
Also, doing the same with foreach()
instead of using an apply* function works fine. Hence, I assume that it is unrelated to lda()
and related to the parallel apply* call.
Notes:
The example runs on two cores so it should run on any machine.
First, please do devtools::install_github("pat-s/sperrorest@mclapply-vs-lapply")
.
Code (reproducible)
library(MASS)
library(sperrorest)
library(parallel)
library(pbapply)
currentSample <- partition.cv(maipo, nfold = 4)
currentSample[[2]] <- partition.cv(maipo, nfold = 4)[[1]]
currentRes <- currentSample
lda.predfun <- function(object, newdata, fac = NULL) {
library(nnet)
majority <- function(x) {
levels(x)[which.is.max(table(x))]
}
majority.filter <- function(x, fac) {
for (lev in levels(fac)) {
x[ fac == lev ] <- majority(x[ fac == lev ])
}
x
}
pred <- predict(object, newdata = newdata)$class
if (!is.null(fac)) pred <- majority.filter(pred, newdata[,fac])
return(pred)
}
data("maipo", package = "sperrorest")
predictors <- colnames(maipo)[5:ncol(maipo)]
fo <- as.formula(paste("croptype ~", paste(predictors, collapse = "+")))
# pblapply attempt (not working)
runreps_res <- pblapply(cl = 2, currentSample, function(X)
runreps(currentSample = X, data = maipo,
formula = fo, par.mode = 1, pred.fun = lda.predfun,
do.try = FALSE, model.fun = lda,
error.fold = TRUE, error.rep = TRUE, do.gc = 1,
err.train = TRUE, importance = FALSE, currentRes = currentRes,
pred.args = list(fac = "field"), response = "croptype", par.cl = 2,
coords = c("x", "y"), progress = 1, pooled.obs.train = c(),
pooled.obs.test = c(), err.fun = err.default))
# mclapply attempt (not working)
runreps_res <- mclapply(mc.cores = 2, currentSample, function(X)
runreps(currentSample = X, data = maipo,
formula = fo, par.mode = 1, pred.fun = lda.predfun,
do.try = FALSE, model.fun = lda,
error.fold = TRUE, error.rep = TRUE, do.gc = 1,
err.train = TRUE, importance = FALSE, currentRes = currentRes,
pred.args = list(fac = "field"), response = "croptype", par.cl = 2,
coords = c("x", "y"), progress = 1, pooled.obs.train = c(),
pooled.obs.test = c(), err.fun = err.default))
# lapply attempt (working)
runreps_res <- lapply(currentSample, function(X)
runreps(currentSample = X, data = maipo,
formula = fo, par.mode = 1, pred.fun = lda.predfun,
do.try = FALSE, model.fun = lda,
error.fold = TRUE, error.rep = TRUE, do.gc = 1,
err.train = TRUE, importance = FALSE, currentRes = currentRes,
pred.args = list(fac = "field"), response = "croptype", par.cl = 2,
coords = c("x", "y"), progress = 1, pooled.obs.train = c(),
pooled.obs.test = c(), err.fun = err.default))