
Hello, I am having a problem using foreach and doParallel: my loop gives me an error saying that it cannot find some of my list variables.

I use 4 lists in this loop, all created separately outside it.

  1. models.list: a list of 527 formulas containing all possible combinations of the variables I wish to test.

  2. training.sets: a list of 25 different partitions of my data, used for training.

  3. testdatas: a list of 25 different partitions of my data, used for testing.

  4. testdatas2: a copy of the testdatas list, used to overwrite testdatas when a new loop starts.

Basically, I am writing code that cross-validates each model 25 times. Each run fits the model and predicts whether an area is "hot" (1) or "not" (0). Comparing the prediction with the true value in the test data set gives the number of true positives, false positives, and false negatives. The inner loop passes these counts to the outer loop, which returns the PPV and sensitivity based on the means over all 25 cross-validations.
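To make the metrics concrete, here is a minimal sketch of what I mean by PPV and sensitivity. The vectors `hot` and `predicthot` are made-up toy data (not my real data), named after the columns in my test sets:

```r
# Hypothetical toy data: true labels and predicted labels for six areas
hot        <- c(1, 0, 1, 1, 0, 0)
predicthot <- c(1, 1, 0, 1, 0, 0)

# Count each outcome by comparing prediction with truth
TP <- sum(hot == 1 & predicthot == 1)  # true positives
FP <- sum(hot == 0 & predicthot == 1)  # false positives
FN <- sum(hot == 1 & predicthot == 0)  # false negatives

ppv         <- TP / (TP + FP)  # share of predicted hot areas that are truly hot
sensitivity <- TP / (TP + FN)  # share of truly hot areas that were predicted hot
```

In my real code the counts come from each of the 25 test partitions and are averaged before the two ratios are taken.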

I have tried using the .export argument, but my code still doesn't seem to be able to find models.list[[l]]. It says it doesn't exist.

Any tips or a solution would be massively appreciated! I'm sure there is a lot I could be doing better here, but I've been working on it all day just trying to get it to run. I'm also not sure whether I should be using a foreach inner loop or a normal loop.

I am not sure whether I can somehow bring my lists from the global environment into the local environment, or whether I need to define everything inside the parallel loop (which would be very time-intensive).
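For reference, the kind of explicit export I have in mind would look roughly like the sketch below. It uses `clusterExport` and `parLapply` from the base parallel package rather than foreach, and the three formulas are a hypothetical stand-in for my real models.list. It only illustrates the idea, and I have not confirmed that it fixes my foreach loop:

```r
library(parallel)

# Hypothetical stand-in for my real models.list
models.list <- list(y ~ 1, y ~ x1, y ~ x1 + x2)

cl <- makeCluster(2)

# Copy the named object from the current environment to every worker
clusterExport(cl, varlist = "models.list", envir = environment())

# Each worker looks up models.list by name and reports the element's class
found <- parLapply(cl, seq_along(models.list), function(l) {
  class(models.list[[l]])
})

stopCluster(cl)
```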

library(foreach)
library(doParallel)


clus <- makeCluster(10)
registerDoParallel(clus)
elmc1 <- foreach(l = 1:40,
     .combine = rbind,
     .packages = c("nlme","foreach", "doParallel"),
     .export = c("models.list","training.sets","testdatas","testdatas2")) %dopar% {

        testdatas <- testdatas2

        testdatas.temp <- foreach(k = 1:25,
                      .combine = rbind,
                      .packages = c("nlme","foreach", "doParallel"),
                      .export = c("models.list","training.sets","testdatas","testdatas2")) %do% {

            # Fit model l on training partition k
            x <- lme(models.list[[l]], random = ~I(year-2001)|sa2_name11, data = training.sets[[k]])

            # Population-level (level = 0) predictions on the matching test partition
            testdatas[[k]]$logrrprediction <- predict(x, newdata = testdatas[[k]], level = 0)
            # Fixed-effects design matrix for the test data (formula without its response)
            Designmat <- model.matrix(eval(eval(x$call$fixed)[-2]), testdatas[[k]])
            predvar <- diag(Designmat %*% x$varFix %*% t(Designmat))

            testdatas[[k]]$SE <- sqrt(predvar)
            testdatas[[k]]$SE2 <- sqrt(predvar + x$sigma^2)
            # An area is predicted "hot" when the lower confidence band is above 0
            testdatas[[k]]$lowerconfband <- testdatas[[k]]$logrrprediction - 0.75*testdatas[[k]]$SE2
            testdatas[[k]]$predicthot <- ifelse(testdatas[[k]]$lowerconfband > 0, 1, 0)
            # result: 1 = true positive, 2 = false positive, 3 = false negative, 4 = true negative
            testdatas[[k]]$result <- ifelse((testdatas[[k]]$hot==1) & (testdatas[[k]]$predicthot==1), 1,
                                     ifelse((testdatas[[k]]$hot==0) & (testdatas[[k]]$predicthot==1), 2,
                                     ifelse((testdatas[[k]]$hot==1) & (testdatas[[k]]$predicthot==0), 3,
                                     ifelse((testdatas[[k]]$hot==0) & (testdatas[[k]]$predicthot==0), 4, NA))))
            # Store the count of each outcome in every row of the partition
            testdatas[[k]]$TP <- nrow(testdatas[[k]][testdatas[[k]]$result==1,])
            testdatas[[k]]$FP <- nrow(testdatas[[k]][testdatas[[k]]$result==2,])
            testdatas[[k]]$FN <- nrow(testdatas[[k]][testdatas[[k]]$result==3,])
            testdatas[[k]]$TN <- nrow(testdatas[[k]][testdatas[[k]]$result==4,])

            return(c(
             mean(testdatas[[k]]$TP),
             mean(testdatas[[k]]$FP),
             mean(testdatas[[k]]$FN)))
            }

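    # PPV = TP / (TP + FP) and sensitivity = TP / (TP + FN), from the mean counts over the 25 folds
    return(c(mean(testdatas.temp[,1]) / (mean(testdatas.temp[,1]) + mean(testdatas.temp[,2])),
             mean(testdatas.temp[,1]) / (mean(testdatas.temp[,1]) + mean(testdatas.temp[,3]))))
    }
stopCluster(clus)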
  • Please quote the exact error message. Is this it? "Error in e$fun(obj, substitute(ex), parent.frame(), e$data) : unable to find variable "models.list"" Also, where does it come from in the reproducible example? I don't see it. – Hack-R Oct 04 '17 at 14:45
  • `x <- lme(models.list[[l]], random = ~I(year-2001)|sa2_name11, data=training.sets[[k]])` this is the part it comes from. I can't actually reproduce the error at the minute because the data is source located. But I get the export error message that models.list is already loaded and also one like yours which says models.list object could not be found – Rhys Jevon Oct 04 '17 at 16:06
  • That's not the part that creates the `models.list`, that's what I was referring to. Sorry if my comment was unclear. – Hack-R Oct 04 '17 at 18:06
  • `models.list <- list() for (i in 1:nrow(A.n2)) { if (sum(A.n2[i,])==1) { models.list[[i]] <- formula("logdsr_sa2_sum ~ 1") } else { models.list[[i]] <- formula(paste0("logdsr_sa2_sum ~ ",paste(colnames(X)[as.logical(A.n2[i,2:ncol(A.n2)])],collapse=" + "))) } }` – Rhys Jevon Oct 05 '17 at 07:30
  • Basically models.list is a list of formula strings – Rhys Jevon Oct 05 '17 at 07:38
