
I don't understand how to do indirect subscripting inside foreach %dopar% or llply(.parallel = TRUE). My actual use case is a list of formulas: a first foreach %dopar% generates a list of glmer results, and a separate foreach %dopar% then calls PBmodcomp on specific pairs of those results. My toy example, which uses numeric indices rather than the names of objects in the lists, works with %do% but not %dopar%, and works with alply without .parallel = TRUE but not with it. [My real example, with glmer and lists indexed by name rather than by integer, works with %do% but not %dopar%.]

library(doParallel)
library(foreach)
library(plyr)
cl <- makePSOCKcluster(2)  # tiny for toy example
registerDoParallel(cl)

mB <- c(1,2,1,3,4,10)
MO <- c("Full", "noYS", "noYZ", "noYSZS", "noS", "noZ", 
        "noY", "justS", "justZ", "noSZ", "noYSZ")

# Works
testouts <- foreach(i = 1:length(mB)) %do% {
#                  mB[i]
                  MO[mB[i]]
                  }
testouts
# all NA
testouts2 <- foreach(i = 1:length(mB)) %dopar% {
#                  mB[i]
                  MO[mB[i]]
                  }
testouts2  
# Works
testouts3 <- alply(mB, 1, .fun = function(i) { MO[mB[i]]} )
testouts3
# fails "$ operator is invalid for atomic vectors"
testouts4 <- alply(mB, 1, .fun = function(i) { MO[mB[i]]},              
                  .parallel = TRUE,
                  .paropts = list(.export=ls(.GlobalEnv)))
testouts4
stopCluster(cl)

I've tried various combinations of double brackets, such as MO[mB[[i]]], to no avail. Returning mB[i] instead of MO[mB[i]] works in all four cases and gives a list of the numbers. I've also tried .export = c("MO", "mB"), but that only produces a message that those objects are already exported.

I assume that there's something I misunderstand about evaluation of expressions like MO[mB[i]] in different environments, but there may be other things I misunderstand, too.

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] plyr_1.8.4        doParallel_1.0.13 iterators_1.0.9   foreach_1.5.0

loaded via a namespace (and not attached):
[1] compiler_3.5.1   tools_3.5.1      listenv_0.7.0    Rcpp_0.12.17
[5] codetools_0.2-15 digest_0.6.15    globals_0.12.1   future_1.8.1
[9] fortunes_1.5-5

tom 2

1 Answer


The problem appears to be with version 1.5.0 of foreach from R-Forge. Version 1.4.4 from CRAN works fine for both foreach %dopar% and llply(.parallel = TRUE). For anyone who finds this post when searching for %dopar% with lists, here's the code, where mList is a named list of formulas and tList is a named list of pairs of model names to be compared.

tList <- list(Z1 = c("Full", "noYZ"),
              Z2 = c("noYS", "noYSZS"),
              S1 = c("Full", "noYS"),
              S2 = c("noYZ", "noYSZS"),
              A1 = c("noYSZS", "noY"),
              A2 = c("noSZ", "noYSZ")
             )
cl <- makePSOCKcluster(params$nCores) # nCores comes from the YAML params
registerDoParallel(cl)

# first run the models
modouts <- foreach(imod = seq_along(mList), 
                   .packages = "lme4") %dopar% {
                   glmer(as.formula(mList[[imod]]),
                         data = dsn, 
                         family = poisson,
                         control = glmerControl(optimizer = "bobyqa",
                                            optCtrl = list(maxfun = 100000),
                                            check.conv.singular = "warning")
                      )
                  }
names(modouts) <- names(mList)

####
# now run the parametric bootstrap tests
nSim <- 500
testouts <- foreach(i = seq_along(tList),
                    .packages = "pbkrtest") %dopar% {
                     PBmodcomp(modouts[[tList[[i]][1]]],
                               modouts[[tList[[i]][2]]],
                               nsim = nSim)
                    }
names(testouts) <- names(tList)

stopCluster(cl)
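
If you suspect the same version problem, one way to cross-check is to run the toy example from the question with the base parallel package alone, so that foreach is not involved in shipping MO and mB to the workers. This is only a minimal sketch: the names expected and check are mine, and passing MO and mB as explicit arguments to parLapply is an assumption about what makes a fair comparison, not something taken from the foreach documentation.

```r
library(parallel)

# Toy data from the question.
mB <- c(1, 2, 1, 3, 4, 10)
MO <- c("Full", "noYS", "noYZ", "noYSZS", "noS", "noZ",
        "noY", "justS", "justZ", "noSZ", "noYSZ")

# Report the installed foreach version, if the package is installed at all.
# (1.4.4 from CRAN worked for me; 1.5.0 from R-Forge did not.)
if (requireNamespace("foreach", quietly = TRUE)) {
  message("foreach version: ", as.character(packageVersion("foreach")))
}

# Sequential reference result for the indirect subscripting.
expected <- as.list(MO[mB])

cl <- makePSOCKcluster(2)

# MO and mB are passed as explicit arguments, so each worker receives its
# own copy and nothing depends on automatic export of global variables.
check <- parLapply(cl, seq_along(mB),
                   function(i, MO, mB) MO[mB[i]],
                   MO = MO, mB = mB)

stopCluster(cl)

# TRUE when the parallel result matches the sequential one.
identical(check, expected)
```

If this agrees with the sequential result while %dopar% still returns NA, that points at the foreach installation rather than at the subscripting itself.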
tom 2