I am trying to implement a parallel loop in R with foreach, but I am not sure how to make an iteration return its result (to be row-bind appended to the main result dataset) only if a certain condition is met. In other words, in some situations I do not want a particular iteration to return anything. Pseudo-example below:

library(foreach)
library(doParallel)

# keep a handle on the cluster so it can be stopped later
cl <- makeCluster(detectCores() - 1)
registerDoParallel(cl)

final.result <- foreach(i = 1:100, .combine = cbind) %dopar% {
   getResultDS <- functionXYZ()
   ...
   ...
   ...
   # append function result to final.result only if getResultDS[1] > 0
   if (getResultDS[1] > 0) {
      getResultDS
   }
}
...
...
...

Appreciate anyone's input here, thanks!

Aaron Chan
    Assuming `final.result` is a list, why not have unwanted conditions return `NA`, then `final.result[is.na(final.result)] <- NULL`, which I believe will clean up your list to just the returns you want. – JMT2080AD Oct 17 '18 at 23:33

1 Answer

The described behavior can be achieved by defining a .combine function that ignores NA values.

cbind_ignoreNA <- function(...){
    ll <- list(...)
    # keep every element except a bare length-one NA (the "skip" marker)
    keep <- vapply(ll, function(x) !(length(x) == 1 && is.na(x)), logical(1))
    do.call("cbind", ll[keep])
}

Then one can return NA (of length one) if a result from an iteration should not appear in the output of foreach(). In the following example the result of iteration i=2 is ignored:

library(foreach)
library(doParallel); registerDoParallel(2)
test <- foreach(i=1:4, .combine=cbind_ignoreNA) %dopar% {
    if(i==2)
        r <- NA
    else
        r <- i:(i+3)
    r
}
test
      [,1] [,2] [,3]
[1,]    1    3    4
[2,]    2    4    5
[3,]    3    5    6
[4,]    4    6    7
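
The combine step can also be checked without a parallel backend. The following is only a sketch in base R: the list `results` is made-up data standing in for what four iterations might return, and the function is redefined here so the snippet runs on its own. It assumes, as above, that a skipped iteration is marked by a single NA.

```r
# Same idea as cbind_ignoreNA above, repeated so the snippet is self-contained
cbind_ignoreNA <- function(...) {
    ll <- list(...)
    # drop elements that are a bare length-one NA
    keep <- vapply(ll, function(x) !(length(x) == 1 && is.na(x)), logical(1))
    do.call("cbind", ll[keep])
}

# Hypothetical per-iteration results; the second one is the "skip" marker
results <- list(1:4, NA, 3:6, 4:7)

# Combine everything in one call ...
combined <- do.call(cbind_ignoreNA, results)

# ... or pairwise, one result at a time
combined_pairwise <- Reduce(cbind_ignoreNA, results)

dim(combined)
dim(combined_pairwise)
```

Both calls should drop the NA entry and leave three columns. Note that a genuine result consisting of a single NA would be dropped as well, so a different marker is needed if that can occur.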
Nairolf