We are trying to run parallelized code in R using the ResistanceGA package, which calls the doParallel package. The machine has a very large amount of memory, so memory should not be the issue.

This is the error we get:

Error in serialize(data, node$con, xdr = FALSE) : ignoring SIGPIPE signal
Error in serialize(data, node$con, xdr = FALSE) :
  error writing to connection

Here is a reproducible example, copied from the tutorial, that triggers the issue on our particular setup:

write.dir <- tempdir() # replace with the path to your own output directory
library(ResistanceGA)
data(resistance_surfaces)
data(samples)
sample.locales <- SpatialPoints(samples[, c(2, 3)])
r.stack <- stack(resistance_surfaces$categorical, resistance_surfaces$continuous, resistance_surfaces$feature)
GA.inputs <- GA.prep(ASCII.dir = r.stack, Results.dir = write.dir, method = "LL", max.cat = 500, max.cont = 500, seed = 555, parallel = 4)
gdist.inputs <- gdist.prep(length(sample.locales), samples = sample.locales, method = "commuteDistance")
PARM <- c(1, 250, 75, 1, 3.5, 150, 1, 350)
Resist <- Combine_Surfaces(PARM = PARM, gdist.inputs = gdist.inputs, GA.inputs = GA.inputs, out = NULL, rescale = TRUE)
gdist.response <- Run_gdistance(gdist.inputs = gdist.inputs, r = Resist)
gdist.inputs <- gdist.prep(n.Pops = length(sample.locales), samples = sample.locales, response = as.vector(gdist.response), method = "commuteDistance")
Multi.Surface_optim <- MS_optim(gdist.inputs = gdist.inputs, GA.inputs = GA.inputs)

Session info:

R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ResistanceGA_4.1-0.46 raster_3.4-10         sp_1.4-5

loaded via a namespace (and not attached):
 [1] jsonlite_1.7.2        splines_4.0.5         foreach_1.5.1
 [4] gtools_3.8.2          shiny_1.6.0           expm_0.999-6
 [7] stats4_4.0.5          spatstat.geom_2.1-0   LearnBayes_2.15.1
[10] pillar_1.6.1          lattice_0.20-44       glue_1.4.2
[13] digest_0.6.27         promises_1.2.0.1      polyclip_1.10-0
[16] minqa_1.2.4           colorspace_2.0-1      MuMIn_1.43.17
[19] htmltools_0.5.1.1     httpuv_1.6.1          Matrix_1.3-3
[22] plyr_1.8.6            spatstat.sparse_2.0-0 JuliaCall_0.17.4
[25] pkgconfig_2.0.3       gmodels_2.18.1        purrr_0.3.4
[28] xtable_1.8-4          spatstat.core_2.1-2   scales_1.1.1
[31] gdata_2.18.0          tensor_1.5            XR_0.7.2
[34] later_1.2.0           spatstat.utils_2.1-0  lme4_1.1-27
[37] proxy_0.4-25          tibble_3.1.2          mgcv_1.8-35
[40] generics_0.1.0        ggplot2_3.3.3         ellipsis_0.3.2
[43] XRJulia_0.9.0         cli_2.5.0             magrittr_2.0.1
[46] crayon_1.4.1          mime_0.10             deldir_0.2-10
[49] fansi_0.4.2           doParallel_1.0.16     nlme_3.1-152
[52] MASS_7.3-54           class_7.3-19          tools_4.0.5
[55] lifecycle_1.0.0       munsell_0.5.0         e1071_1.7-6
[58] gdistance_1.3-6       akima_0.6-2.1         compiler_4.0.5
[61] rlang_0.4.11          units_0.7-1           classInt_0.4-3
[64] grid_4.0.5            nloptr_1.2.2.2        iterators_1.0.13
[67] goftest_1.2-2         igraph_1.2.6          miniUI_0.1.1.1
[70] boot_1.3-28           GA_3.2.1              gtable_0.3.0
[73] codetools_0.2-18      abind_1.4-5           DBI_1.1.1
[76] R6_2.5.0              knitr_1.33            dplyr_1.0.6
[79] fastmap_1.1.0         utf8_1.2.1            ggExtra_0.9
[82] spdep_1.1-7           KernSmooth_2.23-20    spatstat.data_2.1-0
[85] parallel_4.0.5        Rcpp_1.0.6            vctrs_0.3.8
[88] sf_0.9-8              rpart_4.1-15          coda_0.19-4
[91] spData_0.3.8          tidyselect_1.1.1      xfun_0.23

We have tried reinstalling everything with different versions, to no avail. The same code runs without errors on Windows.

Julian Wittische
  • That error strongly suggests that a parallel R worker has crashed, terminated, or segfaulted. If the code works on MS Windows but not on Linux or macOS, that strongly suggests it cannot run with _forked_ parallel processing. To get the same type of workers as on Windows, use `cl <- parallel::makeCluster(4); registerDoParallel(cl)`. Then retry. – HenrikB May 26 '21 at 01:23
  • @HenrikB Thank you very much for the help. I tried what you suggested and got a slightly different error: `Error in serialize(data, node$con, xdr = FALSE) : error writing to connection Error in serialize(data, node$con, xdr = FALSE) : error writing to connection` – Julian Wittische May 26 '21 at 03:44
  • Ok, that error suggests that your parallel workers are no longer alive. So, it might be that you've just been lucky that it worked on MS Windows (e.g. slightly more memory, ..., unknown factor). Regardless, the code sounds unstable for parallelization, and it's likely not your fault. To narrow this down further, try `library(doFuture); registerDoFuture(); plan(multisession, workers=ncores); options(future.globals.onReference="error")` instead. That will use the exact same type of parallel backend (PSOCK workers), but you should get a more informative error message. – HenrikB May 26 '21 at 18:16
  • @HenrikB I tried running the code you provided instead of the previous one, before the analysis code, but unfortunately, I got the same error message. – Julian Wittische May 28 '21 at 08:00
  • Is the hardware on the Ubuntu machine dedicated to your user? When I run parallelized code on a shared Ubuntu server, I tend to run into serialize errors. The server provider guarantees access to a given number of CPUs and amount of RAM at all times. However, my guess is that the specific CPUs assigned are not constant. If the server moves CPUs between users, that may impede thread stability and break the parallel process. Unfortunately, the frequency of these serialize errors has increased over the past few years. Updates to `parallel` and the underlying `snow` that deal with this, as other programs do, are long overdue. – Chr Jun 01 '21 at 20:07

1 Answer

This seems to be because registerGA does not work with forked parallel processing. The way it is implemented, or rather the way `GA:::startParallel()` is implemented, you get forked parallel processing on Unix or macOS and PSOCK-based parallel processing on MS Windows. Passing an explicitly created PSOCK cluster as the `parallel` argument, instead of a number of cores, avoids the forked workers.
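To illustrate what "PSOCK-based" means here, this is a minimal sketch that uses only base R's parallel package and does not involve ResistanceGA at all. The worker count of 2 and the toy computation are arbitrary choices for illustration. It shows that each PSOCK worker is a separate, freshly started R session with its own process ID, rather than a fork of the current session.

```r
library(parallel)

# Create two PSOCK workers explicitly. On Unix-alikes, type = "FORK" would
# create forked workers instead; "FORK" is not available on Windows.
cl <- makeCluster(2L, type = "PSOCK")

# Each PSOCK worker is an independent R process with its own PID
worker_pids <- unlist(clusterCall(cl, Sys.getpid))

# A trivial parallel computation to confirm that the workers respond
squares <- unlist(parLapply(cl, 1:4, function(x) x^2))

# Shut the workers down when done
stopCluster(cl)
```

Because the workers start as fresh sessions, they do not inherit the state of the main session the way forked workers do.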

The following works on R 4.1.0 on Linux:

## Not on CRAN (https://github.com/wpeterman/ResistanceGA)
library(ResistanceGA)

## Use PSOCK background workers for parallel processing
parallel <- parallel::makeCluster(4L)

write.dir <- tempdir()
data(resistance_surfaces)
data(samples)
sample.locales <- SpatialPoints(samples[,c(2,3)])

r.stack <- stack(resistance_surfaces$categorical, resistance_surfaces$continuous, resistance_surfaces$feature)

GA.inputs <- GA.prep(ASCII.dir = r.stack, Results.dir = write.dir, method = "LL", max.cat = 500, max.cont = 500, seed = 555, parallel = parallel)

gdist.inputs <- gdist.prep(length(sample.locales), samples = sample.locales, method = "commuteDistance")

PARM <- c(1, 250, 75, 1, 3.5, 150, 1, 350)

Resist <- Combine_Surfaces(PARM = PARM, gdist.inputs = gdist.inputs, GA.inputs = GA.inputs, out = NULL, rescale = TRUE)

gdist.response <- Run_gdistance(gdist.inputs = gdist.inputs, r = Resist)

gdist.inputs <- gdist.prep(n.Pops = length(sample.locales), samples = sample.locales, response = as.vector(gdist.response), method = "commuteDistance")

Multi.Surface_optim <- MS_optim(gdist.inputs = gdist.inputs, GA.inputs = GA.inputs)
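
One thing the script above does not do is shut the PSOCK workers down afterwards; they stay alive until `parallel::stopCluster()` is called or the main R session ends. A minimal sketch of one way to guarantee cleanup, assuming a hypothetical helper named `with_psock_cluster()` (not part of ResistanceGA, GA, or parallel) and a toy computation in place of the optimization:

```r
library(parallel)

# Hypothetical helper (an assumption for illustration): runs `fun` with a
# temporary PSOCK cluster and always stops the workers, even if `fun` fails.
with_psock_cluster <- function(n_workers, fun) {
  cl <- makeCluster(n_workers, type = "PSOCK")
  on.exit(stopCluster(cl), add = TRUE)
  fun(cl)
}

# Toy stand-in for the real work: multiply 1:3 by 10 on the workers
result <- with_psock_cluster(2L, function(cl) {
  unlist(parLapply(cl, 1:3, function(i) i * 10))
})
```

In the script above, the equivalent would be to run the `GA.prep(..., parallel = cl)` and `MS_optim()` calls inside the function passed to the helper, or simply to call `parallel::stopCluster(parallel)` once the optimization has finished.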
HenrikB