I am trying to do parallel processing in R Shiny, where the parallel task is a call to a Python script. However, it does not work: I am not able to fetch the result back from Python into R. Below is the sample R Shiny and Python code.

App.R

library(shiny)
library(reticulate)
library(doParallel)
library(foreach)
ui <- fluidPage(

   # Application title
   titlePanel("Sample Program"),

      mainPanel(
         uiOutput("txtValue")
      )   
)
server <- function(input, output) {

  source_python("../../PythonCode/Multiprocessing/multip.py")  

  cl <- makeCluster(detectCores(), type='PSOCK')
  registerDoParallel(cl)

  result <- foreach(i=1:5) %dopar% fsq(i)
  stopCluster(cl)     
   output$txtValue <- renderUI({
    result   
   }) 

}
shinyApp(ui = ui, server = server)

Python Code (multip.py)

def fsq(x):
    return x**2
jacob mathew
  • Where does `source_python` come from? What do you mean by "it does not work"? – Ralf Stubner Jun 26 '18 at 16:02
  • The Python function call is not executed; it gives an error at the line `result <- foreach(i=1:5) %dopar% fsq(i)` – jacob mathew Jun 26 '18 at 18:05
  • `source_python` references the Python script; it comes from the reticulate package. The error message is "Error in unserialize(socklist[[n]]) : error reading from connection" – jacob mathew Jun 26 '18 at 18:22

1 Answer

The error message is independent of shiny:

library(reticulate)
library(doParallel)
library(foreach)
library(parallel)

source_python("multip.py")  

cl <- makeCluster(detectCores(), type = 'PSOCK')
registerDoParallel(cl)

# throws: Error in unserialize(socklist[[n]]) : error reading from connection
foreach(i = 1:5) %dopar% fsq(i)

stopCluster(cl)     

I interpret this to mean that a Python function cannot be serialized the way an R function can. A simple workaround is to call source_python within the loop:

library(doParallel)
library(foreach)
library(parallel)

cl <- makeCluster(detectCores(), type = 'PSOCK')
registerDoParallel(cl)

foreach(i = 1:5) %dopar% {
  reticulate::source_python("multip.py")  
  fsq(i)
}
stopCluster(cl)     
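
A possible refinement of the workaround above, offered as a sketch rather than a tested recipe: load the Python script once per worker with parallel::clusterEvalQ instead of on every iteration. The variable py_file is a name introduced here for illustration, and multip.py is recreated in a temporary directory only to keep the sketch self-contained. It also assumes fsq has not been sourced in the master session, since foreach might otherwise try to export the master's copy to the workers.

```r
library(doParallel)
library(foreach)
library(parallel)

# Recreate multip.py from the question in a temporary location
# (illustrative only; in practice point py_file at the real script)
py_file <- file.path(tempdir(), "multip.py")
writeLines(c("def fsq(x):", "    return x**2"), py_file)

cl <- makeCluster(detectCores(), type = "PSOCK")
registerDoParallel(cl)

# Send the path to every worker, then source the script there once.
# Each worker gets its own Python session and its own fsq wrapper,
# so nothing Python-backed has to cross the socket connection.
clusterExport(cl, "py_file")
invisible(clusterEvalQ(cl, {
  reticulate::source_python(py_file, envir = globalenv())
  NULL  # return a plain value so nothing Python-backed is sent back
}))

# fsq is looked up on the worker, where it was defined above
result <- foreach(i = 1:5) %dopar% fsq(i)

stopCluster(cl)
```

The trade-off is a little more setup in exchange for not re-reading the script in every iteration, which may matter when the loop is long or the script is slow to import.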
Ralf Stubner
  • This worked, thanks! Yes, it makes sense to have the complete code for the parallel task inside the loop, especially when it references an outside script/function – jacob mathew Jun 27 '18 at 03:55
  • To expand on Ralf Stubner's answer: the `fsq` R function relies on an object of type `externalptr`. Such objects cannot be serialized/unserialized unless the developer behind them has implemented support for it. (I have yet to see an example of that.) – HenrikB Jun 27 '18 at 04:17
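
For contrast with the `externalptr` case described in the comment above, here is a minimal sketch showing that an ordinary R function travels to PSOCK workers without trouble. The name fsq_r is a hypothetical pure-R stand-in for the Python fsq, and the worker count of 2 is an arbitrary choice.

```r
library(parallel)

# A plain R closure: its code is ordinary R data and can be serialized
fsq_r <- function(x) x^2

cl <- makeCluster(2, type = "PSOCK")

# parLapply serializes fsq_r and sends it to the workers
res <- parLapply(cl, 1:5, fsq_r)

stopCluster(cl)

unlist(res)
# 1 4 9 16 25
```

The same call pattern fails for the reticulate-generated fsq because that wrapper points at an object living inside the master's Python session, which the workers cannot reach.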