
I would like to use the plumber package to carry out some flexible parallel processing, and was hoping it would behave like a node.js framework, i.e. be non-blocking...

I have the following plumber file.

# myfile.R

#* @get /mean
normalMean <- function(samples=10){
  samples <- as.numeric(samples)  # query-string parameters arrive as strings
  Sys.sleep(5)
  data <- rnorm(samples)
  mean(data)
}

I have also installed pm2 as suggested here http://plumber.trestletech.com/docs/hosting/

I have also created the run-myfile.sh file from those docs, i.e.

#!/bin/bash
R -e "library(plumber); pr <- plumb('myfile.R'); pr\$run(port=4000)"

and made it executable as suggested...

I have started up pm2 using

pm2 start /path/to/run-myfile.sh

and wanted to test whether it handles requests in a non-blocking, node.js-like fashion...

by opening up another R console and running the following...

foo <- function(){
    con <- curl::curl('http://localhost:4000/mean?samples=10000',handle = curl::new_handle())
    on.exit(close(con))
    return(readLines(con, n = 1, ok = FALSE, warn = FALSE))
}

system.time(for (i in seq(5)){
    print(foo())
})

Perhaps it is my misunderstanding of how a node.js non-blocking framework is meant to work, but in my head the loop above should take only a bit over 5 seconds. Instead it takes about 25 seconds, suggesting everything runs sequentially rather than in parallel.
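One way to check whether the serialization happens on the server rather than in the client loop is to fire all five requests at once using the curl package's multi interface (a sketch against the same endpoint as above; it assumes the plumber server is already running on port 4000):

```r
library(curl)

# Queue five requests and run them concurrently. If the server were
# non-blocking, total elapsed time would stay near 5 seconds; with a
# single blocking R process it will still be ~25 seconds.
pool <- new_pool()
results <- list()
for (i in seq(5)) {
  curl_fetch_multi(
    "http://localhost:4000/mean?samples=10000",
    done = function(res) {
      results[[length(results) + 1]] <<- rawToChar(res$content)
    },
    pool = pool
  )
}
system.time(multi_run(pool = pool))
print(results)
```

Because the requests leave the client simultaneously, any remaining delay can be attributed to the server processing them one at a time.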

How can I get that non-blocking behaviour out of the plumber package?

h.l.m

2 Answers


pm2 can't load-balance R processes for you, unfortunately. R is single-threaded and doesn't (yet) have libraries that allow it to behave asynchronously the way NodeJS does, so there aren't many great ways to parallelize code like this in plumber today. The best option is to run multiple plumber R back-ends and distribute traffic across them. See the "load balancing" section here: http://plumber.trestletech.com/docs/docker-advanced
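Concretely, that setup looks something like the sketch below. The ports, instance names, and nginx configuration are illustrative (not taken from the linked docs), and it assumes run-myfile.sh is modified to take the port as its first argument:

```shell
# run-myfile.sh would need to read the port from $1, e.g.:
#   R -e "library(plumber); pr <- plumb('myfile.R'); pr\$run(port=$1)"

# Start three independent plumber processes, one per port.
pm2 start /path/to/run-myfile.sh --name plumber-4001 -- 4001
pm2 start /path/to/run-myfile.sh --name plumber-4002 -- 4002
pm2 start /path/to/run-myfile.sh --name plumber-4003 -- 4003

# nginx then spreads incoming traffic on port 4000 across the three
# back-ends (illustrative /etc/nginx/conf.d/plumber.conf):
cat > /etc/nginx/conf.d/plumber.conf <<'EOF'
upstream plumber {
    server 127.0.0.1:4001;
    server 127.0.0.1:4002;
    server 127.0.0.1:4003;
}
server {
    listen 4000;
    location / {
        proxy_pass http://plumber;
    }
}
EOF
```

With three back-ends, three 5-second requests can be serviced at once; each individual R process still blocks while it works.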

Jeff Allen
    There's a discussion going here, which would be the best place to continue the conversation: https://github.com/trestletech/plumber/issues/31 – Jeff Allen Aug 02 '16 at 03:56

Basically, concurrent requests are queued by httpuv, so plumber is not performant by itself. The author recommends multiple Docker containers, but that can be complicated as well as resource-demanding.
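The multi-container approach can be sketched as follows (the image name is hypothetical; it assumes an image whose entrypoint runs the plumber API on container port 8000, with the same nginx-style load balancer in front as in the plumber docs):

```shell
# Run three plumber containers from the same image, publishing
# consecutive host ports; a reverse proxy would balance across them.
for port in 4001 4002 4003; do
  docker run -d --name "plumber-$port" -p "$port:8000" my-plumber-image
done
```

Each container carries a full R runtime, which is where the resource cost comes from.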

There are other technologies, e.g. Rserve and rApache. Rserve forks processes, and rApache can be configured to pre-fork, so both can handle concurrent requests.
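As a rough illustration of the Rserve route (a sketch only, assuming Rserve >= 1.8, whose built-in HTTP server dispatches each request to a user-defined `.http.request()` function; the handler signature and return format should be checked against the Rserve documentation):

```r
library(Rserve)

# Handler invoked for each incoming HTTP request. Because Rserve
# forks a process per connection (on unix), slow handlers do not
# block one another the way they do under httpuv.
.http.request <- function(url, query, body, headers) {
  samples <- as.numeric(query["samples"])
  if (is.na(samples)) samples <- 10
  Sys.sleep(5)
  result <- mean(rnorm(samples))
  # Return a list of (payload, content-type)
  list(as.character(result), "text/plain")
}

run.Rserve(http.port = 4000)
```

Under this model, five concurrent calls to the endpoint should complete in roughly 5 seconds rather than 25.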

See the following posts for a comparison:

https://www.linkedin.com/pulse/api-development-r-part-i-jaehyeon-kim/
https://www.linkedin.com/pulse/api-development-r-part-ii-jaehyeon-kim/

Jaehyeon Kim