This is essentially the same error as in "Rpy2 in a Flask App: Fatal error: unable to initialize the JIT", but the solutions there do not provide enough information to reproduce a working example.

Within my Flask app, using the rpy2.rinterface module, I receive the same stack usage error whenever I initialize R:

import rpy2.rinterface as rinterface 
from rpy2.rinterface_lib import openrlib

with openrlib.rlock: 
    rinterface.initr()

Error: C stack usage 664510795892 is too close to the limit
Fatal error: unable to initialize the JIT

rinterface is the low-level R hook in rpy2, but the higher-level robjects module gives the same error. I've tried wrapping the context lock and the R initialization in a Process from the multiprocessing module, but I get the same error. The docs say that a multithreaded environment will cause problems for R: https://rpy2.github.io/doc/v3.3.x/html/rinterface.html#multithreading However, the context manager does not seem to prevent the problem when interfacing with R.

2 Answers

rlock is an instance of Python's threading.RLock. It should take care of multithreading issues.
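To illustrate what a re-entrant lock does and does not cover, here is a minimal sketch that uses a plain threading.RLock as a stand-in for openrlib.rlock. The function names are hypothetical and no R is involved; the point is only that threads inside one process are serialized, and that the same thread can take the lock again without deadlocking.

```python
import threading
import time


def run_serialized_calls(n_threads=4):
    """Run n_threads workers that each enter a critical section under an RLock."""
    lock = threading.RLock()  # stand-in for rpy2's openrlib.rlock (assumption)
    events = []

    def fake_r_call(label):
        with lock:
            # Re-entrant: the thread that already holds the lock may acquire it again.
            with lock:
                events.append((label, "start"))
                time.sleep(0.01)  # give other threads a chance to interleave
                events.append((label, "end"))

    threads = [threading.Thread(target=fake_r_call, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


events = run_serialized_calls()
# Every "start" is directly followed by the matching "end": no interleaving.
print(events)
```

The lock object lives in the memory of a single process, so it coordinates threads only. Separate processes each hold their own copy and are not coordinated by it.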

However, multiprocessing can cause a similar issue if the embedded R is shared across child processes. The code for this demo script, which shows parallel processing with R and Python processes, illustrates this: https://github.com/rpy2/rpy2/blob/master/doc/_static/demos/multiproc_lab.py

I think that the way around this is to configure Flask, or most likely your wsgi layer, to create isolated child processes, or have all of your Flask processes delegate R calculations to a secondary process (created on the fly, or in a pool of processes waiting for tasks to perform).
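As a rough sketch of the "secondary process created on the fly" option, the snippet below starts a fresh Python interpreter per request and exchanges JSON over stdin/stdout. Everything here is an assumption made for illustration: the names run_isolated and CHILD_CODE are hypothetical, and the child only sums numbers in pure Python so that the example runs without R installed. In a real setup the child is where rpy2 would be imported and R initialized.

```python
import json
import subprocess
import sys

# Code run in the child interpreter. This stand-in sums the payload in Python.
# With rpy2 installed, the marked line could instead call into R, for example:
#   import rpy2.robjects as ro; result = ro.r['sum'](ro.FloatVector(payload))[0]
CHILD_CODE = """
import json, sys
payload = json.load(sys.stdin)
result = sum(payload)  # placeholder for the R computation
json.dump({"result": result}, sys.stdout)
"""


def run_isolated(payload, child_code=CHILD_CODE, timeout=60):
    """Run child_code in a new interpreter so no state is shared with the caller."""
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the child crashes
    )
    return json.loads(proc.stdout)["result"]


print(run_isolated([1, 2, 3]))
```

Starting an interpreter per request is slow, so a pool of long-lived workers is likely the better fit for anything beyond occasional calls. The isolation principle is the same in both cases: the web server process never initializes R itself.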

lgautier

As other answers for similar questions have implied, Flask users will need to initialize and run rpy2 outside of the WSGI context to prevent the embedded R process from crashing. I accomplished this with Celery, where workers provide an environment separate from Flask to handle requests made in R.

I used the low-level rinterface library, as mentioned in the question, and wrote the Celery task as a class:

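import rpy2.rinterface as rinterface
from celery import Celery, Task

celery = Celery('tasks', backend='redis://', broker='redis://')

class Rpy2Task(Task):
    # Name under which the task is registered with the worker.
    name = "rpy2"

    def run(self, args):
        # Initialize R here, inside the worker process, not in the Flask app.
        rinterface.initr()
        # source() returns a list whose first element is the value of the
        # script's last expression, assumed here to be an R function.
        r_func = rinterface.baseenv['source']('your_R_script.R')
        r_func[0](args)

# register_task returns the registered task instance.
rpy2_task = celery.register_task(Rpy2Task())
# args is a placeholder for whatever the R function expects.
async_result = rpy2_task.delay(args)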

Calling rinterface.initr() anywhere but in the body of the task run by the worker results in the aforementioned crash. Celery is commonly paired with Redis, and I found this a useful way to exchange information between R and Python, but of course rpy2 also provides flexible ways of doing this.

  • Thanks for the workaround. There was a buglet in rpy2, or R's C-API depending on how you look at it (https://github.com/rpy2/rpy2/issues/729). This is fixed with rpy2-3.4.2. – lgautier Jan 10 '21 at 16:15