I want to create an R package that wraps functionality from several Python modules. The reticulate package offers a variety of functions to load and execute Python modules from within R.
For deployment, I was wondering what the most efficient way is to handle the installation of the Python modules when my R package is installed.
Since my Python modules require Python >= 3.7, I planned to ship a virtual environment with Python 3.7 inside the R package; upon installation, the necessary Python modules would be downloaded and installed into that environment.
I can't deploy my R package with an environment that already contains the necessary modules, as such an environment would exceed 1.5 GB in size, which seems too big.
Is there a convenient way to deal with this problem?
I was thinking of something like this:
module <- NULL

.onLoad <- function(libname, pkgname) {
  reticulate::use_virtualenv("./path/to/my/environment/contained/in/the/package")
  if (!reticulate::py_module_available("module_name")) {
    reticulate::py_install("module_name")
  }
  module <<- reticulate::import("module_name")
}
In the documentation of the reticulate package (https://rstudio.github.io/reticulate/articles/package.html), they recommend providing wrapper functions, so that the user can choose which virtual environment / Python installation to use and install the necessary modules manually. But that seems a little inconvenient to me; in my opinion these modules should install automatically once the R package is installed.
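For comparison, the pattern the reticulate packaging vignette recommends uses a delayed import, so loading the R package never forces a particular Python environment on the user (here `"module_name"` is a placeholder for the actual module):

```r
# Sketch of the delayed-load pattern from the reticulate packaging vignette.
# "module_name" is a placeholder for the real Python module.
module <- NULL

.onLoad <- function(libname, pkgname) {
  # delay_load = TRUE defers the actual import until the module is
  # first accessed, giving the user a chance to configure which
  # Python / virtualenv reticulate should use before anything loads
  module <<- reticulate::import("module_name", delay_load = TRUE)
}
```

With this approach the package-exported wrapper functions simply call into `module`, and reticulate resolves the Python environment at first use rather than at package load time.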
Does my approach make sense or is this just bad etiquette?