I want to create an R package that wraps functionality from several Python modules. The reticulate package offers a variety of functions to load and execute Python modules from within R.
For deployment, I was wondering what the most efficient way is to handle the installation of the Python modules when my R package is installed.
Since my Python modules require Python >= 3.7, I planned to ship a virtual environment with Python 3.7 inside the R package; upon installation, the necessary Python modules would be downloaded and installed into that environment.
I can't deploy my R package with an environment that already contains the necessary modules, as such an environment would exceed 1.5 GB in size, which seems too big.
Is there a convenient way to deal with this problem?
I was thinking of something like this:
module <- NULL

.onLoad <- function(libname, pkgname) {
  reticulate::use_virtualenv("./path/to/my/environment/contained/in/the/package")
  if (!reticulate::py_module_available("module_name")) {
    reticulate::py_install("module_name")
  }
  module <<- reticulate::import("module_name")
}
In the documentation of the reticulate package (https://rstudio.github.io/reticulate/articles/package.html), they recommend providing wrapper functions, so that the user can choose which virtual environment / Python installation to use and install the necessary modules manually. But that seems a little inconvenient to me; in my opinion these modules should install automatically once the R package is installed.
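For comparison, the pattern the reticulate packaging vignette recommends uses a delayed import, so loading the R package never forces a particular Python environment on the user (here `"module_name"` is a placeholder for the actual module):

```r
# Sketch of the delayed-load pattern from the reticulate packaging vignette.
# "module_name" is a placeholder for the real Python module.
module <- NULL

.onLoad <- function(libname, pkgname) {
  # delay_load = TRUE defers the actual import until the module is
  # first accessed, giving the user a chance to configure which
  # Python / virtualenv reticulate should use before anything loads
  module <<- reticulate::import("module_name", delay_load = TRUE)
}
```

With this approach the package-exported wrapper functions simply call into `module`, and reticulate resolves the Python environment at first use rather than at package load time.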
Does my approach make sense or is this just bad etiquette?