
For one of my scripts I want to write an R function that checks whether a package is already installed: if so, it should use library() to load and attach it; otherwise it should install it and then load it.

I assumed that pkgname is a string and tried to write something like:

ensure_library <- function(pkgname) {
  if (!require(pkgname)) {
    install.packages(pkgname, dependencies = TRUE)
  }
  require(pkgname)
}

As simple as it is, this function does not work. If I run it as ensure_library("dplyr"), it installs the package dplyr but then fails, because it tries to load pkgname rather than dplyr.

ensure_library("dplyr")
Loading required package: pkgname
Installing package into ‘/home/luca/R-dev’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/src/contrib/dplyr_0.5.0.tar.gz'
Content type 'application/x-gzip' length 708476 bytes (691 KB)
==================================================
downloaded 691 KB

* installing *source* package ‘dplyr’ ...
** package ‘dplyr’ successfully unpacked and MD5 sums checked
** libs

.... a lot of compiling here....

installing to /home/luca/R-dev/dplyr/libs
** R
** data
*** moving datasets to lazyload DB
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (dplyr)

The downloaded source packages are in
    ‘/tmp/Rtmpfd2Lep/downloaded_packages’
Loading required package: pkgname
Warning messages:
1: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘pkgname’
2: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘pkgname’

Also, if I now re-run it, it installs dplyr once again.

I realize this is probably due to R's non-standard evaluation, and I have tried several combinations of eval/substitute/quote to make it work with require, but I couldn't succeed.

Can somebody help me understand what is going on and whether there is an easy fix?

If a function that already implements this exists I would like to know, but what I am really interested in is understanding why my code does not work as intended.

lucacerone
  • Use `character.only=TRUE` in `require`. – IRTFM Aug 12 '16 at 15:57
  • Thanks 42, that actually works, but I can't properly understand why. I understand that with this option require correctly evaluates the name of the package I want to install to "dplyr". But why can I do require(dplyr) and require("dplyr") in the R console without character.only = TRUE? And why didn't something like substitute(pkgname), eval(quote(pkgname)) etc. work? – lucacerone Aug 12 '16 at 16:05
  • Because that parameter prevents the `substitute()` operation on the symbol `pkgname`. It's actually the substitute operation that is tripping up your efforts. – IRTFM Aug 12 '16 at 16:07
  • I think I still don't get NSE then. I thought that inside a function substitute would create an expression, substituting any variable that it could. In this case I would expect as.character(substitute(package)) (in require) to evaluate to "dplyr". Why is it evaluated to "pkgname" (the name of the variable I use in ensure_library)? – lucacerone Aug 12 '16 at 16:19
  • It does get converted to an expression, but that expression is `pkgname`, and the `as.character` is what finishes the coercion. `substitute` does not evaluate, so there's no lookup. – IRTFM Aug 12 '16 at 16:22
  • You can create a reporting function to show that your attempts with substitute and quote would both be creating unevaluated R 'calls': `subres <- function(x) cat( class( substitute(x) ) )` tests: `subres(quote(pkgname))` # call ... and `subres(expression(pkgname))` # also call – IRTFM Aug 12 '16 at 16:33

2 Answers

3

Expanding on the suggestion to use character.only=TRUE: if you look at the code for require, you see that the first step is performed only when 'character.only' keeps its default value (FALSE):

> require
function (package, lib.loc = NULL, quietly = FALSE, warn.conflicts = TRUE, 
    character.only = FALSE) 
{
    if (!character.only) 
        package <- as.character(substitute(package))
    loaded <- paste("package", package, sep = ":") %in% search()
    if (!loaded) {
        if (!quietly) 
            packageStartupMessage(gettextf("Loading required package: %s", 
                package), domain = NA)
        value <- tryCatch(library(package, lib.loc = lib.loc, 
            character.only = TRUE, logical.return = TRUE, warn.conflicts = warn.conflicts, 

# snipped rest of code

So leaving the default value of character.only in place forces the function to convert the symbol pkgname to a character value.

  as.character(substitute(pkgname))
 [1] "pkgname"
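
The same effect can be reproduced without require at all. The sketch below uses two hypothetical helpers (show_symbol and wrapper are names made up for this illustration, not part of base R): the first mimics the first line of require, the second plays the role of ensure_library.

```r
# show_symbol() mimics the first step of require(): it captures the
# expression supplied for `package` without evaluating it.
show_symbol <- function(package) as.character(substitute(package))

# wrapper() stands in for ensure_library(): it passes its own argument on.
wrapper <- function(pkgname) show_symbol(pkgname)

show_symbol(dplyr)    # "dplyr"   (the symbol dplyr, deparsed)
show_symbol("dplyr")  # "dplyr"   (a string literal stays a string)
wrapper("dplyr")      # "pkgname" (the expression written in the call was the symbol pkgname)
```

This is also why both require(dplyr) and require("dplyr") work at the console: in both cases the expression written in the call already spells the package name.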

And since 'character.only' is also part of the library logic, and require calls library, you could have used library.
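
A minimal sketch of how the function from the question could look with that change. Using requireNamespace() for the "is it installed?" check is my own choice here, not something from the question; it takes the package name as a string, so no non-standard evaluation is involved.

```r
ensure_library <- function(pkgname) {
  # requireNamespace() expects a character string and returns TRUE/FALSE,
  # so it can be used to test whether the package is available.
  if (!requireNamespace(pkgname, quietly = TRUE)) {
    install.packages(pkgname, dependencies = TRUE)
  }
  # character.only = TRUE tells library() to use the *value* of pkgname
  # instead of deparsing the symbol `pkgname`.
  library(pkgname, character.only = TRUE)
}

# "tools" ships with R, so this call attaches it without installing anything.
ensure_library("tools")
```

Because library() raises an error rather than a warning when the package is still missing after the install attempt, a failed installation stops the script immediately.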

Further comment: You posted a follow-up to Rhelp and got some useful answers from Duncan Murdoch and Peter Dalgaard which clarified (I hope) this question. In the process I wondered whether your resistance to this answer comes from an expectation set up by the name of the function: that substitution should occur, whereas nothing happened that looked like "substitution". In retrospect, I now see that this expectation is perfectly reasonable. I think a more accurate name for the function would have been: substitute_but_only_on_the_basis_of_the_local_environment_or_second_argument. The more common use of substitute is with two arguments:

   y_val=45; a_val=99
   substitute( x + y == z + a , list( y= y_val, a = a_val))
   x + 45 == z + 99

There was no 'effort' to examine the value of any symbol in the first argument unless it had a named item in the second argument (which is named env).
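
As a sketch of how the two-argument form could be applied to the original problem: the value of the string can be spliced into a require call before that call is evaluated. The placeholder p and the variable the_call are names made up for this illustration, and "tools" is used only because it ships with R.

```r
pkgname <- "tools"   # a package shipped with R, used only for illustration

# The second argument supplies the *value* of pkgname for the placeholder p,
# so the constructed call contains the string itself, not the symbol.
the_call <- substitute(require(p), list(p = pkgname))
print(the_call)      # require("tools")

# Evaluating the constructed call loads the intended package.
loaded <- eval(the_call)
```

This works, but it is more roundabout than simply passing character.only = TRUE.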

IRTFM
  • What I don't get is why package <- as.character(substitute(package)) evaluates to "pkgname" (the name of the variable I use in ensure_library) rather than to "dplyr". – lucacerone Aug 12 '16 at 16:23
  • Because `substitute` does not _evaluate_ its argument, meaning it doesn't use the symbol to go into the symbol-table and find what `pkgname` might have as its value. It's like `$` in this respect. – IRTFM Aug 12 '16 at 16:37
-2

The suggestions given above are already good and could address your problem. Nonetheless, you are re-inventing the wheel a bit there.

If you want to distribute R code, with documentation that has requirements on external packages and possibly needs proper testing, I would suggest that you make a package out of it. When the package is being installed, it is automatically ensured that all dependencies are available. Plus you have documentation with it and a place for your testing scripts. It keeps everything nicely in one place and is versioned at the same time.

Holger Hoefling
  • Holger, I am sorry, but you are missing the point of my question. I know you should use Depends, Imports, roxygen2 etc. when organizing your code in a package, but that is not what I have to do now. I have to share a simple script, and making an R package seems like overkill when I can just place a simple function at the very beginning and use it. – lucacerone Aug 14 '16 at 11:23