EDIT: the initial responses suggest my write-up focused people's attention on questions of best practice rather than questions of technique. I'd like to focus on a technical issue, however, with the code below serving only as a toy example:

If a person passes a list to a function parameter, how can you capture and inspect individual elements of that list without risking errors from the system attempting to call/evaluate those elements?

For instance, if a user passes to a parameter a list of functions that may or may not be appropriate, or have the associated packages loaded, how can the function safely examine what functions were requested?


Say I would like to build a function that iterates through other functions that might be applied. The actual example would call different modeling functions, but here's a toy example that's easier to see:

newfunc <- function(func.list){
  lapply(func.list, 
         function(f){
           f(letters)
         }
  )
}

Let's say that among the functions newfunc() can take are the functions nchar() and length(). If we provide those, we get the following:

newfunc(
  func.list = list(nchar, length)
)


[[1]]
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

[[2]]
[1] 26

But, let's say that newfunc() is also capable of taking something like str_to_upper(), which comes from the package stringr. Passing str_to_upper() works fine, but only if stringr has been loaded beforehand:

newfunc(
  func.list = list(nchar, length, str_to_upper)
)

Error in lapply(func.list, function(f) f(letters)) : 
  object 'str_to_upper' not found


require(stringr)

newfunc(func.list = list(nchar, length, str_to_upper))
[[1]]
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

[[2]]
[1] 26

[[3]]
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O"
[16] "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"


I'd like to put code in the function that can investigate the elements of the list and determine whether any packages (like stringr) need to be loaded. Also, I'd like to check whether the functions listed are from an acceptable set (so it catches if someone passes mean() or, worse, rcorr() from an unloaded Hmisc).

# This works here but is undesirable:
newfunc(func.list = list(nchar, length, str_to_upper, mean))

# This creates issues no matter what:
newfunc(func.list = list(nchar, length, str_to_upper, rcorr))
require(Hmisc)
newfunc(func.list = list(nchar, length, str_to_upper, rcorr))

I know how to do something like func.list.test <- deparse(substitute(func.list)) to get the literal text of the parameter, but I don't know how to do that on individual elements without risking an error if some function isn't present.

(and I don't want to take the hacky route of string manipulation on the overall deparsed output of func.list.test)

Ideally for this use case I'd like to know if this can be done with base R techniques. However, feel free to explain how to do this using newer approaches like tidy evaluation/quosures if it's the best/only way (though know that my familiarity with those is currently pretty limited).

Any help would be appreciated.

Joe
  • Try with `newfunc(func.list = list(nchar, length, stringr::str_to_upper))` – akrun Jan 15 '20 at 23:35
  • For the `rcorr`, you need to provide the correct dataset because the one in `newfunc` will not work `newfunc(func.list = list(nchar, length, stringr::str_to_upper, Hmisc::rcorr))` – akrun Jan 15 '20 at 23:37
  • Sure, but I'm asking from the perspective of the function WRITER, not the function user. I'm trying to make the function robust to know it should either load stringr, or as you say append stringr::, even if the function user does not do so. I'm also trying to catch if users requested unacceptable functions like mean or rcorr, and throw an appropriate error. – Joe Jan 15 '20 at 23:38
  • Are you saying that the user doesn't know which package the function comes from, so they can't use `stringr::`? I would run `??str_to_upper` to narrow it down. If you want the function to raise an error, you can use either `stopifnot` or `tryCatch` – akrun Jan 15 '20 at 23:39
  • I think that's a really bad idea. There are so many packages (I have more than 600 installed, CRAN has thousands), that a typo in a function name would have a good chance of matching some other function in some other package. E.g. maybe I type `plot3d` when I mean `plot3D`. Trying to guess what users want when they say something ambiguous is likely to lead to bad bugs. – user2554330 Jan 15 '20 at 23:45
  • I'm aiming for simplicity, and users make mistakes. I'd like a user to be able to request some set of different modeling functions from different packages (in reality, things like lm(), glm(), lmer(), glmmTMB(), etc.), without having to worry about whether those are loaded, but without requesting/loading other packages that users may not require for their use case, or even have installed. – Joe Jan 15 '20 at 23:49
  • user2554330, I'm not trying to guess broadly. I'm working from a specific set of functions that are in theory acceptable. All I want is to be able to inspect what functions a user might have requested, without triggering an error if (for instance), an associated package isn't loaded. – Joe Jan 15 '20 at 23:50

1 Answer

Here's a pure base function that uses find() to determine what function is being used and help.search() to locate any installed packages that might have the function:

resolve <- function( func.list )
{
  ## Disassemble the supplied list of functions (lfs)
  lf <- as.list(substitute( func.list ))[-1]
  lfs <- lapply( lf, deparse )
  lfs <- setNames( lfs, lfs )

  ## Find functions (ff) on the search path (i.e., in attached packages)
  ff <- lapply( lfs, find )

  ## Existing functions (fex) are listed in the order of masking
  ##   The first element is used by R in the absence of explicit ::
  fex <- subset( ff, lapply(ff, length) > 0 )
  fex <- lapply( fex, `[`, 1 )

  ## Search for empty entries (ee) among installed packages
  ee <- names(subset( ff, lapply(ff, length) < 1 ))
  ee <- setNames( ee, ee )
  eeh <- lapply( ee, function(e)
      help.search( apropos = paste0("^", e, "$"),
                  fields = "name", ignore.case=FALSE )$matches$Package )

  ## Put everything together
  list( existing = fex, to_load = eeh )
}
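The key step in resolve() is as.list(substitute(func.list))[-1], which splits the unevaluated list(...) call into its individual arguments. A minimal sketch of that step in isolation follows; capture_names is a hypothetical helper name used only for illustration, and not_a_real_function is deliberately undefined:

```r
# Hypothetical helper (not part of the answer's code): returns the names the
# caller typed inside list(...), without evaluating any of them.
capture_names <- function(func.list) {
  expr  <- substitute(func.list)   # the unevaluated call, e.g. list(a, b, c)
  elems <- as.list(expr)[-1]       # drop the leading `list` symbol
  vapply(elems, function(e) paste(deparse(e), collapse = ""), character(1))
}

# not_a_real_function does not exist, yet nothing errors, because the
# argument is only inspected and never forced:
capture_names(list(nchar, length, not_a_real_function))
```

Note that this only sees individual elements when the list is written out literally in the call. If the caller passes a variable that holds a list, substitute() returns just that variable's name and there is nothing to split.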

Example usage:

resolve(func.list = list(nchar, length, str_to_upper, lag, between))
# List of 2
#  $ existing:List of 3
#   ..$ nchar : chr "package:base"
#   ..$ length: chr "package:base"
#   ..$ lag   : chr "package:stats"
#  $ to_load :List of 2
#   ..$ str_to_upper: chr "stringr"
#   ..$ between     : chr [1:3] "data.table" "dplyr" "rex"

library(dplyr)
resolve(func.list = list(nchar, length, str_to_upper, lag, between))
# List of 2
#  $ existing:List of 4
#   ..$ nchar  : chr "package:base"
#   ..$ length : chr "package:base"
#   ..$ lag    : chr "package:dplyr"
#   ..$ between: chr "package:dplyr"
#  $ to_load :List of 1
#   ..$ str_to_upper: chr "stringr"

library(data.table)
resolve(func.list = list(nchar, length, str_to_upper, lag, between))
# List of 2
#  $ existing:List of 4
#   ..$ nchar  : chr "package:base"
#   ..$ length : chr "package:base"
#   ..$ lag    : chr "package:dplyr"
#   ..$ between: chr "package:data.table"
#  $ to_load :List of 1
#   ..$ str_to_upper: chr "stringr"
Artem Sokolov
  • This works, thanks! Just a note to other users--the example function is for demonstration, and does NOT convey a way to capture this solution into a function that other functions might call. In other words, you can add this code to give this functionality to `newfunction`, but you can't just write some `newfunction` that calls `resolve`. – Joe Mar 21 '20 at 16:32
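In line with that note, one possible way to put the capture step directly inside newfunc and check it against an acceptable set is sketched below. This is only a sketch under stated assumptions: the `allowed` lookup (function name to package) is a hypothetical structure invented for this example, and fetching each function with getExportedValue() is a swapped-in alternative to calling library() or using find()/help.search().

```r
# Assumption: a hand-maintained map from acceptable function names to the
# package that provides each one (hypothetical, for illustration).
allowed <- c(nchar = "base", length = "base", str_to_upper = "stringr")

newfunc <- function(func.list) {
  # Capture the names as typed; func.list itself is never evaluated
  requested <- vapply(as.list(substitute(func.list))[-1],
                      function(e) paste(deparse(e), collapse = ""),
                      character(1))

  # Reject anything outside the acceptable set before running anything
  bad <- setdiff(requested, names(allowed))
  if (length(bad) > 0)
    stop("Function(s) not allowed: ", paste(bad, collapse = ", "),
         call. = FALSE)

  lapply(setNames(requested, requested), function(nm) {
    pkg <- allowed[[nm]]
    if (!requireNamespace(pkg, quietly = TRUE))
      stop("Package '", pkg, "' is needed for ", nm, "()", call. = FALSE)
    f <- getExportedValue(pkg, nm)  # equivalent to pkg::nm; nothing is attached
    f(letters)
  })
}

newfunc(func.list = list(nchar, length))
```

Because each function is looked up in its declared package, masking on the search path does not affect which function runs. A namespaced argument such as stringr::str_to_upper would deparse to "stringr::str_to_upper" and be rejected by this sketch unless the name is normalized first.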