
In R, how do I define a function like `f <- function(x) {...}` that only accepts `'a'` or `'b'` for the argument `x`?

[BTW, what is the current (as of 2020) best/easy/practical package for creating functions in R? Is it still roxygen?]

LucasMation
  • Maybe `if(!x %in% c('a', 'b')) stop('Input should be either a or b')` in the first line of the function. – Ronak Shah Sep 08 '20 at 04:40
  • Also, roxygen does not create functions. It simply allows you to document your functions if you are creating a package. And asking for the "best" solution to something is usually an opinion-based question, which is considered off-topic for Stack Overflow. – MrFlick Sep 08 '20 at 04:46
  • `f <- function(x = c("a","b")) { x <- match.arg(x); ... }`? – r2evans Sep 08 '20 at 04:48
  • @MrFlick good point, I slipped in the question. I will edit later. – LucasMation Sep 08 '20 at 05:39
  • @r2evans, thanks. This seems great. But doesn't R now consider the vector `c("a","b")` to be the default value for `x`? – LucasMation Sep 08 '20 at 05:41
  • Yes it does, until you call [`match.arg`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/match.arg.html). Its default is `several.ok=FALSE`, so with the default behavior, the code in my previous comment will only allow a single value. This means that while the function definition *suggests* a vector is accepted, only the first element is used when the argument is omitted. (Typically this ambiguity is explicitly discussed in the function help docs.) Notice, though, that it does partial matching (à la `pmatch`), so `match.arg('a',c('aa','bb'))` returns `"aa"`. – r2evans Sep 08 '20 at 06:05

1 Answer


One canonical (idiomatic) way in R to restrict a function argument to one of a fixed set of values is to use `match.arg`:

```r
f <- function(x = c("a", "b")) {
  x <- match.arg(x)  # validates x against the choices "a" and "b"
  # ...
}
```

Since it defaults to `several.ok=FALSE`, it reduces `x` to a single value. If nothing is provided, `x` defaults to the first value in the vector.
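As a quick illustration of that behavior, here is a minimal sketch. It assumes `f` is changed to return the matched value (the version above only assigns it), purely so the result is visible at the console.

```r
# Sketch only: this variant of f() returns the matched value.
f <- function(x = c("a", "b")) {
  match.arg(x)
}

f()      # argument omitted, so the first choice is used
# [1] "a"
f("b")
# [1] "b"

# A value outside the allowed set raises an error; capture its message
# instead of stopping the script.
res <- tryCatch(f("c"), error = function(e) conditionMessage(e))
res
```

The exact wording of the error message is left out here because its quoting differs between R versions.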

However, note that since it uses `pmatch` internally, it allows partial matching. This means:

```r
match.arg('a',c('aa','bb'))
# [1] "aa"
```

which may not be desired. In that case, Ronak Shah's suggestion (modified to check length as well) is likely the most direct approach. Use one of the following `if` statements:

```r
f <- function(x) {
  # Option 1: 'x' must be a single value, either "a" or "b"
  if (is.null(x) || length(x) != 1L || !x %in% c("a", "b")) {
    stop("'x' argument must be one of: 'a','b'")
  }
  # Option 2 (use instead of option 1): 'x' may have length > 1
  if (is.null(x) || !length(x) || !all(x %in% c("a", "b"))) {
    stop("'x' argument must be one of: 'a','b'")
  }
  # ...
}
```

If `x` can legitimately have a length greater than 1, use the second form, whose conditional is `!all(x %in% c("a", "b"))`.
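If the same check is needed in several functions, it could be factored out. The sketch below is one possible way to do that, not an established API: `check_one_of` and its `several.ok` flag are hypothetical names chosen for illustration, and `g` exists only to contrast exact matching with the partial matching of `match.arg`.

```r
# Hypothetical helper (not part of base R): exact, non-partial validation.
check_one_of <- function(x, choices, several.ok = FALSE) {
  arg_name <- deparse(substitute(x))  # name of the argument as the caller wrote it
  ok <- !is.null(x) && length(x) > 0L &&
    (several.ok || length(x) == 1L) &&
    all(x %in% choices)
  if (!ok) {
    stop(sprintf("'%s' must be one of: %s", arg_name,
                 paste0("'", choices, "'", collapse = ", ")),
         call. = FALSE)
  }
  invisible(x)
}

f <- function(x) {
  check_one_of(x, c("a", "b"))
  x
}

f("a")
# [1] "a"

# Contrast: match.arg() accepts the partial match, the helper does not.
g <- function(x) match.arg(x, c("aa", "bb"))
g("a")
# [1] "aa"
strict <- tryCatch(check_one_of("a", c("aa", "bb")),
                   error = function(e) "rejected")
strict
# [1] "rejected"
```

Because the helper uses `%in%`, `NA` and values that merely share a prefix with an allowed choice are both rejected.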

FYI: roxygen is not for creating functions; it generates documentation from specially formatted comments placed next to the code. I believe its initial appeal was that it keeps the documentation in the same place as the code, which might prod the developer into updating the documentation when the function itself is modified. In practice, I suspect this is too optimistic, especially with large functions where the roxygen comments are well off the screen.

However, another benefit (added in a later release) is that it supports markdown-formatted documentation, so the author no longer has to use R-specific Rd markup (e.g., for URLs, links to other R functions/packages, code, emphasis, etc.).
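To illustrate the markdown point, a roxygen2 comment block for a function like the one above might look like the following sketch. It assumes markdown support is switched on, here with the per-block `@md` tag (a package-wide `Roxygen: list(markdown = TRUE)` line in `DESCRIPTION` is the other common way). The documentation wording is illustrative only.

```r
#' Pick one of two options
#'
#' Validates `x` with [match.arg()], so partial matches are accepted.
#'
#' @md
#' @param x A single string, either `"a"` or `"b"`. Defaults to `"a"`.
#' @return The matched value, as a character string.
#' @examples
#' f("b")
#' @export
f <- function(x = c("a", "b")) {
  match.arg(x)
}
```

The backticks and the `[match.arg()]` link are markdown; roxygen2 translates them to the corresponding Rd markup when the documentation is generated.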

r2evans