Update: Question here is closed, now discussed on RStudio Community Platform.


I'm trying to program defensively in my package development, using a lot of input validation. In particular, I'm relying on many of the ready-made assertions in checkmate, testthat and the like, which makes life a lot easier (and code shorter).

Hadley Wickham's tidyverse style guide for error messages suggests that error messages should point users to the exact source of the problem, like so:

#> Error: Can't find column `b` in `.data`

(Columns are just an example; sometimes it might be rows or some other index.)

I'm now wondering how this can be implemented elegantly and consistently in a package, given that a lot of the existing assertions (from the above packages, but also base R) don't give you any indices back in their errors.

Here's an example:

m <- matrix(data = c(0, 1, 5, -2), nrow = 2)

# arbitrary assertion
assert_positive <- function(x) {
  if (any(x < 0)) {
    stop(call. = FALSE,
         "All numbers must be non-negative")
  } else {
    return(invisible(x))
  }
}
# (there are *lots* of these in packages such as checkmate, testthat or assertr that should be reused)

assert_positive(m)

gives:

## Error: All numbers must be non-negative

So far so good, but this does not give the desired indices of the errors.

Yes, I know that I could just change the above assert_positive() function to do that, but I would like to reuse a lot of the functions in checkmate, testthat and friends, so I can't touch them, and there are too many of them anyway.

So I should probably wrap something around these existing tests, such as a simple for loop:

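# via for-loops
assert_positive2 <- function(x) {
  # seq_len() is safe for zero-row input, where 1:nrow(x) would yield c(1, 0)
  for (r in seq_len(nrow(x))) {
    res <- try(expr = assert_positive(x[r, ]), silent = TRUE)
    if (inherits(x = res, what = "try-error")) {
      # re-throw the original message, prefixed with the offending row
      stop(
        call. = FALSE,
        paste0(
          "in row ",
          r,
          ": ",
          conditionMessage(attr(x = res, which = "condition")),
          "."
        )
      )
    }
  }
  # return the input invisibly, like assert_positive() does
  invisible(x)
}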

assert_positive2(m)

gives:

## Error: in row 2: All numbers must be non-negative.

That gets the job done, but it's a lot of clutter and the code is not very expressive. I've also thought about Reduce() with try(), but that won't give indices, and neither would any apply() action. I guess, finally, a closure or function factory would be helpful to generalise this to many assertions.
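For illustration, a minimal sketch of what such a function factory might look like in base R. The name by_row() is hypothetical (it is not from checkmate, testthat or any other package), and the sketch assumes the input is a matrix and that the wrapped assertion signals failure via stop():

```r
# Copy of the example assertion from above, so the block runs on its own
assert_positive <- function(x) {
  if (any(x < 0)) {
    stop("All numbers must be non-negative", call. = FALSE)
  }
  invisible(x)
}

# Hypothetical factory: wraps any vector-level assertion so that it is
# applied row by row and the error message names the offending rows
by_row <- function(assertion) {
  function(x, ...) {
    # Collect one entry per row: the error message, or NA if the row passed
    msgs <- vapply(
      seq_len(nrow(x)),
      function(r) {
        tryCatch(
          {
            assertion(x[r, ], ...)
            NA_character_
          },
          error = function(e) conditionMessage(e)
        )
      },
      character(1)
    )
    failed <- which(!is.na(msgs))
    if (length(failed) > 0) {
      # One line per failing row
      stop(
        paste0("in row ", failed, ": ", msgs[failed], collapse = "\n"),
        call. = FALSE
      )
    }
    invisible(x)
  }
}

assert_positive_rows <- by_row(assert_positive)

m <- matrix(data = c(0, 1, 5, -2), nrow = 2)
res <- tryCatch(assert_positive_rows(m), error = conditionMessage)
res
#> [1] "in row 2: All numbers must be non-negative"
```

Unlike the for loop, this reports every failing row at once rather than stopping at the first one. In principle the same wrapper could take an assertion from another package, with extra arguments passed through `...`, but I have not tested that here.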

This just feels like a problem that many other people (crafting better error messages) must have already run into, so:

What's an elegant/canonical way to do this?

I know this isn't the place for discussions and opinions; but it's still the best forum for such a problem, so please don't shut this down.

maxheld
  • I've posted this question on the [RStudio Community Forum](https://community.rstudio.com/t/implementation-best-practices-for-good-error-messages/4927) which might be a better place for such an open discussion after all. Feel free to close. – maxheld Feb 05 '18 at 20:48

1 Answer

I don't see how wrapping many functions would be less work than just changing them / writing your own versions. Plus, like you say, the way you've wrapped the example is anything but cute.

As a short answer, I could imagine using the assertthat package (which you have not mentioned explicitly) and in particular the functions assert_that() (for basic cases) and on_failure() (for broader user-defined assertion functions).

I don't think the assert_positive example does what you want, so maybe you should not try to recycle it. Similarly, assert_positive2 might not do what you want in other cases either, because you may want to report the specific indices within each row that are in violation, not just the rows. With your own functions, you can write something more flexible that covers multiple cases.
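As a rough sketch of what I mean, assuming the assertthat package is installed and the input is a matrix: the predicate all_non_negative() is a hypothetical name of my own, while assert_that() and on_failure() are the assertthat functions mentioned above. The failure handler re-evaluates the argument to work out which cells are in violation:

```r
library(assertthat)

# Hypothetical predicate: TRUE if no element of x is negative
all_non_negative <- function(x) {
  all(x >= 0)
}

# Custom failure message; assertthat calls this with the matched call
# and the environment in which the assertion was evaluated
on_failure(all_non_negative) <- function(call, env) {
  x <- eval(call$x, env)
  # Assumes x is a matrix, so arr.ind = TRUE yields "row" and "col" columns
  bad <- which(x < 0, arr.ind = TRUE)
  cells <- paste0("[", bad[, "row"], ", ", bad[, "col"], "]", collapse = ", ")
  paste0(deparse(call$x), " has negative values at [row, col]: ", cells)
}

m <- matrix(data = c(0, 1, 5, -2), nrow = 2)

# Capture the message instead of stopping, just to inspect it
res <- tryCatch(assert_that(all_non_negative(m)), error = conditionMessage)
res
```

For the example matrix, the message should name the object `m` and the single offending cell in row 2, column 2. The check and the message live in one place, so each assertion can report whatever indices make sense for it.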

RolandASc