How to write a testthat unit test for a function that returns a data frame

Question

I am writing a script that ultimately returns a data frame. Are there any good practices for using a unit testing package to make sure that the returned data frame is correct? (I'm a beginning R programmer and new to the concept of unit testing.)

My script effectively looks like the following:

# initialize data frame
df.out <- data.frame(...)

# function set
function1 <- function(x) {...}
function2 <- function(x) {...}

# do something to this data frame
df.out$new.column <- function1(df.out)

# do something else
df.out$other.new.column <- function2(df.out)

# etc ....

... and I ultimately end up with a data frame with many new columns. However, what is the best approach to test that the data frame that is produced is what is anticipated, using unit tests?

So far I have created unit tests that check the results of each function, but I want to make sure that running all of these together produces what is intended. I've looked at Hadley Wickham's page on testing but can't see anything obvious regarding what to do when returning data frames.

My thoughts to date are:

Create an expected data frame by hand
Check that the output equals this data frame, using expect_that or similar

Any thoughts / pointers on where to look for guidance? My Google-fu has let me down considerably on this one to date.

Like [this](https://github.com/hadley/dplyr/blob/master/tests/testthat/test-colwise.R)? — Roland, Mar 26 '15 at 15:52

score 14 · Accepted Answer · answered Mar 26 '15 at 15:52

Your intuition seems correct. Construct a data.frame manually based on the expected output of the function and then compare that against the function's output.

library(testthat)

# manually created expected data
dat <- iris[1:5, c("Species", "Sepal.Length")]

# function under test: subset a data frame by rows and columns
myfun <- function(row, col, data) {
    data[row, col]
}

# result of applying function
outdat <- myfun(1:5, c("Species", "Sepal.Length"), iris)

# two versions of the same test
expect_true(identical(dat, outdat))
expect_identical(dat, outdat)
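To cover the whole pipeline from the question rather than each function in isolation, one option is to wrap the steps in a single function and compare its result with a hand-built data frame. The sketch below is only an illustration: add_double, add_label and build_output are hypothetical stand-ins for function1, function2 and the script body, and the columns are made up.

```r
library(testthat)

# Hypothetical stand-ins for the question's function1 and function2
add_double <- function(df) df$x * 2
add_label  <- function(df) ifelse(df$x > 1, "high", "low")

# Hypothetical wrapper that runs every step and returns the final data frame
build_output <- function(df) {
  df$doubled <- add_double(df)
  df$label   <- add_label(df)
  df
}

test_that("pipeline returns the expected data frame", {
  input <- data.frame(x = c(1, 2, 3))

  # Expected result, written out by hand
  expected <- data.frame(
    x       = c(1, 2, 3),
    doubled = c(2, 4, 6),
    label   = c("low", "high", "high"),
    stringsAsFactors = FALSE
  )

  expect_equal(build_output(input), expected)
})
```

expect_equal() allows small floating-point differences in numeric columns, while expect_identical() requires an exact match, so which one fits depends on how the columns are computed.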

If your data.frame may not be exactly identical to the expected one, you could also run tests on parts of the data.frame, including:

dim(outdat), to check if the size is correct
attributes(outdat) or attributes of columns
sapply(outdat, class), to check variable classes
summary statistics for variables, if applicable
and so forth
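The partial checks listed above could look roughly like this with testthat. This is a sketch that reuses the iris subset from the example; the expected values are specific to those five rows.

```r
library(testthat)

# Same subset as in the example above
outdat <- iris[1:5, c("Species", "Sepal.Length")]

test_that("output has the expected shape and types", {
  # it is a data frame of the right size
  expect_s3_class(outdat, "data.frame")
  expect_equal(dim(outdat), c(5L, 2L))

  # column names and classes
  expect_named(outdat, c("Species", "Sepal.Length"))
  expect_equal(
    sapply(outdat, class),
    c(Species = "factor", Sepal.Length = "numeric")
  )

  # a summary statistic: mean of 5.1, 4.9, 4.7, 4.6, 5.0
  expect_equal(mean(outdat$Sepal.Length), 4.86)
})
```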

score 2 · Answer 2 · answered Apr 23 '15 at 07:02

If you would like to test this at runtime, you should check out the excellent ensurer package. At the bottom of its page you can see how to construct a template that you can test your data frame against; you can make it as detailed and specific as you like.
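To give a feel for a runtime template check without relying on any particular package, here is a minimal base R sketch. It does not use the ensurer API; check_template and the id/value columns are hypothetical names chosen for illustration.

```r
# Hypothetical helper (not part of ensurer): stop with an error
# if df does not have the same column names and classes as template
check_template <- function(df, template) {
  stopifnot(
    is.data.frame(df),
    identical(names(df), names(template)),
    identical(sapply(df, class), sapply(template, class))
  )
  invisible(df)
}

# A zero-row data frame is enough to describe column names and classes
template <- data.frame(id = integer(), value = numeric())

# Passes silently: names and classes match the template
check_template(data.frame(id = 1:3, value = c(0.5, 1.5, 2.5)), template)
```

A data frame with a missing column or a column of the wrong class would make stopifnot() raise an error at the point where the check is called.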

score 0 · Answer 3 · answered May 13 '15 at 19:35

I'm just using something like this:

library(testthat)

# expect_equal(x, y) is the current form of expect_that(x, equals(y))
d1 <- iris
d2 <- iris
expect_equal(d1, d2) # passes
d3 <- iris
d3[141, 3] <- 5      # change a single value
expect_equal(d1, d3) # fails

answered by pdb
