Using this example data:
    library(tidyverse)
    set.seed(123)
    df <- data_frame(X1 = rep(LETTERS[1:4], 6),
                     X2 = sort(rep(1:6, 4)),
                     ref = sample(1:50, 24),
                     sampl1 = sample(1:50, 24),
                     var2 = sample(1:50, 24),
                     meas3 = sample(1:50, 24))
I can use summarise_at() to count the number of values in a subset of columns:

    df %>% summarise_at(vars(contains("2")), funs(sd_expr = n()))

This isn't very exciting, as it is the same as the number of rows. However, it would be useful in a table with a nested column, where each cell contains a data frame with a different number of rows.
For example,
    df %>%
      mutate_at(vars(-one_of(c("X1", "X2", "ref"))), funs(first = . - ref)) %>%
      mutate_at(vars(contains("first")), funs(second = . * 2)) %>%
      nest(-X1) %>%
      mutate(mean = map(data,
                        ~ summarise_at(.x, vars(contains("second")),
                                       funs(mean_second = mean(.)))),
             n = map(data,
                     ~ summarise_at(.x, vars(contains("second")),
                                    funs(n_second = n())))) %>%
      unnest(mean, n)
However I get the error:
    Error in mutate_impl(.data, dots) : Evaluation error: Can't create call to non-callable object.
Why does the mean() function work in this context and n() does not?
A couple of workarounds would be either:

    n = map(data, ~ summarise_at(.x, vars(contains("second")),
                                 funs(n_second = length(unique(.)))))

but this is not robust when identical values appear on different rows, or alternatively:

    n = map(data, ~ nrow(.x))

but this does not allow me to build more complicated summarise_at() calls, which is what I'm really aiming for. Ultimately I'd like to do something like this to calculate standard errors:
    se = map(data, ~ summarise_at(.x, vars(contains("second")),
                                  funs(se_second = sd(.) / sqrt(n()))))
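
For reference, here is a minimal sketch of the same standard-error calculation with length(.) in place of n(). It sidesteps the error rather than explaining it. It assumes a dplyr version (0.8 or later) that accepts a list of formulas in summarise_at(), and the toy data frame and its column names are made up for illustration.

```r
library(dplyr)

set.seed(123)
# Hypothetical stand-in for the question's data: one grouping column
# and two columns whose names contain "second"
toy <- data.frame(X1 = rep(LETTERS[1:4], 6),
                  a_second = rnorm(24),
                  b_second = rnorm(24))

# length(.) counts the elements of the column being summarised
# (NAs included), so it does not rely on n() finding the right context
se_tbl <- toy %>%
  group_by(X1) %>%
  summarise_at(vars(contains("second")),
               list(se = ~ sd(.) / sqrt(length(.))))

se_tbl
```

Unlike length(unique(.)), plain length(.) is not affected by identical values on different rows, so within each group it should match what n() would return.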
Why is n() not doing what I think it should do in this situation?