I am trying to write a function that operates on a data.frame and accepts dplyr-style arguments, i.e. unquoted column names, using dplyr's pronouns (or whatever we call them).
But I have encountered a problem when using !! inside a bracketed expression (see below the examples).
Examples:
First a data.frame:
library(dplyr)
df <- data.frame(gah=c('a','a','a','a','b','b','b','b'),
                 fruit=c('apple','apple','apple','banana','banana','banana','dog','dog'),
                 val=1:8,
                 sss=-7:0,
                 mean=0)
The first function averages a fixed column (val) as well as a column given by the argument. It does not modify the grouping:
a_func <- function(df, value=val) {
  value_ = enquo(value)
  df %>% summarise(mean=mean(!!value_), mean_val=mean(val), n=n())
}
a_func(df, sss)
df %>% group_by(gah) %>% a_func()
df %>% group_by(gah) %>% a_func(sss)
df %>% group_by(gah, fruit) %>% a_func()
This works as expected.
The next function adds a grouping variable before using summarise:
c_func <- function(df, gr) {
  gr_ = enquo(gr)
  df %>% group_by(!!gr_) %>% summarise(n=n())
}
c_func(df, gah)
c_func(df, gr=gah)
c_func(df, fruit)
This also works as expected.
Next, I combine the two. That should be doable - and it in fact is! Praise the Holy Kitten!
b_func <- function(df, value=val, gr=NA) {
  value_ = enquo(value)
  gr_ = enquo(gr)
  df %>% group_by(!!gr_, add=TRUE) %>%
    summarise(mean=mean(!!value_), mean_val=mean(val))
}
b_func(df, sss)
df %>% group_by(gah) %>% b_func(gr=fruit)
b_func(df, gr=fruit)
df %>% group_by(gah) %>% b_func(sss, fruit)
It works as expected. However, since gr is an optional argument, I would like to add the grouping variable only when gr is not NA.
This is where it breaks:
When I add a conditional so that the grouping is done only when gr is not NA, looking up the quosure from within the brackets somehow does not work.
d_func <- function(df, value=val, gr=NA) {
  value_ = enquo(value)
  gr_ = enquo(gr)
  if (!is.na(gr)) {
    df <- df %>% group_by(!!gr_)
  }
  df %>%
    summarise(mean=mean(!!value_), mean_val=mean(val))
}
d_func(df, sss) # works
df %>% group_by(gah) %>% d_func(gr=fruit)
# Error in d_func(., gr = fruit) : object 'fruit' not found
d_func(df, gr=fruit)
# Error in d_func(df, gr = fruit) : object 'fruit' not found
df %>% group_by(gah) %>% d_func(sss, fruit)
# Error in d_func(., sss, fruit) : object 'fruit' not found
It seems to be due to !!gr_ being called within the scope of the additional brackets: remove the if and its brackets and d_func is equivalent to b_func, and both group by a column NA.
I do not understand why this occurs or how to solve this.
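For anyone reproducing this, here is a minimal sketch of one possible explanation and workaround, not a confirmed diagnosis for this exact setup. The error message matches what base R reports when a function argument is evaluated directly: is.na(gr) forces gr, so R looks for an object called fruit in the calling environment. That would mean the brackets themselves are not the cause. The names force_demo and e_func are hypothetical, and the sketch assumes that changing the default to gr=NULL and testing the captured quosure with rlang's quo_is_null() is acceptable. It also assumes no object named fruit exists in the calling environment.

```r
library(dplyr)
library(rlang)

df <- data.frame(gah = c('a','a','a','a','b','b','b','b'),
                 fruit = c('apple','apple','apple','banana','banana','banana','dog','dog'),
                 val = 1:8,
                 sss = -7:0,
                 mean = 0)

# Base R only (hypothetical helper): is.na(gr) evaluates the argument itself,
# so the lookup of `fruit` fails before any dplyr code is reached.
force_demo <- function(gr = NA) is.na(gr)
msg <- tryCatch(force_demo(fruit), error = function(e) conditionMessage(e))

# Hypothetical variant of d_func: the default is NULL instead of NA, and the
# condition inspects the captured expression rather than evaluating `gr`.
e_func <- function(df, value = val, gr = NULL) {
  value_ <- enquo(value)
  gr_ <- enquo(gr)
  if (!quo_is_null(gr_)) {
    # Mirrors d_func: this replaces any existing grouping
    df <- df %>% group_by(!!gr_)
  }
  df %>%
    summarise(mean = mean(!!value_), mean_val = mean(val))
}

res_plain <- e_func(df, sss)            # no extra grouping added
res_grouped <- e_func(df, gr = fruit)   # grouped by fruit
```

As in d_func, group_by() here replaces any grouping already on df. To keep existing groups, the add=TRUE argument used in b_func could presumably be passed here as well, assuming it behaves as in the dplyr 0.7.4 session listed below.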
Updated with sessionInfo
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Danish_Denmark.1252 LC_CTYPE=Danish_Denmark.1252 LC_MONETARY=Danish_Denmark.1252
[4] LC_NUMERIC=C LC_TIME=Danish_Denmark.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rlang_0.2.0 bindrcpp_0.2.2 lemon_0.4.0 tidyr_0.8.0 magrittr_1.5
[6] dplyr_0.7.4 odbc_1.1.5 RevoUtils_10.0.9 RevoUtilsMath_10.0.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.16 pillar_1.2.1 compiler_3.4.4 plyr_1.8.4 bindr_0.1.1 tools_3.4.4
[7] bit_1.1-12 tibble_1.4.2 gtable_0.2.0 lattice_0.20-35 pkgconfig_2.0.1 openxlsx_4.0.17
[13] cli_1.0.0 rstudioapi_0.7 DBI_0.8 yaml_2.1.18 gridExtra_2.3 knitr_1.20
[19] hms_0.4.2 bit64_0.9-7 grid_3.4.4 tidyselect_0.2.4 glue_1.2.0 R6_2.2.2
[25] ggplot2_2.2.1.9000 purrr_0.2.4 blob_1.1.1 scales_0.5.0 assertthat_0.2.0 colorspace_1.3-2
[31] utf8_1.1.3 lazyeval_0.2.1 munsell_0.4.3 crayon_1.3.4