
I have a custom function that summarises a variable. The version below is simplified to illustrate my problem; the real function is more complex. The general structure of the function should remain the same: it takes one argument specifying which dataframe to work on (df) and one argument specifying which variable to summarise (variable_to_test).

my_fun <- function(df, variable_to_test) {

  variable_to_test <- enquo(variable_to_test)
  new_var_name <- paste0(quo_name(variable_to_test), "_new_name")

  df %>% 
    summarise(
      !!new_var_name := sum(!!variable_to_test, na.rm = TRUE)
    ) 
}

Using an example, I can apply the function on each variable in my dataframe:

library(tidyverse)
dat <- tibble(
  variable_1 = c(1:5, NA, NA, NA, NA, NA),
  variable_2 = c(NA, NA, NA, NA, NA, 11:15)
)


> my_fun(dat, variable_1)
# A tibble: 1 x 1
   variable_1_new_name
                 <int>
1                  15


> my_fun(dat, variable_2)
# A tibble: 1 x 1
  variable_2_new_name
                <int>
1                  65

But how can I lapply() the function over all columns in the dataframe? I tried

> dat %>%
+ lapply(., my_fun)
Error in duplicate(quo) : argument "quo" is missing, with no default
Called from: duplicate(quo)

but this returns an error. I'm struggling with the fact that the function takes one argument for the dataframe to work on and another for the variable to summarise. Note that I'd like to keep this structure: I find it more elegant to pass the dataframe into the function than to give the function only the variable name and hard-code the dataframe in the function body. Does anybody have a good idea how to lapply() the function?

Hong Ooi
piptoma
  • Do you need a `dplyr` solution, or does base R fit your needs? Usually you would solve this by giving the function one static and one variable input, e.g. `lapply(dat, function(x) my_fun(dat, x))`. I'm not versed in `dplyr`, but maybe try `lapply(., function(x) my_fun(., x))`? – LAP Aug 14 '17 at 11:54
  • I already have a base R solution. I tried to rewrite the function the `tidyeval` way, since it makes the function body more readable. So yes, I need a `tidyeval` solution :) – piptoma Aug 14 '17 at 11:58

2 Answers


Oh, I think you're just mapping over the wrong thing. For the tidyverse solution I would try:

map(dat, ~my_fun(dat, .))

This maps over the columns of dat (not the column names) and plugs each column in where the . appears.
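A possible variation on this idea, offered as a sketch rather than as part of the original answer: map over the column names instead of the columns, and turn each name into a symbol before unquoting it, so that my_fun() sees the real variable name when it builds the new column name. It assumes my_fun() and dat exactly as defined in the question (repeated here so the block runs on its own) and that the rlang package is installed, which it is whenever dplyr is.

```r
library(tidyverse)

# Repeated from the question so the sketch is self-contained
my_fun <- function(df, variable_to_test) {
  variable_to_test <- enquo(variable_to_test)
  new_var_name <- paste0(quo_name(variable_to_test), "_new_name")
  df %>%
    summarise(!!new_var_name := sum(!!variable_to_test, na.rm = TRUE))
}

dat <- tibble(
  variable_1 = c(1:5, NA, NA, NA, NA, NA),
  variable_2 = c(NA, NA, NA, NA, NA, 11:15)
)

# Map over the column names (strings); rlang::sym() converts each string
# to a symbol, and !! unquotes it so that enquo() inside my_fun() captures
# the column name itself rather than the placeholder
results <- names(dat) %>%
  map(~ my_fun(dat, !!rlang::sym(.x)))

# Optionally combine the one-row summaries into a single tibble
combined <- bind_cols(results)
```

Here results is a list with one single-row tibble per column, which keeps the (df, variable) signature of my_fun() unchanged.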

piptoma
Shorpy

You're working at the wrong level. If you map a function over a data frame, then this function should take a column. The problem here is that the function my_fun() expects a data frame rather than a column.
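To make the point about levels concrete, here is a small sketch (assuming dat as defined in the question) of what lapply() actually hands to the function: each element is a column vector, which is why a function whose first argument is a data frame does not fit.

```r
library(tidyverse)

dat <- tibble(
  variable_1 = c(1:5, NA, NA, NA, NA, NA),
  variable_2 = c(NA, NA, NA, NA, NA, 11:15)
)

# lapply() iterates over the columns of the data frame, so the function
# receives each column vector (not the data frame) as its first argument
col_classes <- lapply(dat, class)

# A function that takes a column therefore works directly
col_sums <- lapply(dat, sum, na.rm = TRUE)
```

With lapply(dat, my_fun), each column would be passed as df and variable_to_test would be left missing, which is consistent with the error shown in the question.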

You need to find some other way of solving the problem. One solution is to use the mappers provided by dplyr:

dat %>%
  summarise_all(sum, na.rm = TRUE) %>%
  rename_all(paste0, "_new_name")

You could equivalently use a combination of map() and set_names() from purrr.

dat %>%
  map_df(sum, na.rm = TRUE) %>%
  set_names(paste0, "_new_name")
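
For readers on dplyr 1.0.0 or later, across() is another way to express the same summary and renaming in one step. This is a sketch under that version assumption, using dat from the question; it is not part of the original answer.

```r
library(tidyverse)

dat <- tibble(
  variable_1 = c(1:5, NA, NA, NA, NA, NA),
  variable_2 = c(NA, NA, NA, NA, NA, 11:15)
)

# across() applies the summary to every column; the .names glue
# specification appends the suffix to each original column name
summary_tbl <- dat %>%
  summarise(
    across(everything(), ~ sum(.x, na.rm = TRUE), .names = "{.col}_new_name")
  )
```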
Lionel Henry