Okay, this feels like it should be relatively easy, but although I have tried literally dozens of approaches with quote, eval, substitute, enquote, parse, summarize_ etc., I haven't gotten it to work. Basically I am trying to calculate something like this, but with a variable expression for the summarise argument:

mtcars %>% group_by(cyl) %>% summarise(wt=mean(wt),hp=mean(hp))

yielding:

# A tibble: 3 × 3
    cyl       wt        hp   
    <dbl>    <dbl>     <dbl> 
1     4 2.285727  82.63636 
2     6 3.117143 122.28571 
3     8 3.999214 209.21429

One of the things I tried was:

  x2 <- "wt=mean(wt),hp=mean(hp)"
  mtcars %>% group_by(cyl) %>% summarise(eval(parse(text=x2)))

yielding:

Error in eval(substitute(expr), envir, enclos) : 
  <text>:1:12: unexpected ','
1: wt=mean(wt),

But leaving out the second argument (",hp=mean(hp)") gets you no further:

> x2 <- "wt=mean(wt)"
> mtcars %>% group_by(cyl) %>% summarise(eval(parse(text=x2)))
Error in eval(substitute(expr), envir, enclos) : object 'wt' not found

I will spare you all the other things I tried - I am clearly missing something about how expressions get handled in function arguments.

So what is the proper approach here? Keeping in mind I really want something like this in the end:

getdf <- function(df,sumarg){
  df %>% group_by(cyl) %>% summarise(sumarg)
  df
}

Also not sure what kind of tag I should use for this kind of query in the R world. Metaprogramming?

Axeman
Mike Wise
  • You probably need to use the standard evaluation version `summarise_` – talat Mar 17 '17 at 10:17
  • One option is using a named vector (or list) like so: `x2 <- c(wt= "mean(wt)", hp = "mean(hp)")` and then `mtcars %>% group_by(cyl) %>% summarise_(.dots = x2)` – talat Mar 17 '17 at 10:23
  • I tried things with `summarise_` too. This should be easy, right? – Mike Wise Mar 17 '17 at 10:26
  • You aren't specifying what part of the expression needs to be dynamic. Did you read `vignette("nse")`? What values do you want `sumarg` to take? – Axeman Mar 17 '17 at 10:37
  • Ah, `nse` was the term I was looking for. Ideally sumarg would be something like "wt=mean(wt),hp=mean(hp)". – Mike Wise Mar 17 '17 at 10:43

1 Answer

For maximum flexibility I would use a ... argument, capture those dots using lazyeval, and then pass them to summarise_:

getdf <- function(df, ...){ 
    df %>% group_by(cyl) %>% summarise_(.dots = lazyeval::lazy_dots(...)) 
}

Then you can directly do:

getdf(mtcars, wt = mean(wt), hp = mean(hp))
# A tibble: 3 × 3
    cyl       wt        hp
  <dbl>    <dbl>     <dbl>
1     4 2.285727  82.63636
2     6 3.117143 122.28571
3     8 3.999214 209.21429
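
As a side note, `summarise_` was deprecated in dplyr 0.7.0 in favour of tidy evaluation. A minimal sketch for those later versions, assuming dplyr 0.7 or newer is installed, simply forwards the dots to `summarise`. The name `getdf_dots` is made up here for illustration:

```r
library(dplyr)

# Sketch assuming dplyr >= 0.7: forward the captured dots straight to summarise()
getdf_dots <- function(df, ...) {
  df %>% group_by(cyl) %>% summarise(...)
}

# Same call style as getdf() above
res_dots <- getdf_dots(mtcars, wt = mean(wt), hp = mean(hp))
res_dots
```

This should give the same per-cylinder means as the lazyeval version, since the expressions in `...` are still evaluated inside each group.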

One way to do it without ... is to pass the arguments in a list, although you will need to use formulas or quoting. E.g.:

getdf2 <- function(df, args){ 
    dots <- lazyeval::as.lazy_dots(args)
    df %>% group_by(cyl) %>% summarise_(.dots = dots) 
}

And use as:

getdf2(mtcars, list(wt = ~mean(wt), hp = ~mean(hp)))

or

getdf2(mtcars, list(wt = "mean(wt)", hp = "mean(hp)"))
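
For a single string such as "wt=mean(wt),hp=mean(hp)", one possible approach is to wrap it in `list(...)` so that it parses as one call, then splice the named arguments into `summarise`. This is only a sketch, assuming dplyr 0.7 or newer and the rlang package; `getdf_str` is a made-up name, and parsing strings like this should only be done with trusted input, since the parsed code gets evaluated:

```r
library(dplyr)

# Hypothetical helper: accepts a string like "wt=mean(wt),hp=mean(hp)"
getdf_str <- function(df, sumarg) {
  # Wrap in list(...) so the comma-separated string parses as a single call
  call <- rlang::parse_expr(paste0("list(", sumarg, ")"))
  # Drop the leading `list` symbol; what remains is a named list of expressions
  args <- as.list(call)[-1]
  # Splice the expressions into summarise() as if they had been typed directly
  df %>% group_by(cyl) %>% summarise(!!!args)
}

res_str <- getdf_str(mtcars, "wt=mean(wt),hp=mean(hp)")
res_str
```

The wrapping step is what sidesteps the "unexpected ','" parse error from the question: a bare comma-separated string is not a valid R expression, but the same text inside `list(...)` is.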
Axeman
  • Yep, that does it pretty well. Thanks for that. Interesting you beat dd to the punch - pretty impressive. – Mike Wise Mar 17 '17 at 10:46
  • Would be curious to know if you could do it with a string? That has some advantages. dots can only be used for one purpose, but you can have a lot of strings. – Mike Wise Mar 17 '17 at 11:09