
I would like to be able to use dplyr's enquo within an lapply call while iterating over Spark table columns.

lapply(tbl_vars(sprkTbl),
       function(col_nme) {
           print(col_nme)
           # Capture the column name with enquo()
           quo_col_nme <- enquo(col_nme)
           print(quo_col_nme)

           sprkTbl %>%
               select(!!quo_col_nme) %>% 
               # do stuff
               collect -> dta_res
       }) -> l_res

However, when I try to run this code I keep getting this error:

Error in (function (x, strict = TRUE) : the argument has already been evaluated

I've isolated the error to enquo:

>> lapply(tbl_vars(sprkTbl),
...        function(col_nme) {
...            print(col_nme)
...            # Capture the column name with enquo()
...            quo_col_nme <- enquo(col_nme)
...            # print(quo_col_nme)
...            
...            # sprkTbl%>%
...            #     select(!!quo_col_nme) %>% 
...            #     # do stuff
...            #     collect -> dta_res
...        }) -> l_res
[1] "first_column_in_spark"

(and then the same error)

Error in (function (x, strict = TRUE) : the argument has already been evaluated

I want to understand why enquo can't be used like that. tbl_vars returns an ordinary character vector, so shouldn't col_nme be a string? I would expect the syntax to work in the same manner as in:

mtcars %>% select(!!enquote("am")) %>% head(2)
              am
Mazda RX4      1
Mazda RX4 Wag  1

but clearly this is not the case when it is called from within lapply.


Edit

Leaving the sparklyr aspect aside, here is a better and more reproducible example:

lapply(names(mtcars),function(x) {
    col_enq <- enquo(x)
    mtcars %>% 
        select(!!col_enq) %>% 
        head(2)
})

This produces the identical error.

Desired results

The old _-based syntax works:

lapply(names(mtcars),function(x) {
    # col_enq <- enquo(x)
    mtcars %>% 
        select_(x) %>% 
        head(2)
})

In short, I want to achieve the same functionality while iterating over Spark table columns, and I would prefer not to use the deprecated select_.

Konrad

1 Answer


Do I understand your question correctly that you are interested in this result? Or are you bound to use enquo instead of quo?

library(dplyr)

lapply(names(mtcars),function(x) {
  col_enq <- quo(x)
  mtcars %>% 
    select(!!col_enq) %>% 
    head(2)
})
#> [[1]]
#>               mpg
#> Mazda RX4      21
#> Mazda RX4 Wag  21
#> 
#> [[2]]
#>               cyl
#> Mazda RX4       6
#> Mazda RX4 Wag   6
#> 
#> [[3]]
#>               disp
#> Mazda RX4      160
#> Mazda RX4 Wag  160
#> 
#> [[4]]
#>                hp
#> Mazda RX4     110
#> Mazda RX4 Wag 110
#> 
#> [[5]]
#>               drat
#> Mazda RX4      3.9
#> Mazda RX4 Wag  3.9
#> 
#> [[6]]
#>                  wt
#> Mazda RX4     2.620
#> Mazda RX4 Wag 2.875
#> 
#> [[7]]
#>                qsec
#> Mazda RX4     16.46
#> Mazda RX4 Wag 17.02
#> 
#> [[8]]
#>               vs
#> Mazda RX4      0
#> Mazda RX4 Wag  0
#> 
#> [[9]]
#>               am
#> Mazda RX4      1
#> Mazda RX4 Wag  1
#> 
#> [[10]]
#>               gear
#> Mazda RX4        4
#> Mazda RX4 Wag    4
#> 
#> [[11]]
#>               carb
#> Mazda RX4        4
#> Mazda RX4 Wag    4
David
  • Correct, with the caveat that ultimately I want to apply this to a Spark table, but that is a side point now. So why `quo` and not `enquo`? – Konrad Aug 11 '17 at 13:31
  • Tbh, I am always confused about which one to use; I end up trying both and using the one that works... – David Aug 11 '17 at 13:44
  • Typing `rlang::quo` at the console confuses me even further... `function (expr) { enquo(expr) }` ... or https://github.com/tidyverse/rlang/blob/master/R/quo.R#L180 – Aurèle Aug 11 '17 at 13:55
  • @Konrad I think you use `quo()` if the variable passed to the function is quoted and `enquo()` if not. In your example, one can imagine that what lapply is doing is calling the function as `function("mpg")` rather than (no quotes) `function(mpg)`. If it was doing the latter, then you'd have to use `enquo()`. – hugot Aug 11 '17 at 13:56
  • @Aurèle `quo()` creates an expression in place while `enquo()` creates an expression that comes from one level up. This is why `quo()` is a simple wrapper around `enquo()`. @hugot If the variable is already quoted you shouldn't quote it again. You should just unquote it. But that isn't the case here: `x` will be a string, not a quoted expression. The `quo()` here is misleading and serves no useful purpose. It just delays the evaluation of `x` (a string). You should just unquote `x` without sticking it in a quosure; `select()` supports strings, e.g. `select(mtcars, "cyl")`. – Lionel Henry Nov 27 '17 at 07:13
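
To make the distinction drawn in the comments above concrete, here is a minimal sketch contrasting the two situations. It assumes a dplyr version with tidy evaluation (`!!`). The helper name `first_rows` and the result names `res_bare` and `l_res_str` are hypothetical and chosen only for illustration. The sketch uses `mtcars` rather than a Spark table, so behaviour against `sprkTbl` is an untested assumption.

```r
library(dplyr)

# Case 1: the caller types a bare column name (mpg, not "mpg").
# The function has to capture that unevaluated argument, which is
# the job of enquo(). `first_rows` is a hypothetical helper.
first_rows <- function(df, col) {
  col_enq <- enquo(col)
  df %>%
    select(!!col_enq) %>%
    head(2)
}
res_bare <- first_rows(mtcars, mpg)

# Case 2: lapply() passes each column name as an ordinary string.
# There is nothing to capture, so the string is unquoted directly,
# relying on select() accepting strings such as select(mtcars, "cyl").
l_res_str <- lapply(names(mtcars), function(x) {
  mtcars %>%
    select(!!x) %>%
    head(2)
})
```

In the second case no quosure is created at all, which follows the suggestion in the last comment to unquote `x` as it is.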