I would like to be able to use dplyr's enquo within an lapply call while iterating over Spark table columns.
lapply(tbl_vars(sprkTbl),
       function(col_nme) {
           print(col_nme)
           # Enquote column name
           quo_col_nme <- enquo(col_nme)
           print(quo_col_nme)
           sprkTbl %>%
               select(!!quo_col_nme) %>%
               # do stuff
               collect -> dta_res
       }) -> l_res
However, when I run this code I keep getting the following error:
Error in (function (x, strict = TRUE) : the argument has already been evaluated
I've isolated the error to the enquo call:
>> lapply(tbl_vars(sprkTbl),
... function(col_nme) {
... print(col_nme)
... # Enquote column name
... quo_col_nme <- enquo(col_nme)
... # print(quo_col_nme)
...
... # sprkTbl%>%
... # select(!!quo_col_nme) %>%
... # # do stuff
... # collect -> dta_res
... }) -> l_res
[1] "first_column_in_spark"
(and then the same error)
Error in (function (x, strict = TRUE) : the argument has already been evaluated
I want to understand why enquo can't be used like that. tbl_vars returns an ordinary character vector, so shouldn't col_nme be a string? I would expect the syntax to work in the same manner as in:
mtcars %>% select(!!enquote("am")) %>% head(2)
              am
Mazda RX4      1
Mazda RX4 Wag  1
but clearly this is not the case when it is called from within lapply.
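For contrast, here is a minimal sketch of the situation in which, as far as I understand it, enquo is meant to be used: inside a wrapper function whose caller passes a bare, unevaluated column name. The helper pick_col is a hypothetical name introduced only for this illustration; it is not part of dplyr.

```r
library(dplyr)

# Hypothetical helper (not from dplyr): the caller passes a bare column
# name, and enquo() captures that expression before it is evaluated.
pick_col <- function(df, col) {
  col_quo <- enquo(col)        # captures the expression `am`, not a value
  df %>% select(!!col_quo)     # unquote the captured expression in select()
}

res_enquo <- pick_col(mtcars, am) %>% head(2)
print(res_enquo)
```

In the lapply case the function receives a string value rather than a bare name typed by a caller, which may be why the two situations behave differently.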
Edit
Leaving the sparklyr aspect aside, here is a simpler and more reproducible example:
lapply(names(mtcars), function(x) {
    col_enq <- enquo(x)
    mtcars %>%
        select(!!col_enq) %>%
        head(2)
})
This produces an identical error.
Desired results
The old _-based syntax works:
lapply(names(mtcars), function(x) {
    # col_enq <- enquo(x)
    mtcars %>%
        select_(x) %>%
        head(2)
})
In short, I want to achieve the same functionality while iterating over Spark table columns, and I would prefer not to use the deprecated select_.
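To make the target concrete, here is a sketch of the kind of result I am after on the mtcars example, using rlang's sym() to turn each string into a symbol before unquoting it. This is one possible approach rather than a confirmed answer, and I am only assuming that the same pattern carries over to a Spark table with tbl_vars(sprkTbl) and collect(); I have not verified that.

```r
library(dplyr)

# Iterate over column names given as strings; rlang::sym() converts each
# string (e.g. "mpg") into a symbol that select() can unquote with !!.
l_res <- lapply(names(mtcars), function(x) {
  mtcars %>%
    select(!!rlang::sym(x)) %>%
    head(2)
})

# Label each list element with the column it holds.
names(l_res) <- names(mtcars)
print(l_res[["am"]])
```

Each element of l_res is then a two-row, one-column data frame, matching what the select_ version returns.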