I have opened a .parquet dataset with the open_dataset() function from the arrow package. I want to use across() to clean several numeric columns at once. However, when I run this code:

library(arrow)
library(dplyr)

start_numeric_cols <- "sum"
sales <- sales %>% mutate(
  # Non-numeric "sum*" columns: turn the string "NULL" into 0, then convert to numeric
  across(starts_with(start_numeric_cols) & !where(is.numeric),
         \(col) {replace(col, col == "NULL", 0) %>% as.numeric()}),
  # Numeric "sum*" columns: replace NA with 0
  across(starts_with(start_numeric_cols) & where(is.numeric),
         \(col) {replace(col, is.na(col), 0)})
)
#> Error in `across_setup()`:
#> ! Anonymous functions are not yet supported in Arrow

The error message is clear enough, but is there a way to do the same thing using only dplyr verbs inside across(), or some other workaround that avoids typing out each column name?