I would like to pass a vector of column names to a variety of step functions in the tidymodels recipes package. My first attempt was simply the following (prep and juice are used here only for illustration):

library(tidymodels)
library(modeldata)
data(biomass)

remove_vector <- c("oxygen","nitrogen")

test_recipe <- recipe(HHV ~ .,data = biomass) %>%
  step_rm(remove_vector)

test_recipe %>% 
  prep %>% 
  juice %>% 
  head

But this prints the following note:

Note: Using an external vector in selections is ambiguous.
i Use `all_of(remove_vector)` instead of `remove_vector` to silence this message.
i See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
This message is displayed once per session.

This, of course, concerns me (I want my code to run without warnings or errors), even though I still get the outcome I want.

However, when I follow the note's suggestion and use all_of:

test_recipe <- recipe(HHV ~ .,data = biomass) %>%
  step_rm(all_of(remove_vector))

test_recipe %>% 
  prep %>% 
  juice %>% 
  head

I get the error message:

Error: Not all functions are allowed in step function selectors (e.g. all_of). See ?selections.

In ?selections, I can't find any reference to this exact (seemingly simple) problem.

Any ideas? Many thanks!

UseR10085
Moritz Schwarz
  • This is not your fault because the error message is confusing, but recipes doesn't yet understand all tidyselect selectors. Looks like you [already found what is currently implemented](https://recipes.tidymodels.org/reference/selections.html), and we are working on making this cohesive. For sure this is not a great combination of error messages! – Julia Silge May 15 '20 at 03:57
  • Great, thanks for confirming Julia! Keep up the good work on tidymodels, looking forward to what's coming soon! :) – Moritz Schwarz May 15 '20 at 14:17

1 Answer

If you use quasiquotation, you won't get the warning:

library(tidymodels)
library(modeldata)
data(biomass)

remove_vector <- c("oxygen", "nitrogen")

test_recipe <- recipe(HHV ~ .,data = biomass) %>%
  step_rm(!!!syms(remove_vector))

test_recipe %>% 
  prep %>% 
  juice %>% 
  head

More on the warning: it exists because you might give your vector the same name as one of your columns. For example:

oxygen <- c("oxygen","nitrogen")

test_recipe <- recipe(HHV ~ .,data = biomass) %>%
  step_rm(oxygen)

This will remove only the oxygen column. However, if you use !!!syms(oxygen), both columns will be removed.
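
The same ambiguity can be reproduced outside recipes with plain dplyr::select(), which follows the tidyselect rule the note refers to. This is only a minimal sketch: the `toy` data frame is made up for illustration, and it uses dplyr rather than a recipe step, so it shows the rule behind the note and not the behaviour of step_rm itself.

```r
library(dplyr)

# Hypothetical toy data, standing in for the biomass columns
toy <- data.frame(oxygen = 1:2, nitrogen = 3:4, carbon = 5:6)

# An external vector that happens to share its name with a column
oxygen <- c("oxygen", "nitrogen")

# A bare name is looked up among the columns first, so the column wins
bare <- names(select(toy, oxygen))

# all_of() says explicitly: use the contents of the external vector
explicit <- names(select(toy, all_of(oxygen)))

print(bare)      # "oxygen"
print(explicit)  # "oxygen" "nitrogen"
```

The bare name silently changes meaning depending on which columns exist, which is why the selection is flagged as ambiguous.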

mihagazvoda
  • This works, thanks! Would you be able to comment on why this needs three `!!!`? Normally I would have guessed it would just be a single `!`. – Moritz Schwarz May 15 '20 at 14:18
  • I don't think you can use a single `!` for quasiquotation. It's the logical NOT operator. You can use bang bang (`!!`) to unquote one argument and `!!!` for multiple arguments. More on that in [Hadley's book](https://adv-r.hadley.nz/quasiquotation.html#unquoting). – mihagazvoda May 15 '20 at 20:11
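
To see the difference between `!!` and `!!!` without running a recipe, one option is to capture the call with rlang::expr() and print it. This is a sketch only: expr() just quotes the call, so step_rm is never actually run here, and the variable names are made up for illustration.

```r
library(rlang)

one_column <- "oxygen"
many_columns <- c("oxygen", "nitrogen")

# !! unquotes a single value: sym() turns the string into one bare name
single_call <- expr(step_rm(!!sym(one_column)))

# !!! splices a list: syms() gives one symbol per string, and each
# becomes its own argument in the call
spliced_call <- expr(step_rm(!!!syms(many_columns)))

print(single_call)   # step_rm(oxygen)
print(spliced_call)  # step_rm(oxygen, nitrogen)
```

In other words, `!!!syms(remove_vector)` builds the same call you would get by typing the column names by hand.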