How do I keep a character ID variable PERSON_ID unchanged in a recipe? I tried update_role(PERSON_ID, new_role = "id variable") and also tried excluding it from step_dummy with step_dummy(all_nominal_predictors(), -all_numeric_predictors(), -all_outcomes(), -has_role(match = "id variable")). It does not work: PERSON_ID is still converted to a factor. Any suggestions?

poshan
- Please consider providing a small reproducible example – akrun Mar 04 '22 at 15:30
- The conversion from character to factor already happens when you put your data into the recipe function, before you are adding any steps. Even if you step_mutate the variable to as.character, it will still be converted to a factor – Leonhard Geisler Mar 04 '22 at 22:31
- Thanks @LeonhardGeisler. I am trying to create a workflowset and the factor id variable is blowing up the memory. Any suggestion how to handle it? – poshan Mar 04 '22 at 22:44
1 Answer
This one is confusing. According to the recipes documentation, step_factor2string should convert factors to strings.
However, when you glimpse() the result of the prepped recipe, PERSON_ID is still reported as "fct". On the other hand, if you set strings_as_factors to FALSE in prep(), an error appears stating that PERSON_ID is not a factor:
library(tibble)
library(tidymodels)

# Small example data set with a character ID column
data_input <- tibble(
  target = rep(1, 9),
  num_var = rep(2, 9),
  char = c(rep("a", 6), rep("b", 3)),
  PERSON_ID = as.character(c(rep("W", 3), rep("D", 6))),
  logi = rep(c(TRUE, FALSE, FALSE), 3),
  fac = as.factor(c(rep("1", 6), rep("2", 3)))
)

# Mark PERSON_ID as an ID variable, dummy-encode the nominal predictors,
# then try to convert PERSON_ID back to a string
recipe_spec <- recipe(target ~ ., data = data_input) %>%
  update_role("PERSON_ID", new_role = "id variable") %>%
  step_dummy(all_nominal_predictors(), -all_numeric_predictors(), -all_outcomes(), -has_role(match = "id variable")) %>%
  step_factor2string(PERSON_ID)

# Default prep(): PERSON_ID still shows up as fct
recipe_spec %>% prep() %>% juice() %>% glimpse()

# strings_as_factors = FALSE: errors because PERSON_ID is not a factor
recipe_spec %>% prep(strings_as_factors = FALSE) %>% juice() %>% glimpse()
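
A possible workaround, offered as a minimal sketch and not verified against every recipes version: drop step_factor2string, keep strings_as_factors = FALSE in prep() so that no character column is converted automatically, and convert only the nominal predictors explicitly with step_string2factor before step_dummy. Since update_role takes PERSON_ID out of the predictors, all_nominal_predictors() should not select it. The name recipe_keep_id and the reduced data set are illustrative only, and the sketch assumes the installed recipes version accepts strings_as_factors in prep(), as in the code above.

```r
library(tibble)
library(recipes)

# Reduced, illustrative data set: one character predictor and a character ID
data_input <- tibble(
  target    = rep(1, 9),
  num_var   = rep(2, 9),
  char      = c(rep("a", 6), rep("b", 3)),
  PERSON_ID = c(rep("W", 3), rep("D", 6))
)

recipe_keep_id <- recipe(target ~ ., data = data_input) %>%
  # PERSON_ID gets a non-predictor role, so predictor selectors skip it
  update_role(PERSON_ID, new_role = "id variable") %>%
  # Explicitly convert the remaining character predictors to factors
  step_string2factor(all_nominal_predictors()) %>%
  # Dummy-encode those factors
  step_dummy(all_nominal_predictors())

# Assumption: strings_as_factors = FALSE switches off the automatic
# character-to-factor conversion for all columns, including PERSON_ID
baked <- recipe_keep_id %>%
  prep(strings_as_factors = FALSE) %>%
  bake(new_data = NULL)

# Inspect the resulting type of the ID column
class(baked$PERSON_ID)
```

If this holds on your setup, the same recipe could be passed into the workflow set, so the ID column would no longer be carried around as a factor.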

Leonhard Geisler