I was reading a tutorial for tidymodels and came across the following code block:
library(tidymodels)  # provides recipes, parsnip, etc.

lr_recipe <-
  recipe(children ~ ., data = hotel_other) %>%
  step_date(arrival_date) %>%                          # extract date features (day of week, month, year)
  step_holiday(arrival_date, holidays = holidays) %>%  # indicators for the `holidays` vector defined earlier in the tutorial
  step_rm(arrival_date) %>%                            # drop the raw date column
  step_dummy(all_nominal_predictors()) %>%             # indicator (dummy) columns for categorical predictors
  step_zv(all_predictors()) %>%                        # remove zero-variance predictors
  step_normalize(all_predictors())                     # center and scale all predictors
(Source of the code: https://www.tidymodels.org/start/case-study/#first-model)
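To see what the recipe actually produces, it can be estimated with prep() and applied with bake() (a sketch, assuming hotel_other and holidays from earlier in the tutorial are in scope):

# prep() estimates the steps from the training data;
# bake(new_data = NULL) returns the processed training set
lr_prepped <- prep(lr_recipe)
bake(lr_prepped, new_data = NULL) %>% dplyr::glimpse()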
Basically, the code defines a set of pre-processing operations on the predictors and stores them in a recipe object. My question arises from the following: first, step_dummy(all_nominal_predictors()) converts the categorical predictors into binary indicator columns (dummy encoding by default; full one-hot encoding would require one_hot = TRUE). Then, in a later step, step_normalize(all_predictors()) applies centering and scaling to all predictors (therefore also to the encoded categorical ones). I am used to training models directly on dummy/one-hot encoded categorical predictors, without passing them through a further normalizing step.
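For concreteness, here is a minimal example of the pattern I am describing (the toy data and column names are made up, not from the tutorial):

library(recipes)

# Hypothetical toy data (invented for illustration)
toy <- tibble::tibble(
  y    = c(1.2, 3.4, 2.1, 5.6, 4.3, 0.8),
  meal = factor(c("BB", "HB", "BB", "FB", "HB", "BB")),
  stay = c(1, 3, 2, 7, 4, 1)
)

recipe(y ~ ., data = toy) %>%
  step_dummy(all_nominal_predictors()) %>%  # meal -> meal_FB, meal_HB indicators
  step_normalize(all_predictors()) %>%      # centers/scales the indicators too
  prep() %>%
  bake(new_data = NULL)
# The indicator columns no longer contain 0/1 but two centered,
# scaled values (one negative, one positive).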
What is the advantage of normalizing dummy/one-hot encoded predictors? And how does it affect the interpretability of the model when making predictions?
Thanks for any clarification.