Problem:
I'm following a tutorial from Julia Silge (link here) on using tidymodels and recipes. I can get most of the way through without a problem, but when I come to calling the fit_resamples() function I get the error: Error: The first argument to [fit_resamples()] should be either a model or workflow.
I'm copying the code in the tutorial character for character, and everything runs fine up to and including printing out validation_splits. But as soon as I call fit_resamples() I get the error above (link to relevant part of tutorial). If useful, the output of rlang::last_error() is:
<error/rlang_error>
The first argument to [fit_resamples()] should be either a model or workflow.
Backtrace:
1. tune::fit_resamples(...)
2. tune:::fit_resamples.default(...)
Does anyone know what's going on here, and how I can resolve it? My understanding is that the first argument I pass to fit_resamples() is a model, i.e. children ~ ., and I've passed this same model to other functions earlier in the script without issue. See below for code (and data) that leads to the error on my machine, and my sessionInfo().
Reproducible example:
library(tidyverse)
## Bring in data
hotels <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-11/hotels.csv')
hotel_stays <- hotels %>%
  filter(is_canceled == 0) %>%
  mutate(children = case_when(children + babies > 0 ~ 'children',
                              TRUE ~ 'none'),
         required_car_parking_spaces = case_when(required_car_parking_spaces > 0 ~ 'parking',
                                                 TRUE ~ 'none')) %>%
  select(-is_canceled, -reservation_status, -babies)
hotels_df <- hotel_stays %>%
  select(children, hotel, arrival_date_month, meal, adr, adults,
         required_car_parking_spaces, total_of_special_requests,
         stays_in_week_nights, stays_in_weekend_nights) %>%
  mutate_if(is.character, factor)
## Build models
library(tidymodels)
set.seed(1234)
hotel_split <- initial_split(hotels_df)
hotel_train <- training(hotel_split)
hotel_test <- testing(hotel_split)
hotel_rec <- recipe(children ~ ., data = hotel_train) %>%
  step_downsample(children) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%
  step_zv(all_numeric()) %>%
  step_normalize(all_numeric()) %>%
  prep()
test_proc <- bake(hotel_rec, new_data = hotel_test)
knn_spec <- nearest_neighbor() %>%
  set_engine('kknn') %>%
  set_mode('classification')
knn_fit <- knn_spec %>%
  fit(children ~ .,
      data = juice(hotel_rec))
knn_fit
## Evaluate models
set.seed(1234)
validation_splits <- mc_cv(juice(hotel_rec), prop = 0.9, strata = children)
validation_splits
## This is where I get the error
knn_res <- fit_resamples(
  children ~ .,
  knn_spec,
  validation_splits,
  control = control_resamples(save_pred = TRUE)
)
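One possible reading of the error message: in recent releases of tune (0.1.0 and later, as far as I can tell; tune_0.1.5 is installed here), fit_resamples() seems to expect the model specification or workflow as its first argument, with the formula or recipe second. In R terms, children ~ . is a formula, while knn_spec is the model. The sketch below is not the tutorial's code. It uses the built-in iris data as a stand-in, and the names iris_folds, knn_spec_sketch and knn_res_sketch are made up for illustration. It only shows the argument order that the error message appears to ask for.

```r
library(tidymodels)

set.seed(1234)

# Stand-in resamples on a built-in dataset (assumption: iris replaces the hotel data)
iris_folds <- mc_cv(iris, prop = 0.9, times = 3, strata = Species)

# Same kind of model specification as knn_spec in the script above
knn_spec_sketch <- nearest_neighbor() %>%
  set_engine('kknn') %>%
  set_mode('classification')

# Model specification first, formula (the preprocessor) second, resamples third
knn_res_sketch <- fit_resamples(
  knn_spec_sketch,
  Species ~ .,
  iris_folds,
  control = control_resamples(save_pred = TRUE)
)

collect_metrics(knn_res_sketch)
```

If that reading is right, the equivalent change in the script above would be to swap the first two arguments of the failing call, i.e. pass knn_spec before children ~ ..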
My sessionInfo():
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GGally_2.1.2.9000 skimr_2.1.3 silgelib_0.1.1 forcats_0.5.1
[5] stringr_1.4.0 readr_1.4.0 tidyverse_1.3.1 knitr_1.33
[9] yardstick_0.0.8 workflowsets_0.0.2 workflows_0.2.2 tune_0.1.5
[13] tidyr_1.1.3 tibble_3.1.2 rsample_0.1.0 recipes_0.1.16
[17] purrr_0.3.4 parsnip_0.1.6 modeldata_0.1.0 infer_0.5.4
[21] ggplot2_3.3.5 dplyr_1.0.7 dials_0.0.9 scales_1.1.1
[25] broom_0.7.6 tidymodels_0.1.3
loaded via a namespace (and not attached):
[1] colorspace_2.0-1 ellipsis_0.3.2 class_7.3-19 base64enc_0.1-3
[5] fs_1.5.0 rstudioapi_0.13 listenv_0.8.0 furrr_0.2.3
[9] farver_2.1.0 prodlim_2019.11.13 fansi_0.5.0 lubridate_1.7.10
[13] xml2_1.3.2 codetools_0.2-18 splines_4.1.0 jsonlite_1.7.2
[17] pROC_1.17.0.1 dbplyr_2.1.1 shiny_1.6.0 compiler_4.1.0
[21] httr_1.4.2 backports_1.2.1 assertthat_0.2.1 Matrix_1.3-3
[25] fastmap_1.1.0 cli_2.5.0 later_1.2.0 htmltools_0.5.1.1
[29] prettyunits_1.1.1 tools_4.1.0 igraph_1.2.6 gtable_0.3.0
[33] glue_1.4.2 Rcpp_1.0.6 cellranger_1.1.0 DiceDesign_1.9
[37] vctrs_0.3.8 iterators_1.0.13 timeDate_3043.102 gower_0.2.2
[41] xfun_0.23 globals_0.14.0 rvest_1.0.0 mime_0.10
[45] lifecycle_1.0.0 kknn_1.3.1 future_1.21.0 MASS_7.3-54
[49] ipred_0.9-11 hms_1.1.0 promises_1.2.0.1 parallel_4.1.0
[53] RColorBrewer_1.1-2 yaml_2.2.1 curl_4.3.1 rpart_4.1-15
[57] reshape_0.8.8 stringi_1.6.2 foreach_1.5.1 lhs_1.1.1
[61] lava_1.6.9 repr_1.1.3 rlang_0.4.11 pkgconfig_2.0.3
[65] evaluate_0.14 lattice_0.20-44 htmlwidgets_1.5.3 labeling_0.4.2
[69] tidyselect_1.1.1 parallelly_1.26.0 plyr_1.8.6 magrittr_2.0.1
[73] R6_2.5.0 generics_0.1.0 DBI_1.1.1 pillar_1.6.1
[77] haven_2.4.1 withr_2.4.2 survival_3.2-11 nnet_7.3-16
[81] modelr_0.1.8 crayon_1.4.1 utf8_1.2.1 rmarkdown_2.8
[85] progress_1.2.2 grid_4.1.0 readxl_1.3.1 reprex_2.0.0
[89] digest_0.6.27 xtable_1.8-4 httpuv_1.6.1 GPfit_1.0-8
[93] munsell_0.5.0