Problem:

I'm following a tutorial from Julia Silge (link here) on using tidymodels and recipes. I can get most of the way through without a problem but when I come to calling the fit_resamples() function I get the error: Error: The first argument to [fit_resamples()] should be either a model or workflow.

I'm copying the code in the tutorial character for character, and everything runs fine up to and including printing out validation_splits. But as soon as I call fit_resamples() I get the error above (link to relevant part of tutorial). If useful, the output of rlang::last_error() is:

<error/rlang_error>

The first argument to [fit_resamples()] should be either a model or workflow.
Backtrace:
 
     1. tune::fit_resamples(...)
     2. tune:::fit_resamples.default(...)

Does anyone know what's going on here, and how I can resolve it? My understanding is that the first argument I pass to fit_resamples() is a model, i.e. children ~ ., and I've passed this same model to other functions earlier in the script without issue. See below for code (and data) that leads to the error on my machine, and my sessionInfo().

Reproducible example:

library(tidyverse)

## Bring in data
hotels <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-11/hotels.csv')

hotel_stays <- hotels %>% 
  filter(is_canceled == 0) %>% 
  mutate(children = case_when(children + babies > 0 ~ 'children',
                              TRUE ~ 'none'),
         required_car_parking_spaces = case_when(required_car_parking_spaces > 0 ~ 'parking', 
                                                 TRUE ~ 'none')) %>% 
  select(-is_canceled, -reservation_status, -babies)

hotels_df <- hotel_stays %>% 
  select(children, hotel, arrival_date_month, meal, adr, adults, 
         required_car_parking_spaces, total_of_special_requests, 
         stays_in_week_nights, stays_in_weekend_nights) %>% 
  mutate_if(is.character, factor)

## Build models
library(tidymodels)

set.seed(1234)
hotel_split <- initial_split(hotels_df)
hotel_train <- training(hotel_split)
hotel_test <- testing(hotel_split)

hotel_rec <- recipe(children ~ ., data = hotel_train) %>% 
  step_downsample(children) %>% 
  step_dummy(all_nominal(), -all_outcomes()) %>% 
  step_zv(all_numeric()) %>% 
  step_normalize(all_numeric()) %>% 
  prep()

test_proc <- bake(hotel_rec, new_data = hotel_test)

knn_spec <- nearest_neighbor() %>% 
  set_engine('kknn') %>% 
  set_mode('classification')
knn_fit <- knn_spec %>% 
  fit(children ~ ., 
      data=juice(hotel_rec))
knn_fit

## Evaluate models
set.seed(1234)
validation_splits <- mc_cv(juice(hotel_rec), prop = 0.9, strata = children)
validation_splits

## This is where I get the error
knn_res <- fit_resamples(
  children ~ ., 
  knn_spec,
  validation_splits,
  control = control_resamples(save_pred = TRUE)
)

My sessionInfo():

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GGally_2.1.2.9000  skimr_2.1.3        silgelib_0.1.1     forcats_0.5.1     
 [5] stringr_1.4.0      readr_1.4.0        tidyverse_1.3.1    knitr_1.33        
 [9] yardstick_0.0.8    workflowsets_0.0.2 workflows_0.2.2    tune_0.1.5        
[13] tidyr_1.1.3        tibble_3.1.2       rsample_0.1.0      recipes_0.1.16    
[17] purrr_0.3.4        parsnip_0.1.6      modeldata_0.1.0    infer_0.5.4       
[21] ggplot2_3.3.5      dplyr_1.0.7        dials_0.0.9        scales_1.1.1      
[25] broom_0.7.6        tidymodels_0.1.3  

loaded via a namespace (and not attached):
 [1] colorspace_2.0-1   ellipsis_0.3.2     class_7.3-19       base64enc_0.1-3   
 [5] fs_1.5.0           rstudioapi_0.13    listenv_0.8.0      furrr_0.2.3       
 [9] farver_2.1.0       prodlim_2019.11.13 fansi_0.5.0        lubridate_1.7.10  
[13] xml2_1.3.2         codetools_0.2-18   splines_4.1.0      jsonlite_1.7.2    
[17] pROC_1.17.0.1      dbplyr_2.1.1       shiny_1.6.0        compiler_4.1.0    
[21] httr_1.4.2         backports_1.2.1    assertthat_0.2.1   Matrix_1.3-3      
[25] fastmap_1.1.0      cli_2.5.0          later_1.2.0        htmltools_0.5.1.1 
[29] prettyunits_1.1.1  tools_4.1.0        igraph_1.2.6       gtable_0.3.0      
[33] glue_1.4.2         Rcpp_1.0.6         cellranger_1.1.0   DiceDesign_1.9    
[37] vctrs_0.3.8        iterators_1.0.13   timeDate_3043.102  gower_0.2.2       
[41] xfun_0.23          globals_0.14.0     rvest_1.0.0        mime_0.10         
[45] lifecycle_1.0.0    kknn_1.3.1         future_1.21.0      MASS_7.3-54       
[49] ipred_0.9-11       hms_1.1.0          promises_1.2.0.1   parallel_4.1.0    
[53] RColorBrewer_1.1-2 yaml_2.2.1         curl_4.3.1         rpart_4.1-15      
[57] reshape_0.8.8      stringi_1.6.2      foreach_1.5.1      lhs_1.1.1         
[61] lava_1.6.9         repr_1.1.3         rlang_0.4.11       pkgconfig_2.0.3   
[65] evaluate_0.14      lattice_0.20-44    htmlwidgets_1.5.3  labeling_0.4.2    
[69] tidyselect_1.1.1   parallelly_1.26.0  plyr_1.8.6         magrittr_2.0.1    
[73] R6_2.5.0           generics_0.1.0     DBI_1.1.1          pillar_1.6.1      
[77] haven_2.4.1        withr_2.4.2        survival_3.2-11    nnet_7.3-16       
[81] modelr_0.1.8       crayon_1.4.1       utf8_1.2.1         rmarkdown_2.8     
[85] progress_1.2.2     grid_4.1.0         readxl_1.3.1       reprex_2.0.0      
[89] digest_0.6.27      xtable_1.8-4       httpuv_1.6.1       GPfit_1.0-8       
[93] munsell_0.5.0 
C.Robin

1 Answer

The blog post you are looking at is fairly old, and there was a change to tune a while back so that you should now put either a workflow or a model first. Hence the error message:

The first argument to [fit_resamples()] should be either a model or workflow.
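
To see why the formula is rejected, it can help to check what kind of object each argument is. The backtrace in the question ends in tune:::fit_resamples.default(), which suggests the formula matched no more specific method. The snippet below is only a rough sketch: `f` and `spec` are illustrative names, and `spec` is a stand-in for the question's knn_spec (created without an engine so that nothing needs to be fitted).

```r
library(parsnip)

# The formula from the question: it describes outcome and predictors only
f <- children ~ .

# A model specification: it describes which kind of model to fit
spec <- nearest_neighbor(mode = "classification")

class(f)                      # "formula"
inherits(spec, "model_spec")  # TRUE
inherits(f, "model_spec")     # FALSE
```

In tidymodels terms, `children ~ .` is a preprocessor (it says which variables to use), while the object built with nearest_neighbor() is the model.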

The fix is to put your model or workflow as the first argument, like this:

knn_res <- fit_resamples(
  knn_spec,
  children ~ ., 
  validation_splits,
  control = control_resamples(save_pred = TRUE)
)
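
The workflow form mentioned above could look roughly like the following. This is a minimal sketch under some assumptions, not the tutorial's code: it uses the built-in iris data as a stand-in for juice(hotel_rec), sets times = 3 only to keep the run short, assumes the kknn package is installed, and the names `knn_wf` and `splits` are made up for illustration.

```r
library(tidymodels)  # the kknn package must also be installed for this engine

# Same kind of model specification as in the question
knn_spec <- nearest_neighbor() %>%
  set_engine("kknn") %>%
  set_mode("classification")

# Bundle the model and the formula into a single workflow object
knn_wf <- workflow() %>%
  add_model(knn_spec) %>%
  add_formula(Species ~ .)

# iris stands in for juice(hotel_rec); Species plays the role of children
set.seed(1234)
splits <- mc_cv(iris, prop = 0.9, times = 3, strata = Species)

# The workflow goes first, so no separate formula argument is needed
knn_res <- fit_resamples(
  knn_wf,
  resamples = splits,
  control = control_resamples(save_pred = TRUE)
)

collect_metrics(knn_res)
```

Either form satisfies the requirement in the error message, because the first argument is a model or a workflow rather than a formula.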
Julia Silge
  • Ok fantastic. Thanks! I didn't fully understand what a model or workflow was -- was under the impression that `children ~ .` was in fact a model of regressing children on all the covariates. Really appreciate your help (and thanks again for a great tutorial) – C.Robin Jun 26 '21 at 19:49
  • Ah, you might find it helpful to walk through the articles [here](https://www.tidymodels.org/start/) to clarify what the components of tidymodels are. – Julia Silge Jun 26 '21 at 21:35