I'm taking a first look at tidymodels. My alternative for the current project would be calling ranger directly, without the tidy wrappers. On a test run, fitting a classification random forest through tidymodels with the ranger engine is approximately ten times slower than calling ranger directly, using 600 rows resampled from the classic iris dataset. Why is that?

library(tidymodels)
library(ranger)

# Make example data
data("iris")
mydata <- iris[sample(seq_len(nrow(iris)), 600, replace = TRUE), ]

# Recipe 
myrecipe <- mydata %>% recipe( Species ~ . )

# Setting a Ranger RF model
myRF <- rand_forest( trees = 300, mtry = 3, min_n = 1) %>% 
  set_mode("classification") %>% 
  set_engine("ranger")

# Setting a workflow
myworkflow <- workflow() %>% 
  add_model(myRF) %>% 
  add_recipe(myrecipe)

# Compare base ranger and tidy setup

time <- Sys.time()
fit_ranger <- ranger( Species ~ . , data = mydata, probability = TRUE,
                     mtry = 3, num.trees = 300, min.node.size = 1)
# units must be passed by name; the third positional argument of difftime() is tz
ranger_time <- difftime( Sys.time(), time, units = "secs")


time <- Sys.time()
fit_tidy <- myworkflow %>% 
  fit(data= mydata)
# units must be passed by name; the third positional argument of difftime() is tz
tidy_time <- difftime( Sys.time(), time, units = "secs")

tidy_time
ranger_time
lambdatau
  • If you run `profvis::profvis()` on both, you get a pretty good idea. I have not used `tidymodels`, but it looks like it does a lot of checks and set-up. The time spent in `ranger::ranger()` itself is approximately the same. I am guessing that the difference would become smaller in relative terms as the data gets larger (but `ranger::ranger()` will probably always be faster) – Andrew Oct 20 '20 at 09:37
  • That's a nice tool, thank you. If the losses are comparatively manageable at larger data sizes, tidymodels might be worth it for the sheer sleekness and elegance of it. Will look at that. – lambdatau Oct 20 '20 at 09:50
  • We've done benchmarks as well, and the overhead of the tidymodels functions does not scale with the size of the data. With much larger datasets, the overhead cost is not much larger than what you see here. Feel free to test it out yourself! – Julia Silge Oct 20 '20 at 19:51
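
One way to test the claim in the comments that the overhead stays roughly constant is to repeat the comparison at several data sizes. The following is only a rough sketch, not a rigorous benchmark: `time_both()` is a hypothetical helper, the sizes are arbitrary, and `num.threads = 1` is set on the direct call on the assumption that the parsnip ranger engine also runs single-threaded by default (worth checking for your installed version). Single `system.time()` runs are noisy, so repeat them before drawing conclusions.

```r
library(tidymodels)
library(ranger)

set.seed(1)

# Hypothetical helper: fit the same forest both ways on n rows resampled from iris
time_both <- function(n) {
  dat <- iris[sample(nrow(iris), n, replace = TRUE), ]

  # Direct ranger call, restricted to one thread (assumption: this matches
  # what the tidymodels fit uses by default)
  t_ranger <- system.time(
    ranger(Species ~ ., data = dat, probability = TRUE,
           mtry = 3, num.trees = 300, min.node.size = 1,
           num.threads = 1)
  )[["elapsed"]]

  # Same model specified through a workflow; only fit() is timed,
  # as in the question
  wf <- workflow() %>%
    add_recipe(recipe(Species ~ ., data = dat)) %>%
    add_model(
      rand_forest(trees = 300, mtry = 3, min_n = 1) %>%
        set_mode("classification") %>%
        set_engine("ranger")
    )
  t_tidy <- system.time(fit(wf, data = dat))[["elapsed"]]

  data.frame(n = n, ranger = t_ranger, tidy = t_tidy,
             overhead = t_tidy - t_ranger)
}

# One row per data size; compare the overhead column across sizes
results <- do.call(rbind, lapply(c(600, 6000, 30000), time_both))
results
```

If the `overhead` column stays in the same range while `n` grows, the fixed set-up cost explanation holds; if it grows with `n`, something in the preprocessing is scaling with the data.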

0 Answers