I am fitting a model with a classif.rpart learner, embedded in a nested resampling. When I look at the inner tuning results using mlr3tuning::extract_inner_tuning_results(bmr), the values for minbucket and minsplit are decimal numbers (for example, minbucket 0.13 or 2.81, minsplit 2.35 or 4.61). As I understand it, both parameters are numbers of observations, so I expected them to be integers. Is there an explanation for why these values are decimals? Thank you in advance!
Edit: I cannot post the original code I use, but this code shows the same behaviour, using a task from the mlr3 package.
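library(mlr3)
library(mlr3learners)     # provides classif.ranger
library(mlr3pipelines)    # provides %>>%
library(mlr3tuning)
library(mlr3tuningspaces) # provides lts()
library(progressr)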
# choose task
sonar <- tsk("sonar")
# choose learners
l_rpart <- lrn("classif.rpart")
l_ranger <- lrn("classif.ranger")
# add search spaces to learners
l_rpart$param_set$values <- lts("classif.rpart.default")$values
l_ranger$param_set$values <- lts("classif.ranger.default")$values
# add fallback learners
l_rpart$fallback = lrn("classif.featureless")
l_ranger$fallback = lrn("classif.featureless")
# robustify
rpart_graph <- mlr3pipelines::pipeline_robustify(task = sonar, learner = l_rpart) %>>% mlr3pipelines::po("learner", l_rpart)
rpart_learner <- mlr3::as_learner(rpart_graph)
ranger_graph <- mlr3pipelines::pipeline_robustify(task = sonar, learner = l_ranger) %>>% mlr3pipelines::po("learner", l_ranger)
ranger_learner <- mlr3::as_learner(ranger_graph)
# create autotuners
at_rpart <- mlr3tuning::auto_tuner(
  method = mlr3verse::tnr("random_search"),
  learner = rpart_learner,
  resampling = mlr3::rsmp("cv", folds = 4),
  measure = mlr3::msr("classif.acc", id = "acc"),
  term_time = 1 * 60,
  term_evals = 4)
at_ranger <- mlr3tuning::auto_tuner(
  method = mlr3verse::tnr("random_search"),
  learner = ranger_learner,
  resampling = mlr3::rsmp("cv", folds = 4),
  measure = mlr3::msr("classif.acc", id = "acc"),
  term_time = 1 * 60,
  term_evals = 4)
# create the benchmark design
design <- benchmark_grid(
  tasks = sonar,
  learners = list(at_rpart, at_ranger),
  resamplings = mlr3::rsmp("cv", folds = 3))
# run the benchmark experiment
bmr <- with_progress(benchmark(design, store_models = TRUE))
# show inner tuning results
mlr3tuning::extract_inner_tuning_results(bmr)
The beginning of the output looks like this. Note that classif.rpart.minsplit and classif.rpart.minbucket are decimals rather than the integers I would expect:
mlr3tuning::extract_inner_tuning_results(bmr)
   experiment iteration classif.rpart.minsplit classif.rpart.minbucket classif.rpart.cp classif.ranger.mtry.ratio classif.ranger.replace
1:          1         1               2.834898               2.9295168        -9.089721                        NA                     NA
2:          1         2               4.515618               0.5116199        -3.805193                        NA                     NA
3:          1         3               3.484092               2.6164599        -3.131506                        NA                     NA
4:          2         1                     NA                      NA               NA                 0.2700584                  FALSE
5:          2         2                     NA                      NA               NA                 0.1032228                   TRUE
6:          2         3                     NA                      NA               NA                 0.3427129                  FALSE
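One possible explanation that I have not confirmed: the reported values might be on a log scale. That would be the case if lts("classif.rpart.default") defines these parameters with a log transformation, which is an assumption on my part. The base R sketch below only applies exp() to the rpart values from the output above to check whether the results look like plausible observation counts; the variable names are made up for illustration.

```r
# Values copied from the rpart rows of the output above.
minsplit_reported  <- c(2.834898, 4.515618, 3.484092)
minbucket_reported <- c(2.9295168, 0.5116199, 2.6164599)
cp_reported        <- c(-9.089721, -3.805193, -3.131506)

# Assumption: the tuner samples on a log scale, so exp() would undo it.
back_transformed <- data.frame(
  minsplit  = exp(minsplit_reported),
  minbucket = exp(minbucket_reported),
  cp        = exp(cp_reported)
)
print(back_transformed)

# rpart needs whole numbers for minsplit and minbucket; rounding here is
# only for illustration and may differ from what mlr3 does internally.
print(round(back_transformed[, c("minsplit", "minbucket")]))
```

If the assumption holds, the back-transformed minsplit and minbucket values are all at least 1 and cp lies between 0 and 1, which would fit the meaning of these parameters. Printing lts("classif.rpart.default") should show whether a log transformation is actually attached.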
Thank you again for looking into it.