I have been trying to use mlr3 to do some hyperparameter tuning for xgboost. I want to compare three different models:
- xgboost tuned over just the alpha hyperparameter
- xgboost tuned over alpha and lambda hyperparameters
- xgboost tuned over the alpha, lambda, and max_depth hyperparameters.
After reading the mlr3 book, I thought that using AutoTuner for the nested resampling and benchmarking would be the best approach. Here is what I have tried:
library(mlr3)
library(mlr3learners)
library(mlr3tuning)
library(paradox)

task_mpcr <- TaskRegr$new(id = "mpcr", backend = data.numeric, target = "n_reads")
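# "poisson_loss" is not a built-in mlr3 measure -- I registered my own custom
# measure first, roughly like this (a sketch; the loss formula is simplified):
MeasurePoissonLoss <- R6::R6Class("MeasurePoissonLoss",
  inherit = mlr3::MeasureRegr,
  public = list(
    initialize = function() {
      super$initialize(
        id = "poisson_loss",
        range = c(-Inf, Inf),
        minimize = TRUE,
        predict_type = "response"
      )
    }
  ),
  private = list(
    .score = function(prediction, ...) {
      # mean Poisson negative log-likelihood, dropping terms constant in the prediction
      mean(prediction$response - prediction$truth * log(prediction$response))
    }
  )
)
mlr3::mlr_measures$add("poisson_loss", MeasurePoissonLoss)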
measure <- msr("poisson_loss")
xgb_learn <- lrn("regr.xgboost")
set.seed(103)
fivefold.cv <- rsmp("cv", folds = 5)
param.list <- list(
  alpha = p_dbl(lower = 0.001, upper = 100, logscale = TRUE),
  lambda = p_dbl(lower = 0.001, upper = 100, logscale = TRUE),
  max_depth = p_int(lower = 2, upper = 10)
)
model.list <- list()
for (model.i in seq_along(param.list)) {
  # model 1: alpha only; model 2: alpha + lambda; model 3: alpha + lambda + max_depth
  param.list.subset <- param.list[1:model.i]
  search_space <- do.call(ps, param.list.subset)

  model.list[[model.i]] <- AutoTuner$new(
    learner = xgb_learn,
    resampling = fivefold.cv,
    measure = measure,
    search_space = search_space,
    terminator = trm("none"),
    tuner = tnr("grid_search", resolution = 10),
    store_tuning_instance = TRUE
  )

  # give each AutoTuner a distinct id so the three models can be told apart in the results
  model.list[[model.i]]$id <- paste0("xgboost_", paste(names(param.list.subset), collapse = "_"))
}
grid <- benchmark_grid(
  tasks = task_mpcr,
  learners = model.list,
  resamplings = rsmp("cv", folds = 3)
)
bmr <- benchmark(grid, store_models = TRUE)
Note that I added Poisson loss as a custom measure because I am working with count data. For some reason, after running benchmark(), the Poisson loss of all three models is nearly identical in every fold, which makes me think that no tuning was done.
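For reference, this is roughly how I am looking at the per-fold scores (I may simply be misreading this output):

# per-fold scores for each outer resampling iteration
bmr$score(measure)[, .(learner_id, iteration, poisson_loss)]
# aggregated score per learner
bmr$aggregate(measure)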
I also cannot find a way to access the hyperparameter values that gave the lowest loss in each outer train/test iteration (my best attempt is sketched below). Am I misusing the benchmark function entirely? Also, this is my first question on SO, so any formatting advice would be appreciated!
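This is the kind of thing I have been trying in order to pull out the selected hyperparameters; I am guessing at the accessors here, so it may be the wrong approach entirely:

# results for the first AutoTuner; the fitted models were kept via store_models = TRUE
rr <- bmr$resample_result(1)
# hoping for one row of tuned hyperparameter values per outer train/test split
lapply(rr$learners, function(at) at$tuning_result)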