
I am running a regression in mlr3 with lrn('regr.cv_glmnet'). I set up a benchmark grid to determine whether linear regression or a cross-validated lasso works better. With the default values of regr.cv_glmnet the lasso performs better, but I can't figure out how to get the lambda value that was selected.

lm_learner <- lrn('regr.lm')
lasso_learner <- lrn('regr.cv_glmnet')
lasso_learner$param_set$values <- list(alpha = 1, nfolds = 10)
lasso_gr <- po('encode') %>>% po('scale') %>>% po(lasso_learner)
lasso_glrn <- GraphLearner$new(lasso_gr)

benchmark_grid(tasks = task, learners = c(lm_learner, lasso_glrn), resamplings = resampling)

How do I get the lambda.min value?

Tan YX

1 Answer


If you use a GraphLearner you have to dig a little deeper to reach the model, and you need to call benchmark() with store_models = TRUE.

library(mlr3verse)

task = tsk("mtcars")
resampling = rsmp("cv", folds = 10)

lm_learner = lrn('regr.lm')
lasso_learner = lrn('regr.cv_glmnet')
lasso_learner$param_set$values = list(alpha = 1, nfolds = 10)
lasso_gr = po('encode') %>>% po('scale') %>>% po(lasso_learner)
lasso_glrn = GraphLearner$new(lasso_gr)

design = benchmark_grid(tasks = task, learners = c(lm_learner, lasso_glrn), resamplings = resampling)
# store_models = TRUE keeps the fitted models in the benchmark result
bmr = benchmark(design, store_models = TRUE)

# the second resample result belongs to the GraphLearner; take the first fold's cv.glmnet fit
model = bmr$resample_result(2)$learners[[1]]$model$regr.cv_glmnet$model

model$lambda.min

#> [1] 0.112483
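
If you want the lambda selected in every fold of the outer resampling rather than just the first one, a minimal sketch (reusing the bmr object above; the variable names rr_lasso and lambda_min are just illustrative) is to map over all stored learners:

# one cv.glmnet fit per outer fold, each with its own lambda.min
rr_lasso = bmr$resample_result(2)
lambda_min = sapply(rr_lasso$learners, function(l) l$model$regr.cv_glmnet$model$lambda.min)
lambda_min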
be-marc
  • Thanks for the help. So now that I have the 10 folds on the regr.cv_glmnet model, with 10 different lambda.min, how do I select the lambda value for my final model? – Tan YX Mar 09 '23 at 12:11
  • Using `"regr.cv_glmnet"` with `resample()` or `benchmark()` is nested resampling. Nested resampling is not used to select optimal hyperparameters. See the [book chapter](https://mlr3book.mlr-org.com/optimization.html#sec-nested-resampling). – be-marc Mar 09 '23 at 13:25
  • I don't understand the chapter. So how do we select the optimal hyperparameters if we want to use cross-validation to assess the performance of the model? – Tan YX Mar 09 '23 at 14:05
  • Then I can only recommend that you read the chapter again; understanding nested resampling is very important. Call `lasso_glrn$train(task)` to get the final model with lambda tuned by cross-validation, and run `resample(task, lasso_glrn, resampling)` to estimate performance with nested resampling (see the sketch below). `"regr.cv_glmnet"` is different from the other learners because it runs a CV internally. – be-marc Mar 09 '23 at 16:27
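
To make that last comment concrete, here is a minimal sketch reusing the task, resampling, and lasso_glrn objects defined above; the measure regr.mse and the variable name rr_outer are illustrative choices, not part of the original answer:

# final model: the internal CV of cv_glmnet selects lambda.min on the full task
lasso_glrn$train(task)
lasso_glrn$model$regr.cv_glmnet$model$lambda.min

# performance estimate via nested resampling (outer CV around the internal CV)
rr_outer = resample(task, lasso_glrn, resampling)
rr_outer$aggregate(msr("regr.mse"))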