
I have just started learning mlr3 and have read the mlr3 book (parameter optimization). In the book, they provide an example of nested resampling for hyperparameter tuning, but I do not know how to produce the final prediction, i.e. predict(model, test data). The following code sets up the learner, task, inner resampling (holdout), outer resampling (3-fold CV), and grid search for tuning. My questions are:

(1) Don't we need to train the optimized model, i.e. at in this case, like train(at, task)?

(2) After training, how do I predict on test data? I do not see any split into train and test data.

The code, taken from the mlr3 book (https://mlr3book.mlr-org.com/nested-resampling.html), is as follows:

library("mlr3tuning")
task = tsk("iris")
learner = lrn("classif.rpart")
resampling = rsmp("holdout")
measure = msr("classif.ce")
param_set = paradox::ParamSet$new(
  params = list(paradox::ParamDbl$new("cp", lower = 0.001, upper = 0.1)))
terminator = trm("evals", n_evals = 5)
tuner = tnr("grid_search", resolution = 10)

at = AutoTuner$new(learner, resampling, measure = measure,
  param_set, terminator, tuner = tuner)

rr = resample(task = task, learner = at, resampling = resampling_outer)
khan1
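
For reference, nested resampling does all of the splitting itself: in each of the 3 outer folds the AutoTuner is trained on that fold's training rows (tuning cp via the inner holdout) and evaluated on that fold's test rows. A minimal sketch of how the result rr can then be inspected, assuming a recent mlr3/mlr3tuning (extract_inner_tuning_results() is an mlr3tuning helper):

# unbiased performance estimate, aggregated over the 3 outer test sets
rr$aggregate(msr("classif.ce"))

# per-fold scores on the outer test sets
rr$score(msr("classif.ce"))

# cp value selected by the inner tuning in each outer fold
extract_inner_tuning_results(rr)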
    This is the common "CV vs train/predict" misunderstanding. I'll probably write a blog post soon about this to avoid repeating myself over and over in such questions :) – pat-s Dec 16 '20 at 06:17
  • @pat-s, I will wait for your post, but could you please respond to my question here briefly? I am confused at this point and unable to proceed. – khan1 Dec 16 '20 at 10:06
  • Do some research about what cross-validation is meant for and what it returns (and how it differs from a 'normal' train/predict). There is plenty of material out there :) Also search a bit on https://stats.stackexchange.com/. That's all I can say in two sentences for now. – pat-s Dec 16 '20 at 10:09
  • @pat-s, thank you for your suggestion. I have used these methods in R (not mlr3): first we divide the data into train and test sets, then perform CV on the train data to train the model, and finally predict on the test data. I am just confused about mlr3. In the mlr3 gallery German credit example, they divide the data into train and test sets, but when they use hyperparameter optimization in the same tutorial, they do not split the data into train/test, just like the code I provided above. – khan1 Dec 16 '20 at 10:29
  • The code you're using is splitting the data into different train and test sets multiple times, each time evaluating the performance of the model and making decisions with respect to the best hyperparameters based on that. You should see information on the observed performance while running your code. – Lars Kotthoff Dec 16 '20 at 16:45
  • @LarsKotthoff, Ok so no need to explicitly split the data into train and test sets? – khan1 Dec 16 '20 at 18:05
  • That's correct. Unless you want to evaluate in a very specific way that you define yourself. – Lars Kotthoff Dec 16 '20 at 19:29
  • @LarsKotthoff thanks again for your comments, but a confusion arises in my mind: would it be unbiased if we use train(full iris dataset) and predict(full iris dataset)? I am sorry, but I come from a caret background, so this feels strange to me. – khan1 Dec 16 '20 at 20:08
  • `mlr3` does the splitting into train and test based on the resampling method you provide. It doesn't train and test on the same data unless you tell it to. – Lars Kotthoff Dec 16 '20 at 23:10
  • @LarsKotthoff, OK, I got your point, thanks a lot. I read in the book that we can specify the train/test ratio in the resampling, i.e. rsmp("holdout", ratio = 0.8). (A manual train/test split is sketched after these comments.) – khan1 Dec 17 '20 at 22:04
  • data = read.csv("results.csv")
    task = TaskRegr$new("data", data, target = "Results")
    learner = lrn("regr.rpart")
    resampling = rsmp("holdout")
    measure = msr("regr.mae")
    search_space = ParamSet$new(
      params = list(ParamDbl$new("cp", lower = 0.001, upper = 0.1)))
    terminator = trm("evals", n_evals = 5)
    tuner = tnr("grid_search", resolution = 10) – khan1 Dec 18 '20 at 23:22
  • at = AutoTuner$new(
      learner = learner, resampling = resampling, measure = measure,
      search_space = search_space, terminator = terminator, tuner = tuner)
    resampling_outer = rsmp("cv", folds = 10)
    rr = resample(task = task, learner = at, resampling = resampling_outer) – khan1 Dec 18 '20 at 23:22
  • @LarsKotthoff, is the above code the correct way to optimize the parameters? – khan1 Dec 18 '20 at 23:23
  • Looks technically correct to me, but I would use random search and a much larger number of evaluations. – Lars Kotthoff Dec 19 '20 at 00:06
  • @LarsKotthoff, thanks for your feedback. I know that a larger number of evaluations would be better, but why not grid search? – khan1 Dec 19 '20 at 00:40
  • @LarsKotthoff, after we use benchmark(), how can we get the individual performance (for all folds of the CV) of the tuned and untuned learner?
    grid = benchmark_grid(
      task = task, learner = list(at, lrn("regr.rpart")),
      resampling = rsmp("cv", folds = 10))
    bmr = benchmark(grid) – khan1 Dec 19 '20 at 16:54
  • Grid search is usually inefficient. And see https://mlr3book.mlr-org.com/benchmarking.html – Lars Kotthoff Dec 19 '20 at 19:57
  • @LarsKotthoff, thanks a lot again. I have read the benchmarking chapter. I want to perform a Wilcoxon test on the performance estimates (i.e. MAE values) obtained with the tuned learner and with the untuned learner. How can I obtain the MAE values for all folds of the tuned and untuned learners? For instance, wilcox.test(MAE with tuned, MAE with untuned). – khan1 Dec 19 '20 at 20:34
  • See 2.6.4 in the benchmarking chapter. – Lars Kotthoff Dec 19 '20 at 21:16
  • Ok thank you @LarsKotthoff.. I am going to check this section. – khan1 Dec 19 '20 at 21:27
  • @LarsKotthoff, I did not find section 2.6.4 very helpful, but I used the following code to extract the MAE values of the tuned and untuned learners (the first three MAE values belong to one learner, the other three to the other) and it gives me the p-value. (A cleaner, name-based version is sketched after these comments.)
    resampRF = bmr$score(msr("regr.mae"))[1:3, 11]
    resampRPART = bmr$score(msr("regr.mae"))[4:6, 11]
    wilcox.test(resampRF$regr.mae, resampRPART$regr.mae, paired = TRUE) – khan1 Dec 19 '20 at 23:04
  • I used random search to tune the parameters of a random forest; how can I find the optimal value of the 'mtry' parameter?
    at = AutoTuner$new(
      learner = learner, resampling = resampling, measure = measure,
      search_space = search_space, terminator = terminator, tuner = tuner)
    resampling_outer = rsmp("cv", folds = 5)
    grid = benchmark_grid(task = task, learner = list(at, lrn("regr.rpart")), resampling_outer)
    bmr = benchmark(grid) – khan1 Dec 24 '20 at 20:08
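
On the splitting discussion above: resample() derives the train/test splits from the resampling object itself, but an explicit split can also be made by hand when a fixed hold-out set is wanted. A minimal sketch, assuming the iris task from the question; the object names train_ids/test_ids are illustrative:

library(mlr3)

task = tsk("iris")
learner = lrn("classif.rpart")

# manual 80/20 holdout split on the task's row ids
set.seed(1)
train_ids = sample(task$row_ids, size = floor(0.8 * task$nrow))
test_ids = setdiff(task$row_ids, train_ids)

learner$train(task, row_ids = train_ids)                # fit on the training rows only
prediction = learner$predict(task, row_ids = test_ids)  # predict on the held-out rows
prediction$score(msr("classif.ce"))

# the same idea expressed as a resampling object: rsmp("holdout", ratio = 0.8)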
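
For the per-fold comparison and the question about the selected hyperparameter values, a sketch that filters bmr$score() by learner id rather than by row position. It assumes the bmr object built in the comments above, that the AutoTuner keeps the usual ".tuned" suffix in its id, and that extract_inner_tuning_results() is available in the installed mlr3tuning:

# one row per outer CV fold and learner, with the regr.mae column added
scores = bmr$score(msr("regr.mae"))

# select the MAE values by learner id rather than by row position
# (the AutoTuner's id is the wrapped learner's id with a ".tuned" suffix)
mae_tuned   = scores[learner_id == "regr.rpart.tuned", regr.mae]
mae_untuned = scores[learner_id == "regr.rpart", regr.mae]

# a paired test is reasonable because benchmark_grid() evaluates both learners on the same folds
wilcox.test(mae_tuned, mae_untuned, paired = TRUE)

# hyperparameters (e.g. mtry) chosen by the inner tuning in each outer fold
extract_inner_tuning_results(bmr)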

1 Answer


See The "Cross-Validation - Train/Predict" misunderstanding.

pat-s
  • Thank you... I am still confused about the benchmark and resample functions you mention in the article: in which cases do we need resample(), and in which cases do we need the train() and predict() methods? – khan1 Jan 10 '21 at 14:38
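
On that last question: resample() and benchmark() estimate how well the whole "tune, then fit" procedure generalizes (that is what the nested resampling above reports), while train() and predict() produce the single final model that is actually used on new data. A minimal sketch of that final step, assuming the at AutoTuner and task from the question; iris[1:5, 1:4] merely stands in for genuinely new observations:

# tune cp on the full task via the inner holdout, then refit rpart on all rows
at$train(task)

# hyperparameter(s) selected for the final model
at$tuning_result

# predictions for new observations (a data.frame with the task's feature columns)
at$predict_newdata(iris[1:5, 1:4])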