
How can I determine which fold was ultimately used for testing and which for training in 5-fold cross-validation in the mlr package? The fields $resampling$train.inds and $resampling$test.inds return all 5 folds without indicating which one eventually served the training purpose and which the testing purpose.

library("mlr")

regr_task = makeRegrTask(data = mtcars, target = "hp")
learner = makeLearner("regr.ranger", 
                      importance = "impurity", 
                      num.threads = 3)
par_set = makeParamSet(
   makeIntegerParam("num.trees", lower = 100L, upper = 500L),
   makeIntegerParam("mtry", lower = 4L, upper = 8L)
)
rdesc = makeResampleDesc("CV", iters = 5, predict = "both")
meas = rmse
ctrl = makeTuneControlGrid()
set.seed(1)
tuned_model = tuneParams(learner = learner,
                         task = regr_task,
                         resampling = rdesc,
                         measures = list(meas, setAggregation(meas, train.mean)),
                         par.set = par_set,
                         control = ctrl,
                         show.info = FALSE)
tuned_model
model_rf = setHyperPars(learner = learner, par.vals = tuned_model$x)
set.seed(1)
model_rf = train(learner = model_rf, task = regr_task)
model_rf

tuned_model$resampling$train.inds
tuned_model$resampling$test.inds
lodomi
1 Answer


You're mixing things up here.

You are splitting your data into 5 folds. Each cross-validation iteration has its own training and test set, which is why you get back a list of 5 for both $resampling$train.inds and $resampling$test.inds. With 5 folds, every iteration trains on 4 partitions (80% of the data) and evaluates on the remaining partition (20% of the data).
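
As a quick check (this snippet is not part of the original answer; it only assumes the tuned_model object from the question), you can verify that within each fold the train and test indices are disjoint and together cover all 32 rows of mtcars:

train_inds = tuned_model$resampling$train.inds
test_inds = tuned_model$resampling$test.inds

# TRUE for every fold: no overlap, and the union covers all rows of mtcars
sapply(seq_along(train_inds), function(i) {
  length(intersect(train_inds[[i]], test_inds[[i]])) == 0 &&
    setequal(union(train_inds[[i]], test_inds[[i]]), seq_len(nrow(mtcars)))
})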

The right wording would be: "Which indices were used in which fold for training and testing?" The code below answers this.

> tuned_model$resampling$train.inds
[[1]]
 [1] 10 32  6 15 20 28 26 12  8 24 31 27 22  2 13 29 17 11  1  3 16 18 21 19  9  5

[[2]]
 [1] 10  6 15 28 26 12 23 30  8 25 24  7 31 27 14  2 13 29 17  1 16  4 21 19  9

[[3]]
 [1] 10 32 20 26 12 23 30  8 25  7 27 22 14  2 13 29 17 11  1  3 16 18  4 19  5

[[4]]
 [1] 32  6 15 20 28 26 12 23 30 25 24  7 31 22 14 13 17 11  1  3 18  4 21 19  9  5

[[5]]
 [1] 10 32  6 15 20 28 23 30  8 25 24  7 31 27 22 14  2 29 11  3 16 18  4 21  9  5

> tuned_model$resampling$test.inds
[[1]]
[1]  4  7 14 23 25 30

[[2]]
[1]  3  5 11 18 20 22 32

[[3]]
[1]  6  9 15 21 24 28 31

[[4]]
[1]  2  8 10 16 27 29

[[5]]
[1]  1 12 13 17 19 26
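
If you want to map the indices of one fold back to the data, here is a minimal sketch (again assuming the tuned_model object from the question; fold 1 is picked purely for illustration, and note the caveat in the comments below about reusing specific observations):

fold = 1  # any of the 5 folds; chosen only as an example
train_idx = tuned_model$resampling$train.inds[[fold]]
test_idx = tuned_model$resampling$test.inds[[fold]]

train_data = mtcars[train_idx, ]  # rows used for fitting in this fold
test_data = mtcars[test_idx, ]    # rows used for evaluation in this fold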
pat-s
  • Yeah, but which observations were finally used to train and which to test the model? I would like to know exactly which observations were used for training and testing so that I can use them, for example, in another workflow with a different package. – lodomi Oct 09 '19 at 20:13
  • The ones shown in my answer? I do not understand what you mean by "finally used". I assume you might misunderstand something conceptually. Also you should not use specific observations for other workflows. – pat-s Oct 09 '19 at 21:28
  • I understand this in such a way that fold 1 is selected as the test set and folds 2-5 as training, and we get a test error of 1.52 and a training error of 0.52. Then fold 2 is the test set and folds 1 and 3-5 are training, and we get a test error of 0.82 and a training error of 0.32, and so on. In my thinking, we are looking for the test fold that returns the smallest possible error. So, for example, in the end we determine that fold number 3 is the test set (because it had the smallest error) and the other folds are training. Is this correct? – lodomi Oct 09 '19 at 22:17
  • No, this is incorrect. This is not the right place to discuss this; I'd suggest reading up on the theory of cross-validation again. – pat-s Oct 10 '19 at 06:03