
I get two different performance metrics when I run this code twice in a row, and I'm not sure why this is happening, since I'm using the same training and testing set. I'm also setting the seed at the beginning.

library(mlr)

set.seed(42)
data(BostonHousing, package = "mlbench")
regr.task = makeRegrTask(id = "bh", data = BostonHousing, target = "medv")

lrn = makeLearner("regr.ctree")

outer = makeResampleInstance(makeResampleDesc("Holdout"), task = regr.task)
r = resample(
  learner = lrn,
  task = regr.task,
  resampling = outer,
  show.info = TRUE
)

This is what I get running the code the first time:

Resampling: holdout
Measures:             mse       
[Resample] iter 1:    20.5713143


Aggregated Result: mse.test.mean=20.5713143

This is what I get running the code the second time:

Resampling: holdout
Measures:             mse       
[Resample] iter 1:    21.9437349


Aggregated Result: mse.test.mean=21.9437349
Ashti
  • I replicated your code and got the same results maybe 10 times. You need to call `set.seed(42)` each time you run the process; it sounds like you set the seed once and expect to get the same results in every trial. – maydin Aug 19 '19 at 13:54
  • If you do it for lrn = makeLearner("regr.bartMachine"), you still get different results even when set.seed(42) is run before the code is executed. – Ashti Aug 19 '19 at 14:43
  • Some machine learning models don't respect the random seed that was set, or require it to be set in a specific way. This might be an instance of that. – Lars Kotthoff Aug 19 '19 at 17:02

1 Answer


mlr does not make any changes to the global seed. Setting the seed works for most learners, including the one shown in your example ("regr.ctree").
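A minimal sketch of what that means in practice, using the learner and task from the question (the run_once wrapper is only an illustrative helper): calling set.seed() immediately before creating the resample instance should make each run identical, because mlr leaves the global seed alone.

library(mlr)
data(BostonHousing, package = "mlbench")
regr.task = makeRegrTask(id = "bh", data = BostonHousing, target = "medv")
lrn = makeLearner("regr.ctree")

run_once = function() {
  set.seed(42)  # re-seed on every run so the holdout split is identical
  outer = makeResampleInstance(makeResampleDesc("Holdout"), task = regr.task)
  resample(learner = lrn, task = regr.task, resampling = outer,
           show.info = FALSE)$aggr
}

run_once()  # both calls should report the same mse.test.mean
run_once()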

If it does not work in some cases, that is an issue with the underlying learner. In those cases you might want to read the learner's documentation to find out how to get reproducible results.
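For learners that manage their own random number generation, such as "regr.bartMachine" mentioned in the comments, one possible approach is to set the seed on the learner itself. This is a sketch only: it assumes mlr forwards a seed hyperparameter to the underlying bartMachine() call, and bartMachine documents that results are reproducible only when running on a single core.

library(mlr)
# Sketch: forward a seed to the underlying bartMachine() call.
# If "seed" is not listed in the learner's parameter set,
# configureMlr(on.par.without.desc = "warn") lets unknown parameters
# through with a warning instead of an error.
configureMlr(on.par.without.desc = "warn")
lrn.bart = makeLearner("regr.bartMachine", par.vals = list(seed = 42))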

pat-s