Is it possible to set a seed for h2o models via mlr? I could only find how to do it in h2o directly, e.g.

gbm_w_seed_2 <- h2o.gbm(x = predictors, y = response, training_frame = train,
                        validation_frame = valid, col_sample_rate = 0.7,
                        seed = 1234)
tover

1 Answer

Yes, these are exposed as learner parameters. For example:

lrn = makeLearner("classif.h2o.gbm", par.vals = list(seed = 123))
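
As a rough sketch of how the seeded learner might be used end to end, the snippet below trains it twice and compares the predictions. It assumes a local h2o cluster started with a single thread and uses mlr's bundled iris.task purely for illustration; neither comes from the question. The learner ID is written as classif.h2o.gbm, which is how mlr registers it as far as I know (listLearners() shows the exact name). Whether the two fits actually match depends on how h2o handles the seed internally.

```r
library(mlr)
library(h2o)

# Assumption: a local h2o cluster; nthreads = 1 keeps training single-threaded.
h2o.init(nthreads = 1)

# The seed is handed to h2o.gbm as an ordinary learner parameter.
lrn = makeLearner("classif.h2o.gbm", par.vals = list(seed = 123))

# iris.task ships with mlr and stands in for real data here.
mod1 = train(lrn, iris.task)
mod2 = train(lrn, iris.task)

# Compare the predicted classes of two fits that share the same seed.
p1 = getPredictionResponse(predict(mod1, task = iris.task))
p2 = getPredictionResponse(predict(mod2, task = iris.task))
identical(p1, p2)
```

Changing nthreads in h2o.init() and rerunning the comparison is a simple way to check whether the seed still gives repeatable results with more threads.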
Lars Kotthoff
  • Thanks! This works, but only for non-parallelized estimation. Do you happen to know a way to use a seed when using multiple threads? – tover Jul 11 '17 at 20:52
  • Not sure what you mean. Does h2o have a separate seed for parallel estimates? – Lars Kotthoff Jul 12 '17 at 00:04
  • I don't know, but I was hoping it would. Since I couldn't find one, I thought I'd just ask. – tover Jul 12 '17 at 07:28
  • AFAIK the seed should work in all cases. Otherwise it's probably an h2o bug? – Lars Kotthoff Jul 12 '17 at 17:37
  • Well, in the example I tested, it worked with h2o.init(nthreads=1) but not with h2o.init(nthreads=2). I also read that there is a seed (set.seed(123, "L'Ecuyer")) that works with parallelization in mlr (when no h2o is involved), but not on Windows systems. Maybe the same underlying problem applies here? (Found it here: https://github.com/mlr-org/mlr/issues/938) – tover Jul 12 '17 at 18:45
  • This is definitely an h2o issue and not an mlr issue -- mlr only passes the seed to the learner, it doesn't control what happens internally. – Lars Kotthoff Jul 12 '17 at 19:54
  • I didn't say it might be an mlr issue. But since mlr seeds don't work for parallel computation on Windows even when h2o isn't involved, maybe h2o without mlr implements seeding in a similar way and therefore runs into the same problem. – tover Jul 12 '17 at 21:31