Say I have a machine with 32 cores and want to execute a nested CV with 5 outer folds and 3 inner folds as efficiently as possible.
On the outer folds I benchmark two or more learners; on the inner folds I tune hyperparameters for one or more of those learners.
How should I set batch_size and future::plan()?
How does term_evals interact with batch_size?
Would the setup below be sensible? My hunch is that it is better to run the inner loop in parallel, but I am unsure about term_evals and batch_size.
# Wrap lrn1 in an AutoTuner for the inner 3-fold CV
at1 <- auto_tuner(
  tuner = tnr("random_search", batch_size = 10),
  learner = lrn1,
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.auc"),
  term_evals = 100,
  store_models = TRUE
)

design = benchmark_grid(
  tasks = task,
  learners = list(at1, lrn2),
  resamplings = rsmp("cv", folds = 5)
)
# Nested topology: first element = outer loop (sequential), second = inner tuning loop (multisession)
future::plan(list("sequential", "multisession"))
bmr = benchmark(design, store_models = TRUE)
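For completeness, this is the alternative I am weighing against the plan above: parallelizing the outer loop instead, or nesting both levels with explicit worker counts via future::tweak(). The worker counts here are just my back-of-envelope split for 32 cores (5 outer folds x 6 inner workers = 30 parallel jobs), not something I have verified to be optimal.

# Alternative 1: parallelize the outer loop, run tuning sequentially
# (only 5 of 32 cores would be busy at the outer level)
future::plan(list("multisession", "sequential"))

# Alternative 2: nest both levels with explicit worker counts
future::plan(list(
  future::tweak("multisession", workers = 5),  # outer: 5 folds
  future::tweak("multisession", workers = 6)   # inner: 6 workers per fold
))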