
Example code:

```r
library(mlr3verse)
library(paradox)
library(drake)

my_plan = drake::drake_plan(
  # learner
  learner_classif = lrn(
    "classif.ranger",
    predict_type = "prob"
  ),

  # task
  task = tsk("german_credit"),

  # set search_space
  ps_classif = ParamSet$new(list(
    ParamInt$new("num.trees", lower = 300, upper = 500),
    ParamDbl$new("sample.fraction", lower = 0.7, upper = 0.8)
  )),

  # auto tuning
  at = AutoTuner$new(
    learner = learner_classif,
    resampling = rsmp("cv", folds = 3),
    measure = msr("classif.auc"),
    search_space = ps_classif,
    terminator = trm("evals", n_evals = 1000),
    tuner = tnr("random_search")
  ),

  # resampling
  rr = resample(task, at, rsmp("cv", folds = 2))
)

make(my_plan)
```

I have a problem when tuning a model in mlr3. If the model has a lot of nodes in the graph, or `n_evals` is too large, I can't finish the run within a day. I intend to split this job over 2 days: 50% on the first day and 50% on the second day.

May I ask:

How can I append the tuning results from the first day and the second day?

Or how can I stop tuning at any time and continue at another time (without losing the results obtained so far)?

Thanks!

Lars Kotthoff
BinhNN
  • I don't think this is possible, especially not if you wrap it with {drake} (which is btw superseded since January). `n_evals = 1000` is usually way too high but I am not sure if this is just an arbitrary number chosen for this example. Also your question is not really related to {drake} and might make it more complex for people to answer (both {mlr3} and {drake} community people). Adjusting your tuning budget and narrowing down your search space might be a possible solution. – pat-s Jun 22 '21 at 09:19
  • Thanks @pat-s. I tagged `drake` because I think drake can save the workflow, so maybe it can also save tuning results. It would be great if people from the `drake` community could solve this problem. – BinhNN Jun 22 '21 at 09:35
  • FYI I think you meant [`drake-r-package`](https://stackoverflow.com/questions/tagged/drake-r-package) for your tag? – Eric Cousineau Jun 22 '21 at 12:11
  • While `drake` is great it does "nothing more" than serializing your output object to disk in this particular case. There is no intermediate state preservation of an ongoing R command. And Eric is right, the tag for `drake` should be changed. – pat-s Jun 22 '21 at 14:53
  • If some of those targets complete on day 1 and you interrupt the pipeline at the end of the day, then day 2 should pick up where day 1 left off as long as those previously completed targets remain up to date. But `drake` does not know how to divide computation into days, so I recommend keeping the pipeline running overnight. – landau Jun 22 '21 at 21:22
  • Thank you. It's just an example. If I have one big target, I don't want to divide it into a lot of small targets. How can I merge those small targets into one target? – BinhNN Jun 23 '21 at 02:50
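
The idea from the comments (completed `drake` targets are reused on the next `make()`, and small targets can be merged into one) can be sketched as follows. This is only a minimal sketch under assumptions, not a confirmed solution: `tune_chunk()` and the target names `chunk_day1`, `chunk_day2`, `archive` and `best` are hypothetical, it assumes a recent mlr3tuning that provides `ti()` and a paradox version that provides `ps()`, and pooling the archives is only reasonable here because independent random-search chunks are just more random draws from the same search space.

```r
library(mlr3verse)   # attaches mlr3, mlr3tuning, mlr3learners, paradox
library(data.table)
library(drake)

# Hypothetical helper: run one independent random-search chunk and
# return its archive (one row per evaluated configuration).
tune_chunk = function(seed, n_evals) {
  set.seed(seed)  # different seed per chunk, so the draws differ
  instance = ti(
    task = tsk("german_credit"),
    learner = lrn("classif.ranger", predict_type = "prob"),
    resampling = rsmp("cv", folds = 3),
    measures = msr("classif.auc"),
    search_space = ps(
      num.trees = p_int(lower = 300, upper = 500),
      sample.fraction = p_dbl(lower = 0.7, upper = 0.8)
    ),
    terminator = trm("evals", n_evals = n_evals)
  )
  tnr("random_search")$optimize(instance)
  as.data.table(instance$archive)
}

chunk_plan = drake_plan(
  # each chunk is its own target, so a finished chunk is cached by drake
  chunk_day1 = tune_chunk(seed = 1, n_evals = 500),
  chunk_day2 = tune_chunk(seed = 2, n_evals = 500),
  # merge the small targets into one target
  archive = rbindlist(list(chunk_day1, chunk_day2), fill = TRUE),
  # configuration with the highest cross-validated AUC over both chunks
  best = archive[which.max(classif.auc)]
)
```

With this plan, `make(chunk_plan, targets = "chunk_day1")` could be run on the first day and `make(chunk_plan)` on the second; `drake` should then skip `chunk_day1` as long as it is still up to date. An interrupted chunk is not resumed mid-run, so smaller chunks lose less work.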

0 Answers