mlr3 Tune on Multiple Measures & Distrcompositor

Question

I get an error when I try to auto-tune on both the C-index and the integrated Brier score (IBS): AutoTuner accepts only one performance measure. The same is true of AutoFSelector.

I also have a problem using distrcompositor to compute the IBS. It works when I tune the model, but not when I train the learner with its default configuration. The same happens with surv.gbm and surv.xgboost.

options(warn=-1)

library(tidyverse)
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
library(PKPDmisc)
library(mlr3verse)
#> Loading required package: mlr3
#> 
#> Attaching package: 'mlr3verse'
#> The following object is masked from 'package:tune':
#> 
#>     tune
library(survival)
library(mlr3proba)

# Data and Data Splitting  
data = as_tibble(lung) %>% 
       mutate(status = if_else(status==1, 0, 1),
              sex = factor(sex, levels = c(1:2), labels = c("male", "female")),
              ph.ecog = factor(ph.ecog))

na              <- sample(1:228, 228*0.1)
data$sex[na]    <- NA
data$ph.ecog[na]<- NA

set.seed(123)
split <- data  %>% initial_split(prop = 0.8, strata = status)    
train <- split %>% training()
test  <- split %>% testing()

# Task 
Task = TaskSurv$new(id = "Lung", backend = train,  time = "time", event = "status") 
Task$add_strata("status")

# Resample
set.seed(123)
inner_cv  = rsmp("holdout", ratio = 0.9)

# performance measure 
measures = msrs(c("surv.cindex", "surv.graf")) # 2nd distrcompositor needed 

# Learner 
sim_impute  = po("imputemedian", affect_columns = selector_type("numeric")) %>>%
              po("imputemode",   affect_columns = selector_type("factor"))  %>>% 
              po("scale") %>>%
              po("encode", method = "one-hot")

cox_lasso = ppl("distrcompositor",
                learner   = sim_impute %>>%
                  lrn("surv.glmnet"),
                estimator = "kaplan", form = "ph")

cox_lasso2 = ppl("distrcompositor",
                learner   = sim_impute %>>%
                            lrn("surv.glmnet", 
                                lambda = to_tune(p_dbl(lower = -2.5, upper = -1.25,
                                                       trafo = function(x) 10^x))), 
                estimator = "kaplan", form = "ph")


at_coxlasso2 = AutoTuner$new(learner    = cox_lasso2,
                             resampling = inner_cv,
                             measure    = msr("surv.cindex"), # measures
                             # Error in UseMethod("as_measure") : 
                             #   no applicable method for 'as_measure' applied to an object of class "list"
                             terminator = trm("evals", n_evals = 20),
                             tuner      = tnr("grid_search", resolution = 10))

# Train & Predict 
lgr::get_logger("mlr3")$set_threshold("warn")

cox_lasso$train(Task)
#> $compose_distr.output
#> NULL
cox_lasso$predict_newdata(test)$score(measures)
#> Error in eval(expr, envir, enclos): attempt to apply non-function

at_coxlasso2$train(Task)
#> INFO  [10:21:07.481] [bbotk] Starting to optimize 1 parameter(s) with '<OptimizerGridSearch>' and '<TerminatorEvals> [n_evals=20, k=0]' 
#> INFO  [10:21:07.520] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:09.051] [bbotk] Result of batch 1: 
#> INFO  [10:21:09.054] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:09.054] [bbotk]           -1.666667      0.7054263             1.22 
#> INFO  [10:21:09.054] [bbotk]                                 uhash 
#> INFO  [10:21:09.054] [bbotk]  84a8b0e1-8a65-4c53-aae7-a885dc07d14b 
#> INFO  [10:21:09.056] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:10.430] [bbotk] Result of batch 2: 
#> INFO  [10:21:10.432] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:10.432] [bbotk]           -1.944444      0.7286822             1.15 
#> INFO  [10:21:10.432] [bbotk]                                 uhash 
#> INFO  [10:21:10.432] [bbotk]  7da28c83-0388-4571-8bc6-da0cacaee8f8 
#> INFO  [10:21:10.434] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:11.569] [bbotk] Result of batch 3: 
#> INFO  [10:21:11.572] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:11.572] [bbotk]               -1.25      0.6899225             0.95 
#> INFO  [10:21:11.572] [bbotk]                                 uhash 
#> INFO  [10:21:11.572] [bbotk]  e0c8ba11-eafa-42ce-803a-b070f813c640 
#> INFO  [10:21:11.573] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:12.687] [bbotk] Result of batch 4: 
#> INFO  [10:21:12.689] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:12.689] [bbotk]           -2.083333      0.7209302             0.92 
#> INFO  [10:21:12.689] [bbotk]                                 uhash 
#> INFO  [10:21:12.689] [bbotk]  77398f50-3753-4729-ab7d-2c1bb9f6d4d1 
#> INFO  [10:21:12.691] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:13.814] [bbotk] Result of batch 5: 
#> INFO  [10:21:13.816] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:13.816] [bbotk]           -1.527778      0.7054263             0.92 
#> INFO  [10:21:13.816] [bbotk]                                 uhash 
#> INFO  [10:21:13.816] [bbotk]  b29848f5-cbff-4748-b19d-81ff1a8eccd3 
#> INFO  [10:21:13.818] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:14.915] [bbotk] Result of batch 6: 
#> INFO  [10:21:14.917] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:14.917] [bbotk]                -2.5      0.7054263             0.89 
#> INFO  [10:21:14.917] [bbotk]                                 uhash 
#> INFO  [10:21:14.917] [bbotk]  88966b60-3357-4282-b7e5-8d499d7d6fa8 
#> INFO  [10:21:14.918] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:15.977] [bbotk] Result of batch 7: 
#> INFO  [10:21:15.979] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:15.979] [bbotk]           -1.805556      0.7209302             0.88 
#> INFO  [10:21:15.979] [bbotk]                                 uhash 
#> INFO  [10:21:15.979] [bbotk]  c0e8ad78-4312-4780-9a16-e07649b60d8d 
#> INFO  [10:21:15.981] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:17.050] [bbotk] Result of batch 8: 
#> INFO  [10:21:17.052] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:17.052] [bbotk]           -2.222222      0.7131783             0.85 
#> INFO  [10:21:17.052] [bbotk]                                 uhash 
#> INFO  [10:21:17.052] [bbotk]  9d4973e8-bff8-47ef-b85c-a24eba35fef0 
#> INFO  [10:21:17.054] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:18.499] [bbotk] Result of batch 9: 
#> INFO  [10:21:18.501] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:18.501] [bbotk]           -2.361111      0.7054263             1.24 
#> INFO  [10:21:18.501] [bbotk]                                 uhash 
#> INFO  [10:21:18.501] [bbotk]  f35353c7-2f89-4507-97c7-949d59874ef6 
#> INFO  [10:21:18.503] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [10:21:19.640] [bbotk] Result of batch 10: 
#> INFO  [10:21:19.642] [bbotk]  surv.glmnet.lambda surv.harrell_c runtime_learners 
#> INFO  [10:21:19.642] [bbotk]           -1.388889      0.6899225             0.95 
#> INFO  [10:21:19.642] [bbotk]                                 uhash 
#> INFO  [10:21:19.642] [bbotk]  4f350f84-a1dd-4b2b-8c4c-d374905add1c 
#> INFO  [10:21:19.652] [bbotk] Finished optimizing after 10 evaluation(s) 
#> INFO  [10:21:19.653] [bbotk] Result: 
#> INFO  [10:21:19.654] [bbotk]  surv.glmnet.lambda learner_param_vals  x_domain surv.harrell_c 
#> INFO  [10:21:19.654] [bbotk]           -1.944444          <list[7]> <list[1]>      0.7286822
at_coxlasso2$predict_newdata(test)$score(measures)
#> surv.harrell_c      surv.graf 
#>      0.6315120      0.1331544

Created on 2022-01-14 by the reprex package (v2.0.1)

score 1 · Answer 1 · answered Jan 14 '22 at 22:18

Heya, thanks for the question! Basically you just needed to wrap the graph in as_learner or GraphLearner$new. Below is a working reprex. I've simplified the code slightly to minimise dependencies, and also to show you the mlr3 functions you can use instead of tidymodels.

library(survival)
library(mlr3proba)
#> Loading required package: mlr3
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(mlr3pipelines)
library(mlr3learners)
#> Warning: package 'mlr3learners' was built under R version 4.1.1
library(mlr3tuning)
#> Loading required package: paradox

data = lung %>% 
       mutate(status = status - 1,
              sex = factor(sex, levels = c(1:2), labels = c("male", "female")),
              ph.ecog = factor(ph.ecog))

na              <- sample(1:228, 228 * 0.1)
data$sex[na]    <- NA
data$ph.ecog[na]<- NA

set.seed(123)

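# Task
# Note: the task is built from the raw lung data, so the recoded `data` above is not used here;
# `data` is then reassigned to hold the train/test row ids returned by partition()
Task = as_task_surv(lung, event = "status")
data = partition(Task)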
Task$add_strata("status")

# Resample
set.seed(123)
inner_cv  = rsmp("holdout", ratio = 0.9)

# performance measure 
measures = msrs(c("surv.cindex", "surv.graf")) # 2nd distrcompositor needed 

# Learner 
sim_impute  = po("imputemedian", affect_columns = selector_type("numeric")) %>>%
              po("imputemode",   affect_columns = selector_type("factor"))  %>>% 
              po("scale") %>>%
              po("encode", method = "one-hot")

cox_lasso = as_learner(ppl("distrcompositor",
  learner   = sim_impute %>>% lrn("surv.glmnet"),
  estimator = "kaplan", form = "ph"
))

cox_lasso2 = as_learner(ppl(
  "distrcompositor",
  learner = sim_impute %>>%
    lrn("surv.glmnet",
      lambda = to_tune(p_dbl(
        lower = -2.5, upper = -1.25,
        trafo = function(x) 10^x
      ))
    ),
  estimator = "kaplan", form = "ph"
))

at_coxlasso2 = AutoTuner$new(learner    = cox_lasso2,
                             resampling = inner_cv,
                             measure    = msr("surv.cindex"), # measures
                             terminator = trm("evals", n_evals = 20),
                             tuner      = tnr("grid_search", resolution = 10))

# Train & Predict 
lgr::get_logger("mlr3")$set_threshold("warn")

cox_lasso$train(Task, row_ids = data$train)
cox_lasso$predict(Task, row_ids = data$test)$score(measures)
#> Warning: Multiple lambdas have been fit. Lambda will be set to 0.01 (see
#> parameter 's').
#> surv.cindex   surv.graf 
#>   0.5893720   0.1351024

at_coxlasso2$train(Task)
#> INFO  [22:17:22.894] [bbotk] Starting to optimize 1 parameter(s) with '<OptimizerGridSearch>' and '<TerminatorEvals> [n_evals=20]' 
#> INFO  [22:17:22.906] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:23.130] [bbotk] Result of batch 1: 
#> INFO  [22:17:23.132] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:23.132] [bbotk]           -1.666667   0.5615764 fa0e1e34-778e-484e-a1b1-ef1b5006f46e 
#> INFO  [22:17:23.132] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:23.417] [bbotk] Result of batch 2: 
#> INFO  [22:17:23.417] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:23.417] [bbotk]           -1.805556   0.5566502 7c180a50-5331-4cd3-a0fb-df411775a02a 
#> INFO  [22:17:23.418] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:23.570] [bbotk] Result of batch 3: 
#> INFO  [22:17:23.571] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:23.571] [bbotk]           -1.388889   0.5812808 21414438-63fe-4df6-8f2b-e21010096097 
#> INFO  [22:17:23.571] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:23.737] [bbotk] Result of batch 4: 
#> INFO  [22:17:23.738] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:23.738] [bbotk]           -2.083333   0.5763547 6e06b2cf-0157-4109-b573-2f31545e6d73 
#> INFO  [22:17:23.739] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:23.891] [bbotk] Result of batch 5: 
#> INFO  [22:17:23.892] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:23.892] [bbotk]           -2.361111   0.5714286 248de8a9-d857-4eb0-9550-c01439326885 
#> INFO  [22:17:23.892] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:24.047] [bbotk] Result of batch 6: 
#> INFO  [22:17:24.048] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:24.048] [bbotk]           -1.527778   0.5615764 5c1d2c8a-a8c0-4998-bcc5-1f0589bd2226 
#> INFO  [22:17:24.049] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:24.199] [bbotk] Result of batch 7: 
#> INFO  [22:17:24.200] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:24.200] [bbotk]           -2.222222   0.5763547 63bd2a9d-c886-497a-b905-84445b3445c4 
#> INFO  [22:17:24.200] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:24.359] [bbotk] Result of batch 8: 
#> INFO  [22:17:24.360] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:24.360] [bbotk]               -1.25   0.6108375 c1a4ea01-aaf0-45ee-b7f2-60010feb6db3 
#> INFO  [22:17:24.360] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:24.525] [bbotk] Result of batch 9: 
#> INFO  [22:17:24.526] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:24.526] [bbotk]           -1.944444   0.5665025 b29d5430-3f38-4b90-b906-9736499a1c88 
#> INFO  [22:17:24.527] [bbotk] Evaluating 1 configuration(s) 
#> INFO  [22:17:24.688] [bbotk] Result of batch 10: 
#> INFO  [22:17:24.689] [bbotk]  surv.glmnet.lambda surv.cindex                                uhash 
#> INFO  [22:17:24.689] [bbotk]                -2.5   0.5714286 a9e18d67-6704-482d-a71d-5dffc03d1d7e 
#> INFO  [22:17:24.691] [bbotk] Finished optimizing after 10 evaluation(s) 
#> INFO  [22:17:24.692] [bbotk] Result: 
#> INFO  [22:17:24.692] [bbotk]  surv.glmnet.lambda learner_param_vals  x_domain surv.cindex 
#> INFO  [22:17:24.692] [bbotk]               -1.25          <list[7]> <list[1]>   0.6108375
at_coxlasso2$predict(Task, row_ids = data$test)$score(measures)
#> surv.cindex   surv.graf 
#>   0.6652174   0.1229131

Created on 2022-01-14 by the reprex package (v2.0.1)
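
A minimal sketch of why the wrapping matters. It uses a plain regression learner and the built-in mtcars task, both chosen only to keep dependencies small; neither is part of the question. A Graph is not a Learner and has no predict_newdata() method, which would explain the "attempt to apply non-function" error in the question, while as_learner() turns the graph into a GraphLearner with the usual Learner interface.

```r
library(mlr3)
library(mlr3pipelines)

# A pipeline built with %>>% is a Graph, not a Learner
graph = po("scale") %>>% lrn("regr.rpart")
inherits(graph, "Learner")          # FALSE
is.function(graph$predict_newdata)  # FALSE: calling it fails with "attempt to apply non-function"

# Wrapping the graph gives a GraphLearner, which behaves like any other Learner
glrn = as_learner(graph)
inherits(glrn, "Learner")           # TRUE

# train() and predict_newdata() now work as expected
glrn$train(tsk("mtcars"))
pred = glrn$predict_newdata(mtcars[1:3, ])
pred$score(msr("regr.mse"))
```

The same applies to the survival pipeline: once ppl("distrcompositor", ...) is wrapped, the composed distr prediction is available to measures such as surv.graf.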

@RaphaelS, thanks for pointing this out. How about tuning on cindex and IBS? Any idea why it does not work with auto-tuner and auto-fselector? — Ali Alhadab, Jan 15 '22 at 05:34
An auto-tuner assumes that tuning returns a single parameter configuration. Multi-objective tuning, however, returns a [Pareto front](https://en.wikipedia.org/wiki/Pareto_efficiency), i.e. several configurations, so there is no unique one to train the final model with — RaphaelS, Jan 18 '22 at 08:27
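
To illustrate the comment above, here is a hedged sketch of tuning on both measures without an AutoTuner. It assumes a recent mlr3tuning release in which ti() accepts several measures (older releases use TuningInstanceMultiCrit$new() instead). The rats task, the surv.rpart learner, the cp range, the seed and the rule for picking a configuration are illustrative choices, not taken from the question.

```r
library(mlr3)
library(mlr3proba)
library(mlr3pipelines)
library(mlr3tuning)

set.seed(123)
task = tsk("rats")

# distrcompositor adds a distr prediction, which surv.graf needs
learner = as_learner(ppl("distrcompositor",
  learner   = lrn("surv.rpart", cp = to_tune(1e-3, 0.1, logscale = TRUE)),
  estimator = "kaplan", form = "ph"
))

# Passing more than one measure makes this a multi-objective tuning instance
instance = ti(
  task       = task,
  learner    = learner,
  resampling = rsmp("holdout", ratio = 0.9),
  measures   = msrs(c("surv.cindex", "surv.graf")),
  terminator = trm("evals", n_evals = 10)
)
tnr("random_search")$optimize(instance)

# The result is the Pareto front: possibly several non-dominated configurations
instance$result

# Choosing one configuration is left to the user, e.g. the lowest surv.graf
best = which.min(instance$result[["surv.graf"]])
learner$param_set$values = instance$result$learner_param_vals[[best]]
learner$train(task)
```

Picking the row with the lowest surv.graf is only one possible rule. Because that choice has to be made by the user, it cannot be automated inside an AutoTuner.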
