
I am trying to fit Cox PH and parametric models and simultaneously perform feature selection and hyperparameter tuning. In the code below I can use either `auto_fselector` or `auto_tuner` inside `resample()`, but not both. How do I combine them? Do I need three levels of nested resampling (inner for feature selection, middle for tuning, and outer for performance evaluation)? In mlr this was easily done by using a feature selection wrapper and then a tuning wrapper, but I am not sure how it is best done in mlr3.

I also want to get the selected features at the end. It seems `learner$selected_features()` does not work for survival models.

library(mlr3proba)    # survival tasks and learners
library(mlr3tuning)
library(mlr3fselect)

task       = tsk("rats")
learner    = lrn("surv.coxph")

outer_cv   = rsmp("cv", folds = 10)$instantiate(task)
inner_cv   = rsmp("cv", folds = 10)$instantiate(task)

Feat_select = auto_fselector(method       = "random_search",
                             learner      = learner,
                             resampling   = inner_cv,
                             measure      = msr("x"),  # placeholder measure
                             term_evals   = 200)

model_tune = auto_tuner(method       = "irace",
                        learner      = learner,
                        resampling   = inner_cv,
                        measure      = msr("x"),       # placeholder measure
                        search_space = ps())

model_res  = resample(task, model_tune, outer_cv, store_models = TRUE)


task       = tsk("rats")

learner2   = as_learner(po("encode") %>>% lrn("surv.cv_glmnet"))
learner2$selected_features()
Error: attempt to apply non-function

learner3 = mlr3extralearners::lrn("surv.rfsrc")
learner3$selected_features()
Error: attempt to apply non-function
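A sketch of a possible workaround, assuming the wrapped learner itself implements `selected_features()` (the `GraphLearner` does not forward the method, hence the error above; the pipeop id `surv.cv_glmnet` and the access path via `$graph_model` are assumptions, and the pipeline must be trained first):

```r
library(mlr3proba)       # tsk("rats"), survival learners
library(mlr3pipelines)   # po(), %>>%, as_learner()

task     = tsk("rats")
learner2 = as_learner(po("encode") %>>% lrn("surv.cv_glmnet"))
learner2$train(task)

# selected_features() lives on the wrapped learner, not on the GraphLearner,
# so drill into the trained pipeline; whether surv.cv_glmnet implements
# selected_features() is an assumption here
learner2$graph_model$pipeops$surv.cv_glmnet$learner_model$selected_features()
```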
Ali Alhadab
  • maybe use an `auto_tuner` in `auto_fselector`... `auto_fselector(learner = model_tune, ...)`. I am not completely sure it works, as I haven't used mlr3 wrapper feature selection, but it seems to me that it should. In this case you would need inner resampling for model tuning, middle resampling for feature selection and outer resampling for model evaluation. If my suggestion does not work please post a comment and I will try to figure out the correct approach. – missuse Dec 07 '21 at 17:01

1 Answer


You can nest an `AutoTuner` inside an `AutoFSelector` in mlr3:

library(mlr3tuning)
library(mlr3fselect)

task = tsk("pima")

at = auto_tuner(
    method = "random_search",
    learner = lrn("classif.rpart", cp = to_tune(0.01, 0.1)),
    resampling = rsmp("cv", folds = 3),
    measure = msr("classif.ce"),
    term_evals = 5
)

afs = auto_fselector(
    method = "random_search",
    learner = at,
    resampling = rsmp("cv", folds = 3),
    measure = msr("classif.ce"),
    term_evals = 5
)

rr = resample(task, afs, resampling = rsmp("cv", folds = 3), store_models = TRUE)

extract_inner_fselect_results(rr)
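To also recover the selected features (the second part of the question), a sketch building on the result above; the `features` column of the inner results and `$fselect_result` on a trained `AutoFSelector` are the assumed access paths:

```r
# per outer fold: the best feature subset found by the middle (fselect) loop;
# requires store_models = TRUE in the resample() call above
inner = extract_inner_fselect_results(rr)
inner$features

# final model for deployment: train the AutoFSelector on the full task;
# the chosen subset is then stored in $fselect_result
afs$train(task)
afs$fselect_result$features
```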
be-marc
  • I was thinking about this and I am glad to see other people think the same way. I think the order matters and I should be doing feature selection first, then tuning. This order would allow me to tune the number of parameters in feature selection if I want. Do you agree? – Ali Alhadab Dec 08 '21 at 04:48
  • @Ali Alhadab be-marc is one of the developers of mlr3. In addition, the correct order is the one posted in the answer: for each specific feature subset, tune the hyperparameters of the learner in the inner resampling loop, and after tuning them, evaluate them in the middle resampling loop to select the optimal feature subset. The outer loop is used for unbiased evaluation of the whole thing. – missuse Dec 08 '21 at 06:49
  • Exactly, this is what I was thinking: hyperparameter tuning is performed for each specific feature subset. In mlr, the feature selection wrapper is always used before the tuning wrapper, and I thought it would be the same order in mlr3. Here are additional questions for @be-marc: 1. How do we add additional arguments like max_feature in auto_fselector to be tuned in the param_set? – Ali Alhadab Dec 09 '21 at 02:27
  • 2. FSelectorSequential has either an SFS or an SBS strategy. How do I perform SFS followed by SBS? Should I use nested auto_fselectors? Do I nest the auto_fselector with SBS inside the auto_fselector with SFS? I assume this is the correct order given your order for feature selection and tuning. Are there arguments to specify the levels for inclusion and exclusion of features as in mlr (alpha and beta)? Can these additional arguments be tuned? 3. Are the SFFS and SFBS strategies available in mlr3 as in mlr? – Ali Alhadab Dec 09 '21 at 02:27
  • So you want to additionally tune the parameter of the feature selection method? – be-marc Dec 11 '21 at 08:22
  • 2) We do not support SFS followed by SBS but feel free to extend the `FSelectorSequential` class in `mlr3fselect`. Nesting does not work. – be-marc Dec 11 '21 at 08:25
  • 3) The SFFS and SFBS strategies are not available – be-marc Dec 11 '21 at 08:27
  • Your approach seems overly complex. Can you reference a paper in which tuning with irace, SFS followed by SBS and tuning of the feature selection method itself is successfully applied? – be-marc Dec 11 '21 at 08:37
  • Unfortunately, I don't have a paper, but for example if I am using an AFT model, I need to do FSelectorSequential and tune the dist. Any recommendation on how best to do that? When I did the order you suggested, it seems the tuned model (at) was run first and then the autoselector model (afs). I don't think this is the right order; we need to select features and then tune parameters for the selected feature subset. In mlr, makeFilterWrapper is used and then makeTuneWrapper, which unfortunately can't be combined with makeFeatSelWrapper. – Ali Alhadab Dec 19 '21 at 16:32
  • Also, how can I ensure one inner resampling is used for both feature selection and tuning, which I think is the case in mlr, because it is a lot faster than mlr3? – Ali Alhadab Dec 19 '21 at 16:39