makeStackedLearner in mlr is only available for regression, classification and multi-label classification models. Is there any reason why it could not be applied to survival models, for example to simply average the predictions of several different survival models? I would like to be able to do this within a resampling loop.
I am curious why survival models are excluded and wonder what I am missing.
This is what I would like to be able to achieve, including parallelization if possible:
library(survival)
library(mlr)
library(parallelMap)
data(veteran)
set.seed(24601)
task_id = "MAS"
mas.task <- makeSurvTask(id = task_id, data = veteran, target = c("time", "status"))
mas.task <- createDummyFeatures(mas.task)
inner = makeResampleDesc("CV", iters=2, stratify=TRUE) # Tuning
outer = makeResampleDesc("CV", iters=2, stratify=TRUE) # Benchmarking
cox.lrn <- makeLearner(cl="surv.coxph", id = "coxph", predict.type="response")
glmboost.lrn <- makeLearner(cl="surv.glmboost", id = "glmBoost", predict.type="response", use.formula=TRUE, center=TRUE)
rfsrc.lrn <- makeLearner(cl="surv.randomForestSRC", id = "rfsrc", predict.type="response")
stacked.lrn <- makeStackedLearner(base.learners = list(cox.lrn, glmboost.lrn, rfsrc.lrn),
                                  predict.type = "response",
                                  method = "average")
parallelStart(mode="multicore", cpus=12, level="mlr.resample", show.info = TRUE, logging=TRUE)
learners = list(stacked.lrn)
bmr = benchmark(learners=learners, tasks=mas.task, resamplings=outer, measures=list(cindex), show.info = TRUE)
parallelStop()
This gives the error:
Error in checkPredictLearnerOutput(.learner, .model, p) :
predictLearner for stack has returned a class factor instead of a numeric!
This happens because the survival task is being treated as a classification task, due to the following code in makeStackedLearner:
td = getTaskDesc(task)
type = ifelse(td$type == "regr", "regr",
              ifelse(length(td$class.levels) == 2L, "classif", "multiclassif"))
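To illustrate the fall-through, here is a minimal base R sketch that applies the same nested ifelse to mocked task descriptions. The function infer_type and the three lists are hypothetical stand-ins, not part of mlr, and the sketch assumes that a survival task description has type "surv" and no class.levels field:

```r
# Hypothetical wrapper around the expression quoted above (not an mlr function).
infer_type <- function(td) {
  ifelse(td$type == "regr", "regr",
         ifelse(length(td$class.levels) == 2L, "classif", "multiclassif"))
}

# Mocked task descriptions; only the two fields read by the expression are set.
# Assumption: a survival task has type "surv" and no class levels.
regr_td   <- list(type = "regr",    class.levels = NULL)
binary_td <- list(type = "classif", class.levels = c("a", "b"))
surv_td   <- list(type = "surv",    class.levels = NULL)

infer_type(regr_td)    # "regr"
infer_type(binary_td)  # "classif"
infer_type(surv_td)    # "multiclassif": anything that is not "regr" lands here
```

Under these assumptions, any task type other than "regr" ends up on a classification branch, which would be consistent with the "class factor instead of a numeric" error shown above.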