"generateFilterValuesData" is a function in the "mlr" package that computes feature-importance values using various filter methods for feature selection. I would like to apply the different methods iteratively to the same data, in order to compare the features that each method selects. In principle, this means calling the same function on the same task while varying only the "method" argument (18 possible values in the list further down). Here is a reproducible example (I did not use set.seed() because the exact numerical values are irrelevant):
feature_A <- rnorm(200, 5, 2)
feature_B <- rnorm(200, 3, 2)
feature_C <- rnorm(200, 3.7, 1.3)
feature_D <- runif(200)
feature_E <- rpois(200, 1.6)
feature_F <- rpois(200, 7.3)
feature_G <- rlogis(200)
feature_H <- rexp(200, 2)
feature_I <- rexp(200, 3)
test_activ <- as.factor(rbinom(200, 1, 0.5))
df <- data.frame(feature_A, feature_B, feature_C, feature_D,
                 feature_E, feature_F, feature_G, feature_H,
                 feature_I, test_activ)
library(mlr)
taskg <- makeClassifTask(data = df, target = "test_activ")
# Single-method call for reference; note that the task object is named taskg
fv <- generateFilterValuesData(taskg, method = "anova.test")
methods <- c("anova.test", "auc", "cforest.importance", "chi.squared",
             "gain.ratio", "information.gain", "kruskal.test", "oneR",
             "permutation.importance", "randomForest.importance",
             "randomForestSRC.rfsrc", "randomForestSRC.var.select",
             "ranger.impurity", "ranger.permutation", "relief",
             "symmetrical.uncertainty", "univariate.model.score", "variance")
I would like to apply "generateFilterValuesData" to the taskg task once per method (i.e. iterate over the 18 methods). I tried lapply as follows:
lapply (methods, generateFilterValuesData, taskg),
but I get the following error: "Error in lapply(methods, generateFilterValuesData, taskg) : Assertion failed. One of the following must apply: * checkClass(task): Must have class 'ClassifTask', but has class * 'character' * checkClass(task): Must have class 'RegrTask', but has class 'character' * checkClass(task): Must have class 'SurvTask', but has class 'character'"
I realize I am doing something wrong, but I cannot work out how to iterate over the "methods" vector, since this vector holds the values for one argument of the function rather than the data the function is applied to.
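
One plausible reading of the error: lapply passes each element of "methods" as the first positional argument of the function, so the method string ends up in "task" and taskg ends up in "method". The sketch below reproduces that mechanism with a hypothetical stand-in, filter_stub, which only mimics the argument order (task first, method second) and is not the mlr function. The names filter_stub, task_stub and methods_stub are made up for illustration.

```r
# Hypothetical stand-in with the argument order assumed for the real call
# (task first, method second). It is NOT generateFilterValuesData.
filter_stub <- function(task, method) {
  stopifnot(is.list(task), is.character(method))
  paste(task$id, method, sep = ":")
}

task_stub <- list(id = "taskg")               # placeholder for the real task
methods_stub <- c("anova.test", "variance")   # shortened method list

# Positional: each method string is matched to `task`, so the check fails,
# which mirrors the "has class 'character'" assertion in the question.
positional <- tryCatch(
  lapply(methods_stub, filter_stub, task_stub),
  error = function(e) "error"
)

# Named: `task` is fixed by name, so each element of X fills `method`.
by_name <- lapply(methods_stub, filter_stub, task = task_stub)

# Same result, more explicit: wrap the call in an anonymous function.
by_lambda <- lapply(methods_stub,
                    function(m) filter_stub(task = task_stub, method = m))
names(by_lambda) <- methods_stub              # label results by method

# Assumed equivalents with mlr (not run here):
#   lapply(methods, generateFilterValuesData, task = taskg)
#   lapply(methods, function(m) generateFilterValuesData(taskg, method = m))
```

If the assumption about the argument order holds, either commented form should give one result per method. Some of the 18 methods may need additional packages installed, so wrapping each call in tryCatch could keep one failing method from stopping the whole loop.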