
"generateFilterValuesData" is a function in the "mlr" package that scores features using various filter algorithms for feature selection. I would like to apply the different algorithms iteratively to the same data, to compare the features selected by each method. In principle this means calling the same function on the same data while varying only the "method" argument (18 candidate values, listed below). Here is a reproducible example (I did not use set.seed() because the numerical values themselves are irrelevant):

feature_A <- rnorm(200, 5, 2)
feature_B <- rnorm(200, 3, 2)
feature_C <- rnorm(200, 3.7, 1.3)
feature_D <- runif(200)
feature_E <- rpois(200, 1.6)
feature_F <- rpois(200, 7.3)
feature_G <- rlogis(200)
feature_H <- rexp(200, 2)
feature_I <- rexp(200, 3)
test_activ <- as.factor(rbinom(200, 1, 0.5))

df <- data.frame(feature_A, feature_B, feature_C, feature_D, 
             feature_E, feature_F, feature_G, feature_H, 
             feature_I, test_activ)
library(mlr)
taskg <- makeClassifTask(data = df, target = "test_activ")
fv <- generateFilterValuesData(taskg, method = "anova.test")

methods <- c("anova.test", "auc", "cforest.importance","chi.squared", "gain.ratio", "information.gain", 
         "kruskal.test", "oneR", "permutation.importance", 
         "randomForest.importance", "randomForestSRC.rfsrc",
         "randomForestSRC.var.select", "ranger.impurity", 
         "ranger.permutation", "relief", "symmetrical.uncertainty",
         "univariate.model.score", "variance")

I would like to apply "generateFilterValuesData" to the taskg task once per method (i.e. iterate over the 18 methods). I tried lapply as follows:

lapply (methods, generateFilterValuesData, taskg),

but I get the following error:

Error in lapply(methods, generateFilterValuesData, taskg) :
  Assertion failed. One of the following must apply:
 * checkClass(task): Must have class 'ClassifTask', but has class 'character'
 * checkClass(task): Must have class 'RegrTask', but has class 'character'
 * checkClass(task): Must have class 'SurvTask', but has class 'character'

I realize I am doing something wrong, but I cannot work out how to iterate over the "methods" vector (this vector holds the varying values of a single argument, not the data the function is applied to).

RAN

1 Answer

I think you may be looking for this (based off your fv assignment).

lapply(methods, function(m) generateFilterValuesData(taskg, method = m))

When you do

lapply (methods, generateFilterValuesData, taskg)

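the arguments are getting switched: lapply passes each element of methods as the first argument (task), and taskg then fills the next one (method). For instance, you get the same error if you do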

generateFilterValuesData(methods[1], taskg)
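
To make the argument matching concrete without needing mlr installed, here is a minimal base-R sketch. The function toy_filter and the objects toy_task and toy_methods are hypothetical stand-ins invented for illustration; toy_filter only mimics the (task, method) argument order used above and is not part of mlr. It also shows a second way to fix the call, by naming the fixed argument, which I would expect to carry over to generateFilterValuesData since its first two arguments are task and method.

```r
# Hypothetical stand-in that mimics the (task, method) argument order
# of generateFilterValuesData(); it is NOT an mlr function.
toy_filter <- function(task, method) {
  stopifnot(is.data.frame(task), is.character(method))
  paste(method, "on", ncol(task), "columns")
}

toy_task <- data.frame(x = 1:3, y = 4:6)   # stand-in for taskg
toy_methods <- c("variance", "auc")        # stand-in for methods

# lapply() hands each element to FUN as its first unmatched argument,
# so this calls toy_filter("variance", toy_task): the method name lands
# in `task` and the data lands in `method`, which raises an error.
bad <- tryCatch(
  lapply(toy_methods, toy_filter, toy_task),
  error = function(e) "error"
)

# Fix 1: wrap the call so each element goes to `method` explicitly.
res_wrapper <- lapply(toy_methods, function(m) toy_filter(toy_task, method = m))

# Fix 2: name the fixed argument; the element then fills `method`,
# the first argument that is still unmatched.
res_named <- lapply(toy_methods, toy_filter, task = toy_task)

# Naming the result makes it easy to look up one method later.
res <- setNames(res_wrapper, toy_methods)
```

With the real objects, the second form would read lapply(methods, generateFilterValuesData, task = taskg). Some filters may still fail for unrelated reasons (for example a missing backend package), so wrapping each call in tryCatch() as above can keep the loop going.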
mickey
    Sorry for accepting this answer so late; it worked like a charm. I did not have access to the computer for about a week at the time, and when I came back to it I had simply forgotten. Many thanks and my sincere apologies! – RAN Jun 15 '19 at 13:48