I want to run the Boruta algorithm using the variable importance from a random forest fitted with the ranger function. However, when I run the code below, I get this error:

Error in getImp(cbind(x[, decReg != "Rejected"], xSha), y, ...) : could not find function "getImp"

If I run the code without the getImp argument, it runs fine, but then Boruta falls back to the default for getImp, which is not what I want. How can I correctly pass the importance from my custom ranger model to the Boruta function? By the way, ChatGPT couldn't fix it ;-)
Documentation from R help:
getImp
the function used to obtain attribute importance. The default is getImpRfZ, which runs random forest from the ranger package and gathers Z-scores of mean decrease accuracy measure. It should return a numeric vector of a size identical to the number of columns of its first argument, containing an importance measure of respective attributes. Any order-preserving transformation of this measure will yield the same result. It is assumed that more important attributes get higher importance. +-Inf are accepted, NaNs and NAs are treated as 0s, with a warning.
library(Boruta)

# Random forest (extra trees) with impurity-based importance
rf_ranger <- ranger::ranger(group ~ ., data = dat,
                            num.trees = 10000,
                            splitrule = "extratrees",
                            min.node.size = 1,
                            importance = "impurity",
                            mtry = 2)

# Importance values stored as a one-column matrix
ranger_imp <- rf_ranger$variable.importance
matrix_ranger_importance <- as.matrix(ranger_imp)
colnames(matrix_ranger_importance) <- "MeanDecreaseGini"

boruta.model <- Boruta(group ~ ., # outcome & predictors
                       data = dat,
                       pValue = 0.01,
                       doTrace = 2, # verbosity level
                       maxRuns = 100,
                       getImp = matrix_ranger_importance) # this call raises the error
Sample data:
dat <- data.frame(group = sample(factor(c("active", "control")), 10, replace = TRUE),
                  v1 = sample(c(0, 1), 10, replace = TRUE),
                  v2 = sample(c(0, 1), 10, replace = TRUE),
                  v3 = sample(c(0, 1), 10, replace = TRUE),
                  v4 = sample(c(0, 1), 10, replace = TRUE),
                  v5 = sample(c(0, 1), 10, replace = TRUE))
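
One possible direction, going by the documentation quoted above: getImp is described as a function, whereas my code passes a precomputed matrix. Below is a minimal sketch of what such a function might look like, assuming Boruta calls it as getImp(x, y, ...) with the predictors (plus shadow copies) in x and the outcome in y. The helper name get_imp_ranger_gini, the temporary column name outcome.tmp, and the toy data are all hypothetical, and num.trees is reduced only to keep the run short.

```r
library(Boruta)  # the ranger package must also be installed

# Hypothetical adapter: refits ranger on whatever x Boruta supplies
# and returns one importance value per column of x.
get_imp_ranger_gini <- function(x, y, ...) {
  x$outcome.tmp <- y  # hypothetical column name, chosen to avoid clashes
  fit <- ranger::ranger(
    dependent.variable.name = "outcome.tmp",
    data = x,
    num.trees = 500,           # assumption: smaller than 10000 for speed
    splitrule = "extratrees",
    min.node.size = 1,
    importance = "impurity",
    mtry = 2,
    write.forest = FALSE,      # only the importance values are needed
    ...
  )
  fit$variable.importance      # named numeric vector, one entry per predictor
}

# Toy data with the same layout as the sample data (more rows for stability)
set.seed(1)
n <- 60
toy <- data.frame(group = factor(sample(c("active", "control"), n, replace = TRUE)),
                  v1 = sample(c(0, 1), n, replace = TRUE),
                  v2 = sample(c(0, 1), n, replace = TRUE),
                  v3 = sample(c(0, 1), n, replace = TRUE),
                  v4 = sample(c(0, 1), n, replace = TRUE),
                  v5 = sample(c(0, 1), n, replace = TRUE))

# Pass the function itself (no parentheses), not its result
boruta_fit <- Boruta(group ~ ., data = toy,
                     pValue = 0.01, maxRuns = 30, doTrace = 0,
                     getImp = get_imp_ranger_gini)
```

The point of the sketch is that the importance has to be recomputed on every Boruta iteration, because x changes each time as shadow attributes are added and rejected ones are dropped; a matrix computed once on the original data cannot cover those columns.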