I am trying to understand the AUC filter in mlr3filters. I have a classification task (task) and use the following code:
filter = flt("auc")
filter$calculate(task)
result <- as.data.table(filter)
From the documentation in mlr3measures::auc(), I understand that I need a vector of probabilities and a vector with (binary) factor values as well as the "true" class. In my task, I have the binary class (as "target") and many features which are numeric, but not between 0 and 1, so I cannot interpret them as a probability.
How is the AUC calculated then? Or is there an additional assumption? My problem is that I cannot find this in filter$help().
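To make the question concrete, here is a minimal sketch of my current guess. This is an assumption on my part, not something I found in filter$help(): the AUC depends only on how the scores rank the two classes, so a raw numeric feature could stand in for the probability vector. The helper rank_auc(), the toy data, and the name filter_score_guess are made up for illustration and are not part of mlr3filters or mlr3measures.

```r
# Rank-based (Mann-Whitney) AUC: only the ordering of `score` matters,
# so the values do not need to lie in [0, 1].
rank_auc <- function(truth, score, positive) {
  is_pos <- truth == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score)  # ties receive average ranks
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up binary target and a numeric feature that is clearly not a probability
truth <- factor(c("a", "a", "b", "b", "a", "b"))
feature <- c(12.5, 30.1, 3.2, 7.7, 9.0, 15.4)

# AUC with the raw feature used as the score
auc_raw <- rank_auc(truth, feature, positive = "a")

# Rescaling the feature to [0, 1] keeps the ranks, so the AUC is unchanged
feature_scaled <- (feature - min(feature)) / diff(range(feature))
auc_scaled <- rank_auc(truth, feature_scaled, positive = "a")

# Guess: a filter could score a feature by its distance from a random ranking (0.5)
filter_score_guess <- abs(auc_raw - 0.5)
```

If this guess is right, each feature would be treated as if it were the prediction of a one-variable classifier, and only its ordering would matter.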
As a general question: Is there an additional "explanation layer" between the function references in https://mlr3filters.mlr-org.com/reference/index.html and the underlying R functions? For example, I understand that FilterVariance$new() generates a filter object that calculates the variance of each individual feature, using only that feature and applying stats::var(). But from the book, I also see that I can specify cutoff values:
po("filter", mlr3filters::FilterVariance$new(), filter.frac = 0.5)
Where do I find details about this filter.frac value? I cannot find it in filter$help(), nor in stats::var().
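For illustration, this is what I assume the combination of a variance filter and a fraction of 0.5 does: score every feature with stats::var(), sort the scores in decreasing order, and keep the top half. The toy data and the rounding rule are my own assumptions, and the sketch uses only base R rather than the actual mlr3 implementation.

```r
# Made-up numeric features with clearly different spread
features <- data.frame(
  x1 = c(1, 2, 3, 4, 5),
  x2 = c(10, 10, 10, 10, 11),
  x3 = c(0, 50, 100, 150, 200),
  x4 = c(2, 2, 2, 2, 2)
)

# One score per feature, computed from that feature alone
scores <- sort(vapply(features, stats::var, numeric(1)), decreasing = TRUE)

# Keep the top fraction of features (the rounding rule is an assumption)
frac <- 0.5
n_keep <- round(frac * length(scores))
kept <- names(scores)[seq_len(n_keep)]
```

What I cannot find is where the real behaviour is documented, for example how ties or non-integer feature counts are handled.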
Similarly, I understand that FilterCorrelation$new() generates a filter object that uses each individual feature together with the target to calculate the feature ranks. This may be self-explanatory, but I wonder where I could find more details about such things.
I tried the answer that I found here (Filtering in mlr3filters - where can I find details about the methods?), but I could not find the details in filter$help().
Thanks in advance for beginners$help()