I want to prepare a data set for use in a Task of the mlr package. Should binary independent variables be of class factor, logical, character, or integer? Is it OK to keep categorical variables with more than two levels as factor or character, or are there models integrated in mlr that require, e.g., a model matrix, where mlr doesn't do the conversion automatically? Which classes does mlr expect in those cases?
For example:
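library(mlr)
# binary predictor, stored as a factor
x1 <- factor(sample(0:1, size = 10, replace = TRUE))
# categorical predictor with up to five levels, stored as a factor
x2 <- factor(sample(letters[1:5], size = 10, replace = TRUE))
# binary target, stored as character
y <- sample(c("yes", "no"), size = 10, replace = TRUE)
makeClassifTask(data = data.frame(y, x1, x2), target = "y", positive = "yes")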