I am using the randomForest package to classify a binary outcome variable, following the standard process. I first coerced all variables to numeric and then used na.roughfix to handle missing values:
data <- read.csv("data.csv")
data <- lapply(data, as.numeric)
data <- na.roughfix(data)
Then I run the model:
model <- randomForest(as.factor(outcome) ~ V1 + V2...+ VN,
data=data,
importance=TRUE,
ntree=500)
and I get the following error:
Error in na.fail.default(list(as.factor(outcome) = c(2L, 2L, 1L, : missing values in object
The na.roughfix imputation should have taken care of this, right? I have gotten it to work before, and other posts here suggest that it should work. Any suggestions?
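One thing that may be worth checking is what class the object has after the lapply() step. lapply() returns a plain list rather than a data frame, and my understanding (an assumption, not verified against the package source here) is that na.roughfix only imputes data frames and numeric matrices, so a list could pass through with its NAs intact. The sketch below uses a small made-up data frame in place of data.csv; the names df, as_list and fixed are hypothetical.

```r
library(randomForest)  # provides na.roughfix()

# Hypothetical toy data standing in for data.csv
df <- data.frame(
  outcome = c(1, 2, 1, 2),
  V1      = c(0.5, NA, 1.5, 2.5),
  V2      = c(10, 20, NA, 40)
)

# lapply() on a data frame returns a plain list
as_list <- lapply(df, as.numeric)
class(as_list)   # "list"

# Assigning into df[] keeps the data.frame structure
df[] <- lapply(df, as.numeric)
class(df)        # "data.frame"

# Impute on the data frame, then count remaining NAs per column
fixed <- na.roughfix(df)
colSums(is.na(fixed))
```

If the column counts come back as zero here but not for the list version, the class change in the coercion step would be the likely cause of the na.fail error.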