Code:

ranger(outcome~., data, num.trees=500, probability=TRUE)

Error: Missing data in columns

Does the data need to be in a particular format? How do I get past this error?

Werner Hertzog
helicon

1 Answer

ranger does not accept missing values, so you need to remove the rows that contain NAs before fitting. Example:

ranger(outcome~., data[complete.cases(data),], num.trees=500, probability=TRUE)
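Before dropping rows, it can help to check which columns contain NAs and how many rows complete.cases() would discard. A minimal base R sketch, assuming a small made-up data frame `toy` in place of your `data` (the column names `x1` and `x2` are hypothetical):

```r
# Toy data frame (hypothetical); stands in for `data` from the question.
toy <- data.frame(
  outcome = factor(c("a", "b", "a", "b", "a")),
  x1 = c(1.2, NA, 3.1, 4.8, 5.0),
  x2 = c(10, 20, NA, 40, 50)
)

# Number of missing values per column; columns with a non-zero count
# are the ones that trigger the error.
na_counts <- colSums(is.na(toy))
na_cols <- names(na_counts)[na_counts > 0]

# Keep only the rows that have no missing value in any column.
toy_complete <- toy[complete.cases(toy), ]
n_dropped <- nrow(toy) - nrow(toy_complete)
```

If `n_dropped` is a large share of the rows, imputing is probably a better choice than deleting.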

Alternatively, you can impute (fill in) the NAs with a package such as mice or miceFast. Another simple option is to impute each column with values drawn at random from that column's observed values:

data_cs = data.frame(Map(function(x) Hmisc::impute(x,'random'), data))
ranger(outcome~., data_cs, num.trees=500, probability=TRUE)
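If you would rather not depend on Hmisc, the same idea can be sketched in base R. This is only an illustration under assumptions: `impute_random` is a hypothetical helper, not part of any package, and `toy` is made-up data standing in for `data`.

```r
# Hypothetical helper: replace the NAs in one column with values sampled
# (with replacement) from that column's observed values.
impute_random <- function(x) {
  missing <- is.na(x)
  if (any(missing) && !all(missing)) {
    observed <- x[!missing]
    # Sample indices rather than values, so that a single observed number
    # is not interpreted by sample() as the range 1:value.
    idx <- sample.int(length(observed), sum(missing), replace = TRUE)
    x[missing] <- observed[idx]
  }
  x
}

set.seed(1)  # make the random draws reproducible

# Toy data frame (hypothetical); stands in for `data` from the question.
toy <- data.frame(
  outcome = factor(c("a", "b", "a", "b", "a")),
  x1 = c(1.2, NA, 3.1, 4.8, 5.0),
  x2 = c(10, 20, NA, 40, 50)
)

# Apply the helper column by column, keeping the data frame structure.
toy_imputed <- toy
toy_imputed[] <- lapply(toy, impute_random)
```

Keep in mind that each column is filled independently here, so the imputed values ignore any relationship between columns; mice-style imputation takes those relationships into account.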
polkas