I'm having issues evaluating a ranger model. In both attempts below I am unable to subset the predictions (I want the first column of rf.trnprob).

rangerModel = ranger(outcome~., data=traindata, num.trees=200, probability=TRUE)
rf.trnprob = predict(rangerModel, traindata, type='prob')


trainscore <- subset(traindata, select=c("outcome"))
trainscore$score<-rf.trnprob[, 1]  

Error:

incorrect number of dimensions

table(pred = rf.trnprob, true=traindata$outcome)

Error:

all arguments must have the same length

helicon
Try `trainscore$score <- rf.trnprob$predictions[, 1]`. The output of `predict` is not just a matrix of probabilities. See `?predict.ranger`. – nicola Sep 25 '20 at 08:26

1 Answer

It looks like the predict function is called incorrectly: predict.ranger has no 'prob' prediction type. For a forest grown with probability=TRUE, the default type='response' already returns the class probabilities. Using an example dataset:

library(ranger)
traindata = iris
traindata$Species = factor(as.numeric(traindata$Species=="versicolor"))
rangerModel = ranger(Species~., data=traindata, probability=TRUE)
rf.trnprob = predict(rangerModel, traindata, type='response')

The probabilities are stored in the predictions element, one column for each class:

head(rf.trnprob$predictions)
             0           1
[1,] 1.0000000 0.000000000
[2,] 0.9971786 0.002821429
[3,] 1.0000000 0.000000000
[4,] 1.0000000 0.000000000
[5,] 1.0000000 0.000000000
[6,] 1.0000000 0.000000000
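
To connect this back to the first error in the question, here is a minimal sketch of pulling the first probability column into a score data frame. It reuses the iris stand-in from this answer rather than the asker's data; the seed value is an arbitrary assumption added only for reproducibility.

```r
library(ranger)

# Same toy setup as in this answer: a binary outcome built from iris.
traindata <- iris
traindata$Species <- factor(as.numeric(traindata$Species == "versicolor"))

set.seed(1)  # arbitrary seed, not part of the original post
rangerModel <- ranger(Species ~ ., data = traindata, probability = TRUE)
rf.trnprob <- predict(rangerModel, traindata)

# predict() returns a "ranger.prediction" object (a list), not a bare matrix,
# which is why matrix-style indexing on rf.trnprob itself fails.
class(rf.trnprob)
dim(rf.trnprob$predictions)  # one row per observation, one column per class

# Take the first probability column from the predictions matrix.
trainscore <- subset(traindata, select = "Species")
trainscore$score <- rf.trnprob$predictions[, 1]
head(trainscore)
```

Checking colnames(rf.trnprob$predictions) first is a cheap way to confirm which class the first column refers to.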

It also looks like you want a confusion matrix. You can get the predicted class for each row by taking the column with the highest probability:

pred = levels(traindata$Species)[max.col(rf.trnprob$predictions)]

Then:

table(pred,traindata$Species)
pred   0   1
   0 100   2
   1   0  48
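
A slightly more defensive variant of the same idea is sketched below. It reads the class labels from the column names of the probability matrix instead of assuming they match the factor levels in order, and breaks ties deterministically (max.col picks randomly among ties by default). The seed and the use of the out-of-bag probabilities stored on the fitted model are additions for illustration, not part of the original answer.

```r
library(ranger)

traindata <- iris
traindata$Species <- factor(as.numeric(traindata$Species == "versicolor"))

set.seed(1)  # arbitrary seed, not part of the original post
rangerModel <- ranger(Species ~ ., data = traindata, probability = TRUE)
prob <- predict(rangerModel, traindata)$predictions

# Label each row with the name of its highest-probability column.
pred <- factor(colnames(prob)[max.col(prob, ties.method = "first")],
               levels = levels(traindata$Species))

# In-sample confusion matrix: optimistic, because the forest was fit on these rows.
table(pred = pred, true = traindata$Species)

# Out-of-bag probabilities kept on the model object give a less optimistic view.
oob <- rangerModel$predictions
oob_pred <- factor(colnames(oob)[max.col(oob, ties.method = "first")],
                   levels = levels(traindata$Species))
table(pred = oob_pred, true = traindata$Species)
```

Fixing the factor levels keeps the table square even if one class is never predicted.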
StupidWolf