I am trying to create a decision tree using the rpart package in R. To arrive at the optimal depth for the tree I am using the plotcp function. When I use printcp to analyze the results of the cross-validation, among other details, I get the following message:
Root node error: 3599.8/14399 = 0.25
My classes are unbalanced (Class 1, C1: 75%; Class 2, C2: 25%). rpart seems to be using a default probability threshold of 0.5, and since none of the nodes has a probability > 0.5 for class C2, every observation gets classified as C1.
Is it possible for me to specify the probability threshold myself? For example: if prob > 0.35 for C2, classify the observation as C2.
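To make the desired behaviour concrete, here is a minimal sketch of thresholding applied after fitting, using predict() with type = "prob". The data set, the variable names (dat, x1, x2, y) and the simulated class balance are hypothetical stand-ins for my real data; only the 0.35 cutoff comes from the question above.

```r
library(rpart)

# Hypothetical data: a two-class outcome in which C2 is the minority class.
set.seed(1)
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
p_c2 <- plogis(-1.5 + 1.2 * x1)  # assumed relationship, for illustration only
y  <- factor(ifelse(runif(n) < p_c2, "C2", "C1"), levels = c("C1", "C2"))
dat <- data.frame(y, x1, x2)

# Fit a classification tree with the default settings.
fit <- rpart(y ~ x1 + x2, data = dat, method = "class")

# Class probabilities per observation; columns are named after the factor levels.
prob <- predict(fit, newdata = dat, type = "prob")

# Apply a custom cutoff instead of taking the most probable class.
threshold <- 0.35
pred <- factor(ifelse(prob[, "C2"] > threshold, "C2", "C1"),
               levels = c("C1", "C2"))

# Compare the thresholded predictions with the observed classes.
table(observed = dat$y, predicted = pred)
```

This only changes how the fitted probabilities are turned into labels; the tree itself, and the error figures reported by printcp, are unaffected. As far as I can tell, rpart's parms argument (its prior and loss components) is the place to influence how the tree is grown, but that is a different mechanism from a post-hoc cutoff.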