
I am using the rpart function to fit a decision tree that predicts owner / non-owner from a set of variables. Below is an excerpt of the output:

Node number 2: 8 observations,    complexity param=0.08333333
  predicted class=non-owner  expected loss=0.125  P(node) =0.3333333
    class counts:     7     1
   probabilities: 0.875 0.125

From the above output, Node 2 is labelled with the predicted class "non-owner" because 7 of its 8 observations (probability 0.875) are non-owners. I want to change the default threshold used to label a node. To elaborate: assuming the default is 0.5, a node is labelled non-owner when the proportion of non-owners in it is greater than 0.5. I want to raise that to 0.9, i.e. a node should be labelled non-owner only when more than 9 out of 10 of its observations are non-owners, and owner otherwise.
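To make the rule concrete, here is a minimal sketch of applying a 0.9 cutoff to the class probabilities after the tree is fitted. The helper label_node, the toy data frame, and its column names (income, lot_size, ownership) are all hypothetical and only for illustration. This relabels predictions; it does not change the labels that summary() prints for each node.

```r
library(rpart)

# Hypothetical helper: label a node "non-owner" only when the non-owner
# probability is strictly above the chosen threshold, otherwise "owner".
label_node <- function(p_non_owner, threshold = 0.9) {
  ifelse(p_non_owner > threshold, "non-owner", "owner")
}

# Node 2 from the output above has P(non-owner) = 0.875
label_node(0.875)                   # below 0.9, so labelled "owner"
label_node(0.875, threshold = 0.5)  # default-style cutoff gives "non-owner"

# Made-up toy data (assumption: the real data has a two-level factor response)
set.seed(1)
toy <- data.frame(
  income    = c(rnorm(12, 60, 10), rnorm(12, 90, 10)),
  lot_size  = c(rnorm(12, 17, 2),  rnorm(12, 20, 2)),
  ownership = factor(rep(c("non-owner", "owner"), each = 12))
)

fit <- rpart(ownership ~ income + lot_size, data = toy, method = "class",
             control = rpart.control(minsplit = 4))

# One row per observation, one column per class level
prob <- predict(fit, newdata = toy, type = "prob")

# Apply the custom cutoff instead of the majority-class label
custom_label <- label_node(prob[, "non-owner"], threshold = 0.9)
table(custom_label)
```

rpart also accepts a loss matrix through parms = list(loss = ...), which makes one kind of misclassification costlier than the other. A 9:1 cost ratio corresponds to a 0.9 cutoff in principle, but it may also change which splits are chosen, so the resulting tree can differ from the one shown here.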

Sanjeev G
