
I am new to decision trees. I am testing the `rpart` function to build a decision tree for a logical variable, using continuous as well as factor variables as predictors.

The variable I am trying to predict is FALSE 88% of the time. Is it possible to tell `rpart` to predict TRUE instead of FALSE, as that is what I am interested in?

Thanks for your time!

NewML
  • I don't understand the issue. If it predicts `false`, then it automatically predicts `true` whenever `false` is not predicted, doesn't it? Could you elaborate: what sort of decision tree are you using? Classification, I suppose? Also, your sample is not balanced, so you are getting `false` predicted a lot because there is a substantial proportion of falses in your data; perhaps `true` does not win a single leaf. Try to visualise the tree if it's not too complex. – Jan Sila Oct 24 '16 at 05:14
  • Hi @JanSila, thanks for the reply! I am using a classification tree, and most of my population is **false** rather than **true**. Since it's not a balanced sample, I want to find the features that are important for predicting the **true** cases rather than the **false** ones, as that will matter when I grow the whole tree and prune it later. – NewML Oct 24 '16 at 06:39
  • I would first [plot the tree](http://blog.revolutionanalytics.com/2013/06/plotting-classification-and-regression-trees-with-plotrpart.html). In general, you can't make it predict `true` unless `true` is the majority in an end node, or at least present in a decent proportion relative to the alternative. Maybe try a different classifier? There is quite a [good discussion here](http://stats.stackexchange.com/questions/28029/training-a-decision-tree-against-unbalanced-data). I also think your question belongs on CV rather than SO, as it is methodological rather than about coding :/ – Jan Sila Oct 24 '16 at 07:29
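
To make the point in the comments concrete, here is a minimal sketch of how a classification tree labels an end node and why `true` rarely wins one with an 88/12 split. It is plain Python rather than R, and it is not rpart's implementation: the function `leaf_label`, the costs `cost_fn` and `cost_fp`, and the leaf counts are all made up for illustration. With equal costs the leaf takes the majority class; weighting a missed `true` more heavily lets a leaf with a decent share of `true` cases flip. In rpart, the analogous knobs are the `loss` matrix and `prior` vector passed through `parms`.

```python
def leaf_label(n_false, n_true, cost_fn=1.0, cost_fp=1.0):
    """Return the label (True or False) with the lower expected loss in a leaf.

    cost_fn: hypothetical cost of predicting FALSE for a case that is TRUE.
    cost_fp: hypothetical cost of predicting TRUE for a case that is FALSE.
    With both costs equal to 1 this reduces to a majority vote.
    """
    p_true = n_true / (n_false + n_true)
    loss_if_false = p_true * cost_fn        # TRUE cases in the leaf are missed
    loss_if_true = (1 - p_true) * cost_fp   # FALSE cases in the leaf are mislabelled
    return loss_if_true < loss_if_false


# A hypothetical leaf holding 70 FALSE and 30 TRUE cases.
leaf = (70, 30)
default_label = leaf_label(*leaf)                # equal costs: majority class wins
weighted_label = leaf_label(*leaf, cost_fn=5.0)  # a missed TRUE costs 5 times more

# The unsplit data (88% FALSE) keeps its label even at a 5:1 cost ratio.
root_label = leaf_label(88, 12, cost_fn=5.0)

print(default_label, weighted_label, root_label)
```

The 70/30 leaf flips to `true` once a missed `true` is penalised enough, while the 88/12 root does not, so the splits still have to isolate regions where `true` is reasonably common. How large a cost ratio is sensible depends on the problem and is not something this sketch can decide.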

0 Answers