I hope this isn't too vague, but I am building a decision tree on data with a big class imbalance (1% hit rate) and some poor predictors.

The default settings for rpart don't even split the tree. Changing the sensitivity (cp) does do something, but it seems to just isolate tiny groups with a really high hit rate (50%).

I'm happy just to find more general rules in my data that isolate a larger group with a lower (like 5%) hit rate. It doesn't need to be better than that.

A low sensitivity (cp) and a high minbucket don't seem to help either.

Is there anything I can do with rpart's settings to get these general rules for my dataset?

khhc

1 Answer

With this information, I think I would lower cp and minsplit. If that doesn't work, I would oversample the underrepresented class. Hope it helps!
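A rough sketch of both steps, in case it is useful. The data frame `d`, the predictors `x1` and `x2`, the outcome `hit`, and all the numeric settings are made up for illustration (they are assumptions, not taken from the question), so treat this as a starting point rather than a recipe. The large `minbucket` values are also my own addition, used only to keep the leaves broad.

```r
library(rpart)

set.seed(42)

# Hypothetical toy data: two predictors and a rare outcome.
# The hit rate is 4% in a broad region (x1 > 0.7) and 0.5% elsewhere.
n  <- 20000
x1 <- runif(n)
x2 <- runif(n)
p  <- ifelse(x1 > 0.7, 0.04, 0.005)
d  <- data.frame(x1 = x1, x2 = x2,
                 hit = factor(rbinom(n, 1, p), levels = c(0, 1)))

# Step 1: lower cp and minsplit.
# minbucket = 500 is an assumption, chosen to stop the tree from chasing tiny groups.
fit_loose <- rpart(hit ~ x1 + x2, data = d, method = "class",
                   control = rpart.control(cp = 1e-4, minsplit = 20,
                                           minbucket = 500))
print(fit_loose)  # inspect whether any split survived

# Step 2: oversample the rare class by duplicating its rows
# until both classes have the same number of rows.
rare   <- which(d$hit == "1")
extra  <- sample(rare, size = sum(d$hit == "0") - length(rare), replace = TRUE)
d_over <- d[c(seq_len(n), extra), ]

# Note: minbucket now counts duplicated rows too.
fit_over <- rpart(hit ~ x1 + x2, data = d_over, method = "class",
                  control = rpart.control(cp = 0.01, minbucket = 1000))
print(fit_over)

# Judge the rules on the ORIGINAL data: size and real hit rate per leaf.
# Leaves are identified here by their predicted probability of a hit.
prob    <- predict(fit_over, newdata = d, type = "prob")[, "1"]
leaf_id <- factor(round(prob, 4))
rates   <- tapply(d$hit == "1", leaf_id, mean)
sizes   <- tapply(d$hit == "1", leaf_id, length)
print(data.frame(size = as.vector(sizes), hit_rate = as.vector(rates),
                 row.names = names(rates)))
```

The leaf probabilities from the oversampled tree are inflated, which is why the last part recomputes hit rates on the original rows. As an alternative that I have not compared on your data, rpart also accepts class priors through `parms = list(prior = c(0.5, 0.5))`, which reweights the classes without duplicating rows.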

Gere Caste