
I am working on a project that requires using the rpart package in R to build a decision tree.

However, my manager wants a specific variable to be the root node of the tree, because he considers that variable significant to our business.

Does anyone know how to force the tree to start with that specific variable? For instance, suppose the variable is named X.

Description of the dataset: the target variable is Y, with 30 other independent variables.

Code:

library(rpart)

tree <- rpart(
  Y ~ .,
  data    = train,
  method  = "class",
  parms   = list(split = "information"),
  control = rpart.control(cp = 0.0002, minsplit = 5, minbucket = 5, maxdepth = 10)
)
Vinícius Félix
Jason
  • What do you need? Are the splitting rules enough? Do you need a picture of the tree? Do you need to be able to cross-validate or is the training error good enough? – G5W Sep 28 '21 at 23:11
  • Hi, all of those are fine. I can build the tree and generate the plot. However, I want to force the root node of the tree to split on X. Do you know how to do that in code? Thank you – Jason Sep 29 '21 at 04:22

1 Answer


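That's not possible in rpart, as it goes against the logic of the algorithm: at every node, including the root, rpart picks the split that most improves the splitting criterion, and it has no option to fix the root variable. You have 2 options: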

  • Do the split manually by creating 2 decision tree models with data that you filter beforehand.
  • Use a different decision tree model that would allow this, such as lightgbm (be aware that the algorithm is different).
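
A minimal sketch of the first option, under some assumptions: the data frame below is made-up data standing in for your `train` (only X, Y, and two hypothetical predictors V1 and V2), and `predict_forced` is a helper name I invented for illustration, not part of rpart. The idea is to do the root split on X by hand with `split()`, fit one rpart tree per level of X, and route new rows to the matching tree at prediction time.

```r
library(rpart)

# Made-up data standing in for `train`: class target Y, a 3-level factor X,
# and two hypothetical predictors V1 and V2.
set.seed(42)
n <- 600
train <- data.frame(
  X  = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  V1 = rnorm(n),
  V2 = runif(n)
)
signal <- train$V1 + (train$X == "a") + rnorm(n, sd = 0.5)
train$Y <- factor(ifelse(signal > 0.5, "yes", "no"))

# maxdepth is 9 rather than 10 because the manual split on X uses one level.
ctrl <- rpart.control(cp = 0.0002, minsplit = 5, minbucket = 5, maxdepth = 9)

# The "root split" done by hand: one subset per level of X, one tree per subset.
# X is dropped from the formula because it is constant within each subset.
subsets <- split(train, train$X)
trees <- lapply(subsets, function(d) {
  rpart(Y ~ . - X, data = d, method = "class",
        parms = list(split = "information"), control = ctrl)
})

# Hypothetical helper: send each row to the tree fitted for its level of X.
predict_forced <- function(trees, newdata) {
  pred <- rep(NA_character_, nrow(newdata))
  for (lev in names(trees)) {
    idx <- which(newdata$X == lev)
    if (length(idx) > 0) {
      pred[idx] <- as.character(
        predict(trees[[lev]], newdata[idx, , drop = FALSE], type = "class")
      )
    }
  }
  pred
}

pred <- predict_forced(trees, train)
mean(pred == train$Y)  # training accuracy of the combined model
```

Each subtree is an ordinary rpart object, so it can be inspected with `printcp()` and pruned on its own with `prune()`. There is no single combined tree object in this sketch; the "combination" happens only at prediction time.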
anymous.asker
  • Thank you for the suggestion. I am new to R, so regarding your first suggestion ("Do the split manually by creating 2 decision tree models with data that you filter beforehand"): how do I do that in code? For instance, X has 3 levels. Is the first step tree <- rpart(Y~X, method="class", data=train, maxdepth = 1)? What should the next step be? And how do I combine the trees afterwards and prune them? Thank you! – Jason Sep 29 '21 at 18:03
  • Or, as you mentioned, I could use lightgbm instead. I have read about the package but still cannot find the parameter that forces a split. Any link or sample code would be much appreciated. Many thanks – Jason Sep 29 '21 at 18:28