
I am building a regressor using decision trees, and I am trying to find the best combination of the four main parameters I want to tune: cost complexity (cp), maximum depth (maxdepth), minimum split (minsplit), and minimum bucket size (minbucket).

I know there are ways to determine the cost-complexity (cp) parameter on its own, but how can I determine all four parameters together so that the end result has the least error?

Reproducible example below:

library(rpart)
library(MASS)  # provides the Boston housing data

set.seed(1234)
train_index <- sample(nrow(Boston),0.75*nrow(Boston))
boston_train <- Boston[train_index,]   
boston_test <- Boston[-train_index,]

# numbers are just representative, with no real significance
prune_control <- rpart.control(maxdepth = 5, cp = 0.005, minbucket = 20, minsplit = 20)
boston.rpart <- rpart(medv ~ .,data = boston_train, method = "anova", control = prune_control)


train_pred <- predict(object = boston.rpart)
test_pred <- predict(boston.rpart, boston_test)
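Since rpart has no built-in tuner for all four parameters at once, one option (along the lines of the grid search suggested in the comments) is to loop over a small grid of candidate values and score each combination with rpart's own cross-validated error (the xerror column of cptable, from its default 10-fold CV). A minimal sketch, continuing from the training split above; the candidate ranges are illustrative, not recommendations:

```r
library(rpart)
library(MASS)  # Boston housing data

set.seed(1234)
train_index <- sample(nrow(Boston), 0.75 * nrow(Boston))
boston_train <- Boston[train_index, ]

# Candidate values for each hyperparameter (illustrative ranges only)
grid <- expand.grid(
  cp        = c(0.001, 0.005, 0.01),
  maxdepth  = c(3, 5, 7),
  minsplit  = c(10, 20, 40),
  minbucket = c(5, 10, 20)
)

# Fit one tree per combination and record the minimum cross-validated
# relative error reported in the tree's cptable
cv_error <- function(params) {
  ctrl <- rpart.control(cp = params$cp, maxdepth = params$maxdepth,
                        minsplit = params$minsplit, minbucket = params$minbucket)
  fit <- rpart(medv ~ ., data = boston_train, method = "anova", control = ctrl)
  min(fit$cptable[, "xerror"])
}

grid$xerror <- sapply(seq_len(nrow(grid)), function(i) cv_error(grid[i, ]))
best <- grid[which.min(grid$xerror), ]
best  # the combination with the lowest cross-validated error
```

Note the caveat from the comments still applies: an exhaustive grid grows exponentially with the number of parameters (here 3^4 = 81 fits), so for larger grids a random or Bayesian search over the same ranges is cheaper.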
  • I'm not very familiar with R, but coming from the Python side of things, have you tried a grid search or a random search over the best combination of the 4 hyperparameters? You can define ranges and then run the search. – Vaibhav Mehrotra Sep 10 '20 at 11:43
  • I did try replicating the grid search as done in Python, but somehow the values gave worse results than not specifying any hyperparameters at all. – user12897935 Sep 10 '20 at 12:15
  • In that case, I'd suggest you shuffle your data, do a train/test split again, and check the CV score while training with the new hyperparameters. You can also choose to fix some hyperparameters. – Vaibhav Mehrotra Sep 10 '20 at 12:30
  • @user12897935 Don't use grid search, as it's a brute-force method. It's inefficient for high-dimensional data, since the number of evaluations grows exponentially with the number of hyperparameters. Don't use random search either, because it wastes a large number of function evaluations: it does not exploit the previously well-performing regions. I'd suggest **Bayesian Optimisation (BO)** because, unlike GS and RS, BO chooses future evaluation points based on the previously obtained results. – mnm Sep 21 '20 at 15:02

0 Answers