
I'm using the caret package to analyse Random Forest models built using ranger. I can't figure out how to call the train function using the tuneGrid argument to tune the model parameters.

I think I'm calling the tuneGrid argument wrong, but can't figure out why it's wrong. Any help would be appreciated.

data(iris)

library(ranger)
model_ranger <- ranger(Species ~ ., data = iris, num.trees = 500, mtry = 4,
                       importance = 'impurity')


library(caret)

# my tuneGrid object:
tgrid <- expand.grid(
  num.trees = c(200, 500, 1000),
  mtry = 2:4
)

model_caret <- train(Species  ~ ., data = iris,
                     method = "ranger",
                     trControl = trainControl(method="cv", number = 5, verboseIter = T, classProbs = T),
                     tuneGrid = tgrid,
                     importance = 'impurity'
)
Mark

1 Answer


Here is the syntax for ranger in caret:

library(caret)

Add a dot (.) before each tuning parameter name:

tgrid <- expand.grid(
  .mtry = 2:4,
  .splitrule = "gini",
  .min.node.size = c(10, 20)
)

caret can tune only these three parameters for ranger; the number of trees is not one of them. Instead, specify num.trees and importance directly in the call to train, which passes them on to ranger:

model_caret <- train(Species  ~ ., data = iris,
                     method = "ranger",
                     trControl = trainControl(method = "cv", number = 5, verboseIter = TRUE, classProbs = TRUE),
                     tuneGrid = tgrid,
                     num.trees = 100,
                     importance = "permutation")
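
As a quick check of which parameters caret exposes for tuning with ranger, modelLookup can be queried. This is a minimal sketch; it assumes a reasonably recent caret version, and older versions may list fewer parameters:

```r
library(caret)

# List the tuning parameters caret knows about for method = "ranger"
ranger_params <- modelLookup("ranger")
print(ranger_params[, c("parameter", "label")])

# num.trees is not in this list, so it cannot appear in tuneGrid
"num.trees" %in% ranger_params$parameter
```

Any column in tuneGrid that is not listed here causes train to stop with an error, which is what happens with the grid in the question.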

To get variable importance:

varImp(model_caret)

#output
             Overall
Petal.Length 100.0000
Petal.Width   84.4298
Sepal.Length   0.9855
Sepal.Width    0.0000

To check that num.trees is really passed through, set the number of trees to 1000 or more; the fit will be much slower. After changing to importance = "impurity", varImp gives:

#output:

             Overall
Petal.Length  100.00
Petal.Width    81.67
Sepal.Length   16.19
Sepal.Width     0.00
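
A more direct check than timing is to inspect the fitted model. The sketch below assumes that train stores the underlying ranger object in finalModel and that the ranger object records its settings in the num.trees and importance.mode fields; the seed is an arbitrary choice for reproducibility:

```r
library(caret)
library(ranger)

data(iris)
set.seed(1)  # arbitrary seed, only for reproducibility

tgrid <- expand.grid(
  mtry = 2:4,
  splitrule = "gini",
  min.node.size = c(10, 20)
)

model_caret <- train(Species ~ ., data = iris,
                     method = "ranger",
                     trControl = trainControl(method = "cv", number = 5),
                     tuneGrid = tgrid,
                     num.trees = 100,
                     importance = "impurity")

# Settings that actually reached ranger
model_caret$finalModel$num.trees
model_caret$finalModel$importance.mode

# Winning combination from the grid
model_caret$bestTune
```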

If it does not work, I recommend installing the latest ranger from CRAN and caret from GitHub:

devtools::install_github('topepo/caret/pkg/caret')

To tune the number of trees, you can use lapply with fixed folds created by createMultiFolds or createFolds, so that every value of num.trees is evaluated on the same resamples.
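
A minimal sketch of that approach follows. The candidate tree counts, the seed, and the names folds, ctrl, fits and best_acc are illustrative choices, not part of caret:

```r
library(caret)
library(ranger)

data(iris)
set.seed(42)  # arbitrary seed so the folds are reproducible

# Fixed training indices, reused for every fit so results are comparable
folds <- createFolds(iris$Species, k = 5, returnTrain = TRUE)
ctrl <- trainControl(method = "cv", index = folds)

tgrid <- expand.grid(
  mtry = 2:4,
  splitrule = "gini",
  min.node.size = c(10, 20)
)

# Candidate values for the number of trees (illustrative)
tree_counts <- c(200, 500, 1000)

# One caret fit per num.trees value, all on the same folds
fits <- lapply(tree_counts, function(n) {
  train(Species ~ ., data = iris,
        method = "ranger",
        trControl = ctrl,
        tuneGrid = tgrid,
        num.trees = n)
})
names(fits) <- paste0("trees_", tree_counts)

# Best cross-validated accuracy reached for each num.trees value
best_acc <- sapply(fits, function(f) max(f$results$Accuracy))
best_acc
```

Because the folds are shared, differences in best_acc reflect the change in num.trees (plus the randomness of the forest itself) rather than a different split of the data.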

EDIT: While the above example works with caret version 6.0-84, using the hyperparameter names without the leading dots works as well:

tgrid <- expand.grid(
  mtry = 2:4,
  splitrule = "gini",
  min.node.size = c(10, 20)
)
missuse
  • Why do you need to add the dot prior to the tuning parameters? – dule arnaux Sep 07 '19 at 11:30
  • @dule arnaux This is just how the creator of `caret` defined them. – missuse Sep 07 '19 at 13:41
  • @missuse The caret documentation doesn't have the dots in the tuning parameters it describes: https://topepo.github.io/caret/available-models.html. And for other models I've run it without the dots. So I'm wondering if this is some older legacy way of doing it? – dule arnaux Sep 11 '19 at 13:21
  • @dule arnaux Quite possibly my answer is deprecated. I will check and update the answer. – missuse Sep 11 '19 at 17:17