I'm working with ranger, a fast implementation of Random Forests. The problem is that I have no idea how to interpret the $forest component of the result. The documentation says only:

forest: Saved forest (If write.forest set to TRUE). Note that the variable IDs in the split.varIDs object do not necessarily represent the column number in R.

That isn't very helpful, so I tried inspecting its components myself, but their names are not self-explanatory.

> names(ranger(Species ~ ., data = iris)$forest)
 [1] "dependent.varID"            "num.trees"
 [3] "child.nodeIDs"              "split.varIDs"
 [5] "split.values"               "is.ordered"
 [7] "class.values"               "levels"
 [9] "independent.variable.names" "treetype"

Some components, like num.trees, are trivial to understand, but others, like child.nodeIDs, are much harder to make sense of.

> ranger(Species ~ ., data = iris)$forest$child.nodeIDs[[1]]
[[1]]
 [1]  1  3  5  0  7  9 11  0  0  0 13 15  0  0  0  0  0

[[2]]
 [1]  2  4  6  0  8 10 12  0  0  0 14 16  0  0  0  0  0

Is it documented somewhere?

Bhargav Rao
nalzok

1 Answer

See the documentation for the ranger::treeInfo function, which returns a single tree of the forest as a data frame with one row per node: https://www.rdocumentation.org/packages/ranger/versions/0.11.2/topics/treeInfo
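
To illustrate how the child.nodeIDs output in the question can be read, here is a minimal base-R sketch that does not need ranger installed. It rests on an assumption that is not stated in the quoted documentation: node IDs are 0-based, the i-th element of each vector describes node i - 1, the first vector holds left children and the second right children, and a child ID of 0 marks a terminal node (the root, node 0, can never be a child). The data frame name `tree` and its column names are chosen for illustration only.

```r
# Left and right child vectors copied from the output in the question.
left  <- c(1, 3, 5, 0, 7, 9, 11, 0, 0, 0, 13, 15, 0, 0, 0, 0, 0)
right <- c(2, 4, 6, 0, 8, 10, 12, 0, 0, 0, 14, 16, 0, 0, 0, 0, 0)

# Assumption: node IDs are 0-based and a child ID of 0 means "no child",
# i.e. the node is terminal. Replace those zeros with NA for readability.
tree <- data.frame(
  nodeID     = seq_along(left) - 1,
  leftChild  = ifelse(left == 0, NA, left),
  rightChild = ifelse(right == 0, NA, right)
)

# A node with no left child is treated as a leaf.
tree$terminal <- is.na(tree$leftChild)

print(tree)
```

Under that reading, node 0 splits into nodes 1 and 2, node 1 into nodes 3 and 4, and so on. With ranger installed, calling treeInfo(rf, tree = 1) on a fitted model rf should give a comparable table that also includes the split variable and split value for each node.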

user1808924