I would like to convert an rpart tree object into a nested list of lists (a dendrogram). Ideally, the attributes of each node would carry the information stored in the rpart object (impurity, the variable and rule used for splitting, the number of observations that reach that node, etc.).
I looked at the frame component of the rpart object (fit$frame in the example below), but it is not clear to me how to read it. Any suggestions?
Tiny example:
library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
fit$frame
      var  n wt dev yval complexity ncompete nsurrogate   yval2.V1    yval2.V2    yval2.V3   yval2.V4   yval2.V5 yval2.nodeprob
 1  Start 81 81  17    1 0.17647059        2          1 1.00000000 64.00000000 17.00000000 0.79012346 0.20987654     1.00000000
 2  Start 62 62   6    1 0.01960784        2          2 1.00000000 56.00000000  6.00000000 0.90322581 0.09677419     0.76543210
 4 <leaf> 29 29   0    1 0.01000000        0          0 1.00000000 29.00000000  0.00000000 1.00000000 0.00000000     0.35802469
 5    Age 33 33   6    1 0.01960784        2          2 1.00000000 27.00000000  6.00000000 0.81818182 0.18181818     0.40740741
10 <leaf> 12 12   0    1 0.01000000        0          0 1.00000000 12.00000000  0.00000000 1.00000000 0.00000000     0.14814815
11    Age 21 21   6    1 0.01960784        2          0 1.00000000 15.00000000  6.00000000 0.71428571 0.28571429     0.25925926
22 <leaf> 14 14   2    1 0.01000000        0          0 1.00000000 12.00000000  2.00000000 0.85714286 0.14285714     0.17283951
23 <leaf>  7  7   3    2 0.01000000        0          0 2.00000000  3.00000000  4.00000000 0.42857143 0.57142857     0.08641975
 3 <leaf> 19 19   8    2 0.01000000        0          0 2.00000000  8.00000000 11.00000000 0.42105263 0.57894737     0.23456790
(The function ggdendro:::dendro_data.rpart might be helpful here, but I could not get it to solve the problem.)
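One possible way to read the frame, offered as a rough sketch and not as a finished solution: the row names of fit$frame are node numbers, with the root numbered 1 and the children of node k numbered 2k and 2k + 1, and rows whose var is "<leaf>" are terminal nodes. Under that reading, a small recursive helper can walk the frame and build a nested list. The helper name frame_to_list and the fields it copies are hypothetical choices for illustration; it does not attach split rules, and it produces a plain list, not an object of class "dendrogram".

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Hypothetical helper: recursively build a nested list for the node
# numbered `node`, looking rows up by their row name.
# Assumption: the children of node k are the rows named 2k and 2k + 1.
frame_to_list <- function(frame, node = 1) {
  row <- frame[as.character(node), ]
  out <- list(
    node = node,
    var  = as.character(row$var),  # "<leaf>" marks a terminal node
    n    = row$n,                  # observations reaching this node
    dev  = row$dev,                # deviance (impurity) at this node
    yval = row$yval                # fitted value at this node
  )
  # Internal nodes get a `children` element holding two sub-lists.
  if (out$var != "<leaf>") {
    out$children <- list(
      frame_to_list(frame, 2 * node),
      frame_to_list(frame, 2 * node + 1)
    )
  }
  out
}

tree <- frame_to_list(fit$frame)
str(tree, max.level = 2)
```

Looking rows up by row name instead of by position means the sketch does not depend on the order in which the frame lists its nodes.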