I have trained a classification tree with ctree from the partykit package, and I need to calculate classification probabilities for sub trees (not only for leaf nodes). For example, suppose a sub tree consists of 3 leaf nodes with the following probabilities:
leaf 1 (120 observations): 0.45
leaf 2 (160 observations): 0.49
leaf 3 (190 observations): 0.83
For this hypothetical sub tree the weighted average probability would be (120*0.45 + 160*0.49 + 190*0.83) / (120+160+190) ≈ 0.617
In the same way, I need to traverse the ctree object and calculate the weighted probability for every node recursively.
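The weighted average for the hypothetical sub tree can be checked with a few lines of base R. This is only a sketch of the arithmetic; the vector names `n` and `p` are made up for illustration and hold the leaf sizes and leaf probabilities listed above.

```r
# Leaf sizes and leaf probabilities of the hypothetical sub tree
# (n and p are illustrative names, not part of any package)
n <- c(120, 160, 190)
p <- c(0.45, 0.49, 0.83)

# Weighted average: sum(n * p) / sum(n)
sub_tree_prob <- weighted.mean(p, w = n)
sub_tree_prob
```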
I have this code:
library(partykit)
data(airquality)
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               control = ctree_control(maxsurrogate = 3))
traverse <- function(treenode){
  if(treenode$terminal){
    bas=paste("Current node is terminal node with",treenode$nodeID,'prediction',treenode$prediction)
    print(bas)
    return(0)
  } else {
    bas=paste("Current node",treenode$nodeID,"Split var. ID:",treenode$psplit$variableName,"split value:",treenode$psplit$splitpoint,'prediction',treenode$prediction)
    print(bas)
  }
  traverse(treenode$left)
  traverse(treenode$right)
}
This function, which traverses the tree, does not work on a partykit object. On the other hand, I have this code, which lists the probabilities for the leaf nodes only:
preds.ls <- list(predict(airct , type = "prob"))[1]
pred.probs.df <- unique(as.data.frame((preds.ls[[1]])))
Any suggestions on how to combine these two snippets into code that traverses a partykit object and calculates this weighted average would be appreciated.
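One possible direction, as a minimal sketch rather than a definitive answer: partykit's `nodeids()` can list the terminal nodes below any node, so the leaf probabilities from `predict(type = "prob")` can be averaged per sub tree, weighted by leaf size. The sketch assumes unit case weights and uses the iris data as a stand-in, because a factor response is needed to get class probabilities (Ozone in the airquality example is numeric). The names `ct`, `leaf_id`, `leaf_p`, `leaf_n`, and `subtree_prob` are hypothetical helpers, not part of partykit.

```r
library(partykit)

# Assumption: iris stands in for the real data (factor response required)
ct <- ctree(Species ~ ., data = iris)

# Terminal node id and class probabilities for each training observation
leaf_id   <- predict(ct, type = "node")
leaf_prob <- predict(ct, type = "prob")

# One row per leaf: class probabilities, plus the number of observations
first  <- !duplicated(leaf_id)
leaf_p <- leaf_prob[first, , drop = FALSE]
rownames(leaf_p) <- leaf_id[first]
leaf_n <- table(leaf_id)

# Weighted average of the leaf probabilities below node `id`
# (for a terminal node this is just the leaf's own probabilities)
subtree_prob <- function(id) {
  leaves <- as.character(nodeids(ct, from = id, terminal = TRUE))
  w <- as.numeric(leaf_n[leaves])
  colSums(leaf_p[leaves, , drop = FALSE] * w) / sum(w)
}

# One row per node (inner and terminal), one column per class
all_ids <- nodeids(ct)
node_probs <- t(sapply(all_ids, subtree_prob))
rownames(node_probs) <- all_ids
node_probs
```

The row for node 1 (the root) should match the overall class proportions in the training data, which is a quick sanity check on the weighting.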