So I'm working on a project with decision trees, and I need to know how to get the percentages for each of the nodes.

Here is my approximate code:

fit <- rpart(Y ~ a + b + c, method = "class", data = example, control = rpart.control(minsplit=5))

My main question is how to get, from the rpart output, the estimated class percentages for each row in example.

I have looked at the answer to the post "How to get percentages from decision tree for each node", and from my understanding (please correct me if I'm wrong), 33.3% of the data belong to class 2, 36% to class 4, and 30.67% to class 5. My question differs in that I need to know, for each entry, the percent chance of it being in class 2, the percent chance of it being in class 4, and so on.

Any help is appreciated. Thanks!

J Kang
  • how is your question different? please add a reproducible example – rawr Sep 26 '16 at 23:01
  • If I understand correctly, you would just want the predicted/fitted values from the tree. `predict(fit, example)` should give you this. For example, if I do `mod = rpart(Species ~ ., data = iris)`, and then `predict(mod)`, it will give the class probabilities for each row in the `iris` dataset. – jav Sep 26 '16 at 23:02
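
The suggestion in the comment above can be sketched as follows. This is only an illustration under assumptions: it uses the built-in `iris` data as a stand-in for `example`, and the object names `mod`, `probs`, and `pred_class` are made up for the sketch. With `type = "prob"`, `predict` on a classification tree returns one row per observation and one column per class.

```r
library(rpart)

# Stand-in classification tree (iris replaces the asker's `example` data).
mod <- rpart(Species ~ ., data = iris, method = "class")

# One row per observation, one column per class; each row sums to 1.
probs <- predict(mod, newdata = iris, type = "prob")
head(probs)

# The single most likely class for each row, if that is also needed.
pred_class <- predict(mod, newdata = iris, type = "class")
```

Multiplying `probs` by 100 gives the per-row percentages. Every row that falls into the same leaf gets the same probabilities, because they come from the class proportions in that leaf.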

1 Answer

In our environment we cannot use the R rattle library, which includes fancyRpartPlot (a wrapper for plotting rpart trees using prp). The workaround I found is to pass prp a function that reads the rpart object's frame, as follows:

library(rpart)
# Label each node with its mean response, row count, and share of all rows.
node.fun <- function(x, labs, digits, varlen) {
    avg  <- sprintf("%0.1f", x$frame$yval)
    # wt[1] is the weight of the root node, i.e. the whole data set
    pct  <- sprintf("%1.1f%%", 100 * x$frame$wt / x$frame$wt[1])
    rows <- format(x$frame$n, big.mark = ",")
    paste0(avg, "\n", " n=", rows, "   ", pct)
}
fit <- rpart(skips ~ Opening + Solder + Mask + PadType + Panel, data = solder, method = "anova")
# Requires the rpart.plot package; node.fun supplies the node labels.
rpart.plot::prp(fit, main = "Formatted averages (no scientific notation) and percentages calculated",
                varlen = 0, faclen = 0, fallen.leaves = TRUE, shadow.col = "gray",
                nn = TRUE, type = 4, extra = 101, box.palette = "Greens",
                compress = TRUE, tweak = 1, node.fun = node.fun)

This produces a tree plot similar to the one you would get from:

library(rattle)
fancyRpartPlot(fit)
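
If only the numbers are needed, the same frame can be read without plotting anything. The following is a minimal sketch under assumptions: it uses the built-in `iris` data rather than the data above, and `tree`, `node_summary`, and `leaves` are names made up for the illustration.

```r
library(rpart)

# Stand-in tree; any fitted rpart object has a `frame` with one row per node.
tree <- rpart(Species ~ ., data = iris, method = "class")

node_summary <- data.frame(
    node    = rownames(tree$frame),                      # node numbers, as shown by nn = TRUE
    split   = as.character(tree$frame$var),              # "<leaf>" marks terminal nodes
    n       = tree$frame$n,                              # rows reaching the node
    percent = 100 * tree$frame$wt / tree$frame$wt[1]     # share of all rows (root = 100)
)

# Keep only the terminal nodes; their shares add up to 100.
leaves <- node_summary[node_summary$split == "<leaf>", ]
print(leaves)
```

These are shares of the data per node, not per-row class probabilities; for the latter, `predict` with `type = "prob"` is the usual route.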
calycolor