In the PST package we use the value C as a cut-off for the information gain function used to prune the tree. The C value, for an alpha of 0.05, is calculated as follows:
C95 <- qchisq(0.95, 1) / 2
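To make the question concrete, here is a minimal sketch of how I read that cut-off. It assumes (and this is my assumption, not something I have confirmed in the package) that the gain is half a likelihood-ratio statistic, so comparing the gain with C is the same as comparing twice the gain with the chi-square quantile. The `gain` value is a made-up number for illustration only.

```r
# Cut-off for alpha = 0.05 with 1 degree of freedom, as in the line above
alpha <- 0.05
C95 <- qchisq(1 - alpha, df = 1) / 2   # about 1.92

# Hypothetical gain for a candidate node (illustrative number only)
gain <- 2.5

# Assumption: 2 * gain behaves like a likelihood-ratio statistic that is
# compared against a chi-square distribution with 1 degree of freedom
p_value <- pchisq(2 * gain, df = 1, lower.tail = FALSE)

keep_node   <- gain > C95        # threshold form of the rule
keep_node_p <- p_value < alpha   # same rule written as a p-value test
```

Under that assumption the two forms always agree, since `gain > C95` is just `2 * gain > qchisq(0.95, 1)` rearranged.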
What does it mean that the C value is based on an alpha of 0.05? Does it mean we need to be at least 95% certain that an additional node adds more information compared to previous nodes, in order for it to be retained by the pruning algorithm?