Assume the training data is "fruit", which I am going to use for prediction with a CART model in R:
> library(rpart)       # provides rpart()
> library(rpart.plot)  # provides prp()
> fruit = data.frame(
+   color = c("red", "red", "red", "yellow", "red", "yellow",
+             "orange", "green", "pink", "red", "red"),
+   isApple = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE,
+               FALSE, FALSE, FALSE, FALSE, TRUE))
> mod = rpart(isApple ~ color, data = fruit, method = "class", minbucket = 1)
> prp(mod)
Could anyone explain exactly what role minbucket plays
in building and plotting the CART tree for this example if we use minbucket
= 2, 3, 4, 5?
I have two variables, color and isApple. The color variable takes the values green, yellow, pink, orange and red. The isApple variable is TRUE or FALSE. In the example above, red appears six times: five rows are TRUE and one is FALSE. If I set minbucket = 1, 2 or 3, the tree splits. If I set minbucket = 4 or 5, no split occurs, even though red appears more than five times.
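
One way to narrow this down is to fit the same model for each minbucket value and look at the control settings rpart actually used. The sketch below assumes the rule described in ?rpart.control, that when only minbucket is supplied, minsplit is set to minbucket*3. If that holds, minsplit would be 3, 6, 9, 12 and 15 here, and any value above the 11 rows in fruit would stop the root node from being split at all, whatever the size of the red group. The names summary_rows and result are just illustrative, and color is converted to a factor explicitly so the behaviour does not depend on the R version.

```r
library(rpart)

fruit <- data.frame(
  color = factor(c("red", "red", "red", "yellow", "red", "yellow",
                   "orange", "green", "pink", "red", "red")),
  isApple = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE,
              FALSE, FALSE, FALSE, FALSE, TRUE)
)

# Fit one tree per minbucket value and record the controls rpart ended up with.
summary_rows <- lapply(1:5, function(mb) {
  mod <- rpart(isApple ~ color, data = fruit, method = "class", minbucket = mb)
  data.frame(
    minbucket = mod$control$minbucket,
    minsplit  = mod$control$minsplit,            # derived by rpart, not supplied here
    n_obs     = nrow(fruit),
    n_leaves  = sum(mod$frame$var == "<leaf>")   # 1 leaf means the root was never split
  )
})
result <- do.call(rbind, summary_rows)
print(result)
```

If the minsplit column is what blocks the split, passing minsplit explicitly together with minbucket (for example minsplit = 2) would be the next thing to try.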