I'm using the caret package to fit a classification tree. As I understand it, caret uses cross-validation (CV) to choose the complexity parameter (cp) used to prune the tree.
This is the code I use:
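library(caret)

# Random split: 2/3 learning set, 1/3 test set
id2 <- sample(seq_len(nrow(data)), floor(2/3 * nrow(data)))
app <- data[id2, ]    # learning set
test <- data[-id2, ]  # test set

# 8-fold CV; the tuning parameter is selected by area under the ROC curve
ctrl <- trainControl(method = "cv", number = 8, classProbs = TRUE,
                     summaryFunction = twoClassSummary)
mod0 <- train(class ~ ., data = app, method = "rpart",
              trControl = ctrl, metric = "ROC")

# Resampled ROC against the tuning parameter, then the selected tree
plot(mod0)
plot(mod0$finalModel, uniform = TRUE, margin = 0.1)
text(mod0$finalModel, cex = 0.8)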
Here is my data: https://drive.google.com/open?id=1xrCXTLqKvGiGeo2X0Y1DvoSKvzbYFnyccLimceDIbZg
But every time I run the code I get a tree of a different complexity (because of the random CV folds?), and the final tree does not look pruned at all: it is very complex, with a lot of terminal nodes.
How can I get a less complex tree?
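To make the run-to-run variation concrete, here is a minimal sketch of two things that might be relevant; it is not a confirmed fix. It fixes the random seed, so the split and the CV folds are the same on every run. It also asks caret for the simplest tree whose resampled ROC is within one standard error of the best, via `selectionFunction = "oneSE"` in `trainControl`. The two-class subset of the built-in `iris` data, the renamed `class` column, the seed value, and `tuneLength = 10` are all assumptions made only so the example runs on its own; they stand in for my real data and settings.

```r
library(caret)

# Stand-in data (assumption): two-class subset of iris, outcome renamed to `class`
dat <- droplevels(iris[iris$Species != "setosa", ])
names(dat)[names(dat) == "Species"] <- "class"

# Fixing the seed makes the split and the CV folds repeatable across runs
set.seed(42)
id2 <- sample(seq_len(nrow(dat)), floor(2/3 * nrow(dat)))
app <- dat[id2, ]    # learning set
test <- dat[-id2, ]  # test set

# Same CV setup as above, but pick the simplest model within one SE of the best ROC
ctrl <- trainControl(method = "cv", number = 8, classProbs = TRUE,
                     summaryFunction = twoClassSummary,
                     selectionFunction = "oneSE")

# tuneLength asks caret to evaluate up to 10 candidate cp values
mod1 <- train(class ~ ., data = app, method = "rpart",
              trControl = ctrl, metric = "ROC", tuneLength = 10)

# Selected cp and the number of terminal nodes in the final tree
mod1$bestTune
n_leaves <- sum(mod1$finalModel$frame$var == "<leaf>")
n_leaves
```

With the seed fixed, repeated runs should give the same tree. Whether the one-standard-error rule gives a noticeably smaller tree will depend on the data.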