Given that:

library(rpart)
data(iris)
fit <- rpart(Species ~ ., data = iris)
predict(fit)

Does this give a cross-validated prediction of the training data?

I did not find any confirmation for a CV prediction in the rpart documentation.

Thanks.

Guest3290

1 Answer

No. predict(fit) returns in-sample predictions on the training data set: class probabilities for classification trees, means for regression trees. These predictions are not cross-validated. The tree used for the prediction is the one shown by

fit

## n= 150 
## 
## node), split, n, loss, yval, (yprob)
##       * denotes terminal node
## 
## 1) root 150 100 setosa (0.33333333 0.33333333 0.33333333)  
##   2) Petal.Length< 2.45 50   0 setosa (1.00000000 0.00000000 0.00000000) *
##   3) Petal.Length>=2.45 100  50 versicolor (0.00000000 0.50000000 0.50000000)  
##     6) Petal.Width< 1.75 54   5 versicolor (0.00000000 0.90740741 0.09259259) *
##     7) Petal.Width>=1.75 46   1 virginica (0.00000000 0.02173913 0.97826087) *
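To make the distinction between in-sample and cross-validated predictions concrete, here is a minimal sketch. It assumes that xpred.rpart() from the rpart package is an acceptable way to get out-of-fold predictions; the variable names (xp, cv_class) are only illustrative. As far as I understand it, for classification trees the returned matrix holds integer class codes, with one column per complexity value and the last column belonging to the least-pruned tree.

```r
library(rpart)
data(iris)

set.seed(1)  # the cross-validation folds are assigned at random
fit <- rpart(Species ~ ., data = iris)

# In-sample predictions: a probability matrix by default, labels with type = "class"
prob_in  <- predict(fit)
class_in <- predict(fit, type = "class")

# Out-of-fold predictions: one row per observation, one column per cp value
xp <- xpred.rpart(fit, xval = 10)

# Assumption: entries are integer class codes; map them back to factor levels
cv_class <- factor(levels(iris$Species)[xp[, ncol(xp)]],
                   levels = levels(iris$Species))

# Confusion tables: resubstitution vs. cross-validated
table(observed = iris$Species, in_sample = class_in)
table(observed = iris$Species, cross_validated = cv_class)
```

The cross-validated table will usually show somewhat more misclassifications than the in-sample one, because each observation is predicted by a tree that was grown without it.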

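While fitting this tree, rpart also carries out a cross-validation (10-fold by default, controlled by the xval argument of rpart.control). It is used only to estimate the error for each value of the complexity parameter, e.g., look at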

fit$cptable

##     CP nsplit rel error xerror       xstd
## 1 0.50      0      1.00   1.16 0.05127703
## 2 0.44      1      0.50   0.70 0.06110101
## 3 0.01      2      0.06   0.09 0.02908608

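So in this case the fitted tree (the one with two splits) also has the lowest cross-validation error (see the xerror column). Note that xerror depends on the random fold assignment, so the exact values vary between runs. On other data sets you may need to prune the tree further, e.g., back to the cp value with the lowest xerror or according to the 1-SE rule.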
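Both pruning rules can be read off the cptable. The following is a rough sketch rather than a definitive recipe: the names cpt, best, one_se, fit_min and fit_1se are made up for illustration, and the seed is fixed only so that the cross-validated columns are reproducible.

```r
library(rpart)
data(iris)

set.seed(42)  # xerror and xstd depend on the random CV folds
fit <- rpart(Species ~ ., data = iris)
cpt <- fit$cptable

# Rule 1: keep the tree with the smallest cross-validated error
best    <- which.min(cpt[, "xerror"])
fit_min <- prune(fit, cp = cpt[best, "CP"])

# Rule 2 (1-SE): the simplest tree whose xerror is within one
# standard error of the minimum
threshold <- cpt[best, "xerror"] + cpt[best, "xstd"]
one_se    <- min(which(cpt[, "xerror"] <= threshold))
fit_1se   <- prune(fit, cp = cpt[one_se, "CP"])

fit_1se
```

On iris both rules will most likely return the unpruned tree again, which matches the table above; on noisier data the 1-SE tree is often smaller.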

Achim Zeileis