Differences between adabag and rpart

Question

I've found something strange (at least to me) when using the rpart and adabag packages in R (R version 3.5.1 (2018-07-02), "Feather Spray").

I'm wondering why the two packages produce different trees even when the parametrization is the same. Take a look at the code below:

library(rpart)
library(adabag)
set.seed(32323)

# Simulate two correlated predictors and a noisy binary response
N <- 1000
x <- rnorm(N)
y <- 0.6^2 * x + sqrt(1 - 0.6^2) * rnorm(N)
z <- rep(0, N)
for (i in 1:N) {
  if (x[i] - y[i] + 0.2 * rnorm(1) > 1.0) {
    z[i] <- 1
  }
}

myData <- data.frame(x, y, z)

# Single tree fitted directly with rpart on the numeric response
tree <- rpart(formula = z ~ ., data = myData, method = "anova",
              cp = 0, maxdepth = 10, minbucket = 30, xval = 10)
plot(tree, uniform = TRUE, compress = TRUE)
text(tree, use.n = FALSE, all = FALSE)
print(tree)

# Same data for adabag, which needs the response as a factor
myData.Ada <- myData
myData.Ada$z <- as.factor(myData$z)

# One boosting iteration without resampling, so only a single tree is grown
adaboost <- boosting(z ~ ., data = myData.Ada, boos = FALSE, mfinal = 1,
                     coeflearn = "Breiman",
                     control = rpart.control(method = "anova", cp = 0, maxdepth = 10,
                                             minbucket = 30, xval = 10))
plot(adaboost$trees[[1]], uniform = TRUE, compress = TRUE)
text(adaboost$trees[[1]], use.n = FALSE, all = FALSE)
print(adaboost$trees[[1]])

To me the parametrization is the same, but the trees are different. As far as I know, adabag uses rpart to create its trees, so what is the reason for this?

Regards, Wojtek

rpart creates a single tree, while boosting creates a number of trees where each new tree tries to minimize the prediction error of the previous model. By including the new tree in the overall model the model grows gradually, and thus allows for more accurate predictions than a single tree. — Wolf, Oct 22 '18 at 11:50
Yes, that's why I've put number of trees in adaboost to 1 to compare these two methods. — w.starosta, Oct 22 '18 at 14:20
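
One way to narrow this down might be to check which splitting method each tree was actually fitted with. The sketch below uses only rpart and rebuilds the data from the question. The names ctrl, tree_anova and tree_factor are illustrative and not part of the original code. It rests on two points: rpart.control() takes a ... argument, so a method = "anova" entry passed to it does not end up in the returned control list, and rpart() chooses its own method when the response is a factor and no method is given. The assumption, not confirmed here, is that adabag grows its tree in that second way, since boosting() requires a factor response.

```r
library(rpart)

# Rebuild the data from the question (same seed, same order of random draws)
set.seed(32323)
N <- 1000
x <- rnorm(N)
y <- 0.6^2 * x + sqrt(1 - 0.6^2) * rnorm(N)
z <- rep(0, N)
for (i in 1:N) {
  if (x[i] - y[i] + 0.2 * rnorm(1) > 1.0) z[i] <- 1
}
myData <- data.frame(x, y, z)
myData.Ada <- transform(myData, z = as.factor(z))

# Shared control settings, identical to those used in the question
ctrl <- rpart.control(cp = 0, maxdepth = 10, minbucket = 30, xval = 10)

# Regression tree on the numeric response, as in the direct rpart() call
tree_anova <- rpart(z ~ ., data = myData, method = "anova", control = ctrl)

# Tree on the factor response with no method given: rpart() picks the method itself
tree_factor <- rpart(z ~ ., data = myData.Ada, control = ctrl)

# Compare the methods that were actually used
print(tree_anova$method)
print(tree_factor$method)

# Check whether a method entry survives inside the control list
print("method" %in% names(rpart.control(method = "anova")))
```

If adabag is installed, printing adaboost$trees[[1]]$method after running the code from the question would show directly whether the boosted tree was grown as a regression tree or as a classification tree. If it turns out to be a classification tree, a fairer comparison would be a direct rpart() call with method = "class" on myData.Ada.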
