
I just created a gradient boosting model (gbm) whose out-of-sample prediction is worse than the random forest's: the MSE of the GBM is 10% higher than that of the random forest. Below is my sample code. I am not sure whether there is anything wrong with it.

library(gbm)

# Build the formula: response Y.idx against all columns except the first two
gbm.formula <- as.formula(paste0(Y.idx, '~', paste0(colnames(rf.tmp.train)[c(-1, -2)], collapse = '+')))

gbm1 <- gbm(gbm.formula,
            data = rf.tmp.train, distribution = "gaussian", n.trees = 3000,
            shrinkage = 0.001, interaction.depth = 1, bag.fraction = 0.5,
            train.fraction = 1, n.minobsinnode = 10, cv.folds = 10,
            keep.data = TRUE, verbose = FALSE, n.cores = 1)
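
One thing worth checking beyond the code itself: with shrinkage = 0.001, 3000 trees may be too few for the boosting to converge, and predicting with all 3000 trees instead of the CV-selected iteration can also hurt out-of-sample MSE. A minimal sketch of that check, assuming a held-out set rf.tmp.test with the same columns (the test-set name is hypothetical):

# Iteration that minimizes the 10-fold CV error
best.iter <- gbm.perf(gbm1, method = "cv", plot.it = FALSE)

# Out-of-sample predictions at the CV-optimal number of trees
pred <- predict(gbm1, newdata = rf.tmp.test, n.trees = best.iter)

# Test MSE; assumes Y.idx holds the name of the response column
mse <- mean((rf.tmp.test[[Y.idx]] - pred)^2)

If best.iter comes out at or near 3000, the model is still improving and more trees (or a larger shrinkage) are worth trying.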
YYY

1 Answer


In my working experience, gbm usually performs better than random forest, and random forest usually performs better than other algorithms. In your case, you might want to tune the parameters of both the gbm and the random forest. To start, I recommend the caret package, which carries out the tuning process automatically, as sketched below.
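
For example, a minimal caret sketch, assuming the training data and the gbm.formula object from the question (the grid values below are illustrative starting points, not recommendations):

library(caret)

# 10-fold CV over the four gbm tuning parameters that caret exposes
fit <- train(gbm.formula, data = rf.tmp.train,
             method = "gbm", distribution = "gaussian",
             trControl = trainControl(method = "cv", number = 10),
             tuneGrid = expand.grid(n.trees = c(1000, 3000, 5000),
                                    interaction.depth = c(1, 3, 5),
                                    shrinkage = c(0.001, 0.01, 0.1),
                                    n.minobsinnode = 10),
             verbose = FALSE)

fit$bestTune  # parameter combination with the lowest CV RMSE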

Cheers

yuanhangliu1
  • I suggest that attempting to "answer" a question when there is inadequate information is unwise. This is really only worth a comment since you are agreeing with presumptions of the questioner and not offering any specific advice. – IRTFM Jun 09 '15 at 20:02