
Can someone describe the workflow for this? For instance, suppose I am doing binary classification.

For each iteration of the algorithm:

  1. Randomly sample k*N rows, where k is the bag.fraction, and N is nrow(dataset).

  2. Build a classifier on this training sample; suppose we use a classification tree here.

  3. Predict the terminal node class label.
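For concreteness, one iteration of the loop above might be sketched in Python as follows. This is only an illustration of the three steps as listed, not how gbm is implemented: the helper names (`sample_rows`, `fit_stump`, `predict_stump`) are hypothetical, the tree is a single-split stump on one feature, the toy data is made up, and the fitting to residuals that actual gradient boosting performs is left out.

```python
import random

def sample_rows(n_rows, bag_fraction, rng):
    """Step 1: draw int(bag_fraction * n_rows) row indices without replacement."""
    k = int(bag_fraction * n_rows)
    return rng.sample(range(n_rows), k)

def fit_stump(xs, ys):
    """Step 2: fit a one-split classification tree (a stump) on a single feature.

    Returns (threshold, left_label, right_label) with the fewest training errors.
    Labels are assumed to be 0 or 1.
    """
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        # Each terminal node predicts its majority class (ties go to class 1).
        left_label = int(2 * sum(left) >= len(left))
        # An empty right node falls back to the left node's label.
        right_label = int(2 * sum(right) >= len(right)) if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    return best[1:]

def predict_stump(stump, x):
    """Step 3: return the class label of the terminal node that x falls into."""
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label

# Toy data (made up for illustration): one feature, binary labels.
rng = random.Random(0)
x = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y = [0, 0, 0, 0, 1, 1, 1, 1]

idx = sample_rows(len(x), bag_fraction=0.5, rng=rng)         # step 1
stump = fit_stump([x[i] for i in idx], [y[i] for i in idx])  # step 2
preds = [predict_stump(stump, xi) for xi in x]               # step 3
```

Here `bag_fraction` plays the role of k and `len(x)` the role of N from step 1.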

This is how boosting is done without CV. If I want to do 3-fold CV, where exactly do I apply it? Between steps 1 and 2, or after step 3? Thanks!

Boxuan
  • To be more specific: is the cross-validation applied to each tree, or to the entire gbm algorithm? I hope this is clear. – Boxuan Mar 14 '13 at 14:19
  • This *might* be more suited for http://stats.stackexchange.com/ – NPE Mar 14 '13 at 14:19
