
Does anyone know how many bootstraps or subsamples a standard call to rfsrc performs?

rf1<-rfsrc(Surv(time,status)~., data=myTable)

Also, rf1$err.rate, which is described as the "cumulative OOB error rate", is a vector of length 1000 with the settings above; 999 elements are NA and only the last element is an error rate (between 0 and 0.5). Is that the expected behaviour? Is this last value the average error over all bootstraps?

Update: I have found a setting, block.size, which controls how many of the 1000 OOB error rates are returned. If you set it to e.g. 10, every tenth OOB error rate is filled in. However, I am still not sure how many bootstraps each of these error rates is calculated on. Is each one simply the error rate from a single bootstrap or subsample, or is it somehow averaged?

StanW

1 Answer


Per the documentation:

sampsize: Function specifying size of bootstrap data when by.root is in effect. For sampling without replacement, it is the requested size of the sample, which by default is .632 times the sample size. For sampling with replacement, it is the sample size. Can also be specified by using a number.

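So by default, sampling is done without replacement, and 63.2% of the observations are drawn at random for each tree in the forest. One such subsample is drawn per tree, so the number of subsamples equals the number of trees (ntree), which is presumably why err.rate has length 1000 in your case.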
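To make the default scheme concrete, here is a minimal sketch in base R of drawing one subsample without replacement per tree. It only imitates the sampling step and is not how rfsrc is implemented internally; the values of n and ntree, the variable names, and the rounding of 0.632 * n are all assumptions made for illustration.

```r
# Illustrative sketch only: mimics per-tree subsampling in base R,
# not the internals of rfsrc().
set.seed(1)

n      <- 200                # hypothetical number of observations
ntree  <- 1000               # hypothetical forest size (one subsample per tree)
sampsz <- round(0.632 * n)   # assumed rounding; the package may round differently

# Draw one subsample per tree, without replacement.
# Each column of `inbag` holds the in-bag row indices for one tree.
inbag <- replicate(ntree, sample.int(n, size = sampsz, replace = FALSE))

# Observations not drawn for a tree are out-of-bag (OOB) for that tree.
oob_counts <- n - apply(inbag, 2, function(idx) length(unique(idx)))

dim(inbag)          # sampsz rows, ntree columns
range(oob_counts)   # without replacement, every tree has n - sampsz OOB rows
```

Because the draw is without replacement, every tree in this sketch leaves exactly n - sampsz observations out of bag. Those left-out rows are the ones an OOB error estimate would be computed on.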

user2974951