I want to write a custom bootstrap function, for several reasons:
- To gain a better understanding (or let's just say an understanding) of the process
- To apply bootstrap resampling elsewhere without depending on packages
I know there are packages (mainly boot, rms, caret, and others) that could help with this and are quite versatile, but I want to be able to write the function myself for the reasons stated above.
As far as I understand it, the bootstrap is a resampling method that consists of taking n random samples from a sample (a dataframe in our case) and then using these n random samples to calculate estimates.
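To make that description concrete, here is a minimal base R sketch of one common reading of it: draw B resamples, each the same size as the original data and drawn with replacement, and recompute the statistic on each. The `toy` data frame, the column `x`, and the choice of B = 1000 are hypothetical and only there for illustration.

```r
# Nonparametric bootstrap of a column mean, base R only.
# `toy`, `x` and B are illustrative assumptions, not from the question.
set.seed(42)
toy <- data.frame(x = rnorm(50, mean = 10, sd = 2))

B <- 1000                 # number of bootstrap resamples
n <- nrow(toy)            # each resample has the original sample size
boot_means <- numeric(B)

for (b in seq_len(B)) {
  idx <- sample(n, size = n, replace = TRUE)  # row indices drawn with replacement
  boot_means[b] <- mean(toy$x[idx])           # statistic recomputed on the resample
}

boot_se <- sd(boot_means)                          # bootstrap standard error
boot_ci <- quantile(boot_means, c(0.025, 0.975))   # percentile 95% interval
```

Note that the rows are resampled by index, so the same pattern carries over to any statistic computed from `toy[idx, ]`.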
So, for example, say I fit a model (which one really doesn't matter for my "example" code):
model <- coxph(Surv(time, cens)~groups, data=df)
I used survival because that's where I want to apply this right now, but since I'm interested in understanding what's really happening, it does not matter which model we choose.
Now, let's "resample". Theory-wise, this is what I understand every time I read something about the bootstrap:
bstrap <- sample(df, 1000, replacement=T)
preds <- predict(model, bstrap)
mean(preds)
confint(preds) #This is probably the "faultiest" part, as C.I are supposed to be calculated by the bootstrap itself
Would something like this work? I can see some faulty things in there, but that's where my intuition leads me, based on what I've read about the bootstrap. Why wouldn't that work? Could it have to do with the fact that I'm using exactly the same data I fitted my model with? Is it because the resampling is not so literal? Something else?
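For comparison, a minimal sketch of how the nonparametric bootstrap is usually set up for a model like this: resample the rows with replacement, refit the model on each resample, and summarise the distribution of the refitted coefficient. This is a sketch under assumptions, not a definitive implementation. It assumes the survival package is installed, uses its `lung` data as a stand-in for the data in the question, and picks B = 200 arbitrarily.

```r
library(survival)  # assumption: the survival package is available

set.seed(1)
# Stand-in data (not the data from the question): time, status, and one covariate.
df <- lung[, c("time", "status", "sex")]

B <- 200                  # number of bootstrap resamples (arbitrary choice)
coefs <- numeric(B)

for (b in seq_len(B)) {
  idx <- sample(nrow(df), replace = TRUE)                   # resample rows with replacement
  fit <- coxph(Surv(time, status) ~ sex, data = df[idx, ])  # refit on the resample
  coefs[b] <- coef(fit)[["sex"]]                            # keep the estimate of interest
}

mean(coefs)                          # bootstrap mean of the log hazard ratio
quantile(coefs, c(0.025, 0.975))     # percentile 95% interval
```

The difference from the code above is that the model is refitted inside the loop, so the spread of `coefs` reflects sampling variability in the estimate rather than variability in predictions from a single fixed fit.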
Many thanks!