
I have data sets for two groups, one much smaller than the other. For that reason, I am using the MATLAB bootstrapping function to estimate the performance of the smaller group. My code draws on my original data and generates 1000 'new' means. However, it is not clear how many of the original data points are used each time. Obviously, if every original data point were used exactly once, the same mean would be generated every time.

Can anyone help me out with this?

Siguza

1 Answer


Bootstrapping is based on sampling with replacement. Each resample has the same number of points as the original data, but some of the original points are repeated and others are left out, which is why the mean changes from one resample to the next. There are some variants of bootstrapping which work slightly differently, however. See https://en.wikipedia.org/wiki/Bootstrapping_(statistics).
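To make the resampling step concrete, here is a minimal sketch in Python (standard library only). It illustrates plain sampling with replacement and is not MATLAB's `bootstrp` implementation. The `data` values and the names `n_boot`, `boot_means`, and `distinct_counts` are made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical small sample standing in for the smaller group's data.
data = [2.1, 3.4, 1.9, 4.0, 2.8, 3.3, 2.5]
n = len(data)
n_boot = 1000  # number of bootstrap resamples, as in the question

boot_means = []
distinct_counts = []
for _ in range(n_boot):
    # Draw n indices WITH replacement: the resample is the same size
    # as the original data, but indices can repeat or be missing.
    idx = random.choices(range(n), k=n)
    resample = [data[i] for i in idx]
    boot_means.append(statistics.mean(resample))
    distinct_counts.append(len(set(idx)))

print("points per resample:", n)
print("average distinct original points per resample:",
      statistics.mean(distinct_counts))
print("number of different bootstrap means:", len(set(boot_means)))
```

Every resample contains exactly `n` values, yet the number of distinct original points in it varies. For a sample of size n the expected fraction of distinct points is 1 - (1 - 1/n)^n, which approaches roughly 63% as n grows, so there is no fixed subset size, only a fixed resample size.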

Patrick Mineault
  • Thanks! I'm specifically talking about the bootstrap (bootstrp) function in MATLAB though. It obviously chooses a certain number of original data points each time it creates a new set, so I was just wondering whether a fixed number (or percentage) of the data is used for each iteration? – Rebecca Jul 02 '15 at 02:16
  • I don't think you read my answer. It samples the same number of points as in the original data, but with replacement. – Patrick Mineault Jul 02 '15 at 16:36
  • Thanks - I understand what you mean now. Thanks for taking the time to answer my question. – Rebecca Jul 03 '15 at 00:51