
I want to get the means and standard deviations across 20 bootstrap samples, but I am not sure how to do that. My current code gives me the mean within each sample, not across samples.

## create data
data <- round(rnorm(100, 5, 3))
data[1:10]

## obtain 20 bootstrap samples
resamples <- lapply(1:20, function(i) sample(data, replace = TRUE))

## display the first of the bootstrap samples
resamples[[1]]

## calculate the mean of each bootstrap sample
r.mean <- sapply(resamples, mean)
r.mean

## calculate the sd of the distribution of means
sqrt(var(r.mean))

With the code above I get 20 means, one for each bootstrap sample, and the sd of the distribution of those means. How can I get 100 means, each computed across the 20 samples, and likewise 100 standard deviations?

Many thanks!!

StupidWolf
mandy
    If you have 20 samples, how do you propose to get 100 means? Can you please make the question clearer? In the meantime, take a look at the function `boot` in the package `boot`, which ships with R. – Rui Barradas May 31 '18 at 16:34

2 Answers


Though the answer by @konvas is probably what you want, I would still take a look at the boot package (shipped with R) when it comes to bootstrapping.

See if the following example can get you closer to what you are trying to do.

set.seed(6929)    # Make the results reproducible
data <- round(rnorm(100, 5, 3))

boot_mean <- function(data, indices) mean(data[indices])
boot_sd <- function(data, indices) sd(data[indices])

Runs <- 100
r.mean <- boot::boot(data, boot_mean, Runs)
r.sd <- boot::boot(data, boot_sd, Runs)

r.mean$t
r.sd$t

sqrt(var(r.mean$t))
#          [,1]
#[1,] 0.3152989

sd(r.mean$t)
#[1] 0.3152989

Now, see the distributions of the bootstrapped means and standard deviations.

op <- par(mfrow = c(1, 2))
hist(r.mean$t)
hist(r.sd$t)
par(op)
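
The two calls to boot above draw separate sets of resamples for the mean and for the sd. As a minimal sketch, assuming you want both statistics computed on the same resamples, the statistic function can return a vector. The helper name boot_mean_sd is made up for this illustration.

```r
library(boot)

set.seed(6929)    # same seed as above, only for reproducibility
data <- round(rnorm(100, 5, 3))

# Hypothetical helper: returns both statistics for one resample
boot_mean_sd <- function(data, indices) {
  x <- data[indices]
  c(mean = mean(x), sd = sd(x))
}

res <- boot(data, boot_mean_sd, R = 100)

dim(res$t)             # one row per resample, one column per statistic
res$t0                 # mean and sd of the original data
apply(res$t, 2, sd)    # spread of the bootstrapped means and of the bootstrapped sds
```

Here res$t[, 1] holds the 100 bootstrapped means and res$t[, 2] the 100 bootstrapped standard deviations, so row i of res$t describes one and the same resample.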
Rui Barradas

Make a matrix with your samples

mat <- do.call(rbind, resamples)

Then

rowMeans(mat)

will give you the "within sample" mean and

colMeans(mat) 

the "across sample" mean. For other quantities, such as the standard deviation, you can use apply, e.g. apply(mat, 1, sd) for rows and apply(mat, 2, sd) for columns, or functions from the matrixStats package, e.g. matrixStats::rowSds(mat).
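
Putting the pieces together, here is a small sketch of the row and column summaries, assuming the data and resamples are built as in the question. The seed and the variable names within_mean, across_mean and so on are arbitrary choices for this example.

```r
set.seed(1)    # arbitrary seed, only for reproducibility
data <- round(rnorm(100, 5, 3))

# 20 bootstrap samples, each of length 100
resamples <- lapply(1:20, function(i) sample(data, replace = TRUE))

# One row per bootstrap sample, one column per position within a sample
mat <- do.call(rbind, resamples)
dim(mat)    # 20 rows, 100 columns

within_mean <- rowMeans(mat)         # 20 values, one per sample
within_sd   <- apply(mat, 1, sd)     # 20 values, one per sample

across_mean <- colMeans(mat)         # 100 values, one per position
across_sd   <- apply(mat, 2, sd)     # 100 values, one per position
```

Note that sample() fills the positions at random, so a column of mat does not correspond to any particular original observation. The column summaries give 100 numbers, as asked, but whether they are meaningful depends on what you intend to do with them.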

konvas