0

So I posted a thread about this problem, but it got on hold. So I rephrased so it can be it a programming question. This is my code below. I am trying to find the stimulated confidence level of a sample using the bootstrap.

# Step One: Generating the data from lognormal distribution

MC <-1000; # Number of samples to simulate
xbar = c(1:MC);
mu = 1;
sigma= 1.5;
the_mean <- exp(mu+sigma^2/2);
n= 10;

for(i in 1:MC)
 {
mySample <- rlnorm(n=n meanlog=mu, sdlog=sigma);
xbar [i] <- the_mean(mySample);
 }

# Step Two: Compute 95% Bootstrap CI with B=1000

B = 1000
xbar_star = c(1:B)
for(b in 1:B)
{ 
 x_star = sample(n,n, replace=TRUE)
 xbar_star[b] = mean(x_star)
 }

quantile(xbar, p=c(0.025, 0.975))

If you implement this code you can see that the output is 975.025 when it should actually be 0. 90. I don't understand why my output is wrong.


We arent trying to find the Confidence Interval, but the stimulated Confidence Level. How does the actual coverage percentage (obtained through simulation) compare with the nominal confidence level (which is 95%)? This is my code when my samples were given in a practice problem...


library(boot)
x = c(0.22, 0.23, 0.26, 0.27, 0.28, 0.28, 0.29,
  0.33, 0.34, 0.35, 0.38, 0.39, 0.39, 0.42, 0.42,
  0.43, 0.45, 0.46, 0.48, 0.5, 0.5, 0.51, 0.52,
  0.54, 0.56, 0.56, 0.57, 0.57, 0.6, 0.62, 0.63,
  0.67, 0.69, 0.72, 0.74, 0.76, 0.79, 0.81, 0.82,
  0.84, 0.89, 1.11, 1.13, 1.14, 1.14, 1.2, 1.33)

B = 10000
xbar = mean(x)
n = length(x)
xbar_star = c(1:B)
for(b in 1:B)
{
x_star = sample(x=x, size=n, replace=TRUE)
xbar_star[b] = mean(x_star)
}

# empirical percentile method

 quantile(xbar_star, p=c(0.025, 0.975))

> quantile(xbar_star, p=c(0.025, 0.975))
  2.5%     97.5% 
0.5221277 0.6797926 
Siguza
  • 21,155
  • 6
  • 52
  • 89
user3295513
  • 43
  • 1
  • 5
  • What kind of bootstrap do you want? A non-parametric one could be done with `sample(n, n, replace = TRUE)` or the **boot** package which comes with R. – Gavin Simpson Mar 04 '14 at 03:06
  • So why are you sampling 1000 sets of data using `rlnorm()`? In a non-parametric one, you have 1 sample of data and you take B bootstrap samples from that one sample. Compute mean on each of those B bootstrap samples and then take appropriate quantiles to give you a percentile CI. – Gavin Simpson Mar 04 '14 at 03:14
  • 2
    Please never include `rm(list = ls())` at the top of your code. It can cause people to unintentionally clear their sessions. – nograpes Mar 04 '14 at 03:15
  • Did you *even* look at what `sample` gave you? You need to use those values to index the 1 draw from `rlnorm()` to select samples from the observed sample. – Gavin Simpson Mar 04 '14 at 03:18
  • What do you think `the_mean(mySample)` is doing? According to your code, `the_mean()` isn't a function but you are using like one. – Gavin Simpson Mar 04 '14 at 03:20
  • Re you "Answer", you seem to be conflating (non-parametric) bootstrap and simulation. If you want to simulate you could draw `sims <- matrix(rlnorm(n * MC, mu, sigma), nrow = MC, ncol = n)`. Then do with each row what you were trying to do with `the_mean`. – Gavin Simpson Mar 04 '14 at 03:49

0 Answers0