I have a data frame df with a column X containing normally distributed values across 1,000,000 rows. The maximum value of X is 0.8. Using R (and perhaps the "boot" package), I would like to bootstrap with replacement to estimate how unlikely it is to get max(df$X) = 0.8 from my data. For this, I could take n bootstrap samples from X and calculate the maximum of each sample. Then I can take the standard deviation of these sample maxima and see how far 0.8 lies from them in terms of this standard deviation. Does anyone know how to do this bootstrapping in R? Any suggestions are welcome!
- https://www.statmethods.net/advstats/bootstrapping.html or countless other tutorials would be a good starting point. – thelatemail Jun 28 '18 at 02:25
- I'm not sure I understand your question. If `X` is normally distributed, then the probability that `max(X) = 0.8` is zero. – Maurits Evers Jun 28 '18 at 03:38
1 Answer
Bootstrapping from `x`, where `x` is a normal random variable. A `statistic` function needs to be provided, which requires at least `data` and `indices` as its arguments. Check the R documentation of the `boot` package for more details.
The `max_x` function below checks whether `max(x)` is the same as the maximum of a bootstrapped sample. Note that the test data (`x`) used in the code below has a different maximum value, but the conceptual framework remains the same:
library(boot)  # provides boot()
set.seed(101)
x <- rnorm(1000, mean = 0.4, sd = 0.2)  # normally distributed test data
# Statistic: 1 if the bootstrap sample contains the original maximum, else 0
max_x <- function(data, indices) {
  m <- max(data[indices])
  if (m == max(x)) {
    return(1)
  } else {
    return(0)
  }
}
results <- boot(data = x, statistic = max_x, R = 1000)  # 1000 replications
mean(results$t == 1)  # probability of max getting sampled
# 0.618
results
# ORDINARY NONPARAMETRIC BOOTSTRAP
# Call:
# boot(data = x, statistic = max_x, R = 1000)
# Bootstrap Statistics :
# original bias std. error
# t1* 1 -0.382 0.4861196
plot(results)
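
The question also describes taking the maximum of each bootstrap sample and looking at the standard deviation of those maxima. A minimal sketch of that variant follows, assuming the same simulated test data as in the code above; the names `boot_max`, `res`, and `max_reps` are made up for illustration. Keep in mind that a resample can never contain a value larger than the observed maximum, so this standard deviation only describes spread below `max(x)` and should be read with caution.

```r
library(boot)  # recommended package shipped with standard R installations

set.seed(101)
x <- rnorm(1000, mean = 0.4, sd = 0.2)  # simulated test data (assumption, not the real df$X)

# Statistic: the maximum of each bootstrap resample
boot_max <- function(data, indices) max(data[indices])

res <- boot(data = x, statistic = boot_max, R = 1000)

max_reps <- as.vector(res$t)        # one maximum per bootstrap replicate
sd(max_reps)                        # bootstrap standard deviation of the maximum
mean(max_reps == max(x))            # share of resamples that contain the observed maximum
1 - (1 - 1 / length(x))^length(x)   # theoretical value of that share, about 0.632
```

For the real data, `x` would be replaced by `df$X`; with 1,000,000 rows each replicate resamples the full vector, so a smaller `R` may be needed to keep the run time reasonable.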

Mankind_008