Questions tagged [statistics-bootstrap]

In statistics, a bootstrap is a resampling technique based on random sampling with replacement.

The bootstrap was introduced by Bradley Efron in the late 1970s. It is a computer-intensive method that enables researchers to estimate the sampling variability of statistics (such as medians, variances, and percentiles) by repeatedly drawing random samples with replacement from the available data.
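As a rough illustration of the resampling idea described above, here is a minimal sketch of a percentile bootstrap confidence interval for the median, using only the Python standard library. The function name `bootstrap_ci`, its parameters, and the toy data are assumptions made for this example; they are not part of any package mentioned on this page.

```python
import random
import statistics


def bootstrap_ci(data, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `statistic` (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(data)
    # Each replicate: draw n observations with replacement, recompute the statistic.
    replicates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Read the interval end points off the sorted replicates.
    lower = replicates[int(alpha / 2 * n_resamples)]
    upper = replicates[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical toy data, made up for this example.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
low, high = bootstrap_ci(data, statistics.median)
print("sample median:", statistics.median(data))
print("95% percentile interval:", (low, high))
```

The percentile method is only one of several ways to turn bootstrap replicates into an interval; it is used here because it is the simplest to write down.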

See also:

  1. The Wikipedia page on Bootstrapping
  2. Bootstrapping using boot package in R
  3. Brad Efron's paper on bootstrap
  4. Review on bootstrap methods in econometrics
602 questions
0 votes
1 answer

Does as.svrepdesign inherit the fpc from a svydesign object?

I'm slightly confused by the as.svrepdesign function's use of the fpc from a design object. The example from the documentation shows the following: ## one-stage cluster sample dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc) ##…
Union find
0 votes
1 answer

Bootstrap with rbinom in R takes too long to run

I have been running bootstraps with rbinom for loops in R, but they take too long to run. I want to perform the bootstrap on a dataset with 1,500,000 rows. I want to resample the rows and for each of the resampled rows rbinom two probabilities…
0 votes
0 answers

Why do I get the same adjusted p-values for both Benjamini-Hochberg and Bonferroni correction?

Hello statisticians and R experts, why do I get the same results for BH and Bonferroni tests? My code: dpnd<-c(rnorm(25,mean = 0),rnorm(25,mean=1)) indpnd<-c(rep("a",25),rep("b",25)) bonf <- pairwise.t.test(dpnd, indpnd,…
Chemokine1
0 votes
0 answers

Time consumption of SciPy's bootstrap as a function of the number of resamples

I have a large dataset, with on the order of 2^15 entries, and I calculate the confidence interval of the mean of the entries with scipy.stats.bootstrap. For a dataset this size, this costs about 6 seconds on my laptop. I have a lot of datasets, so…
Georg
0 votes
0 answers

R: Error in t.star[r, ] <- res[[r]] : number of items to replace is not a multiple of replacement length

I'm working on an assignment with the following prompt: 1b) Create a function which will take a vector of numeric values, and an integer n as arguments. The function will perform nonparametric bootstrapping on this data, n times, calculating the…
0 votes
1 answer

ANOVA on linear model with bootstrap clustered standard errors

I need to conduct an analysis of variance (ANOVA) comparing a linear model obtained through a standard OLS regression and one with heteroscedasticity-robust standard errors obtained through a bootstrap cluster method. While conducting ANOVA on the…
orpr0
0 votes
0 answers

Trying to create a function that operates like a statistic of interest

I am a uni student doing stats, currently looking at creating functions in R. I really struggle with functions and am stuck on a certain one I have to create. I was wondering if someone could help me with this or lead me in the right…
Tom Gold
0 votes
0 answers

boot computes a 95% confidence interval but outputs NaNs for original and bias for one column of data, while working normally for another two columns

Can anyone help with this problem that I could not figure out? I am trying to compute a 95% CI using boot, but it outputs NaNs for original and bias for one column of the data, e.g. fst_A_B_C_D$Fst_C_D. It works well with another two columns in the dataset. Below…
BeeBern
0 votes
0 answers

Writing Bootstrap approach with error: number of items to replace is not a multiple of replacement length

I am writing a bootstrap function to estimate the differences between two proportions; however, I have this error in my R code: number of items to replace is not a multiple of replacement length. I have attached the code and could anyone please help…
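For orientation, a bootstrap of the difference between two proportions can be sketched as below. This is a Python illustration with made-up data, not the asker's R code (which is not shown in the excerpt); the function name `bootstrap_diff_proportions` and the two groups are assumptions. In R, the quoted error typically appears when the value assigned to a slot does not have the length the slot expects, so the sketch stores exactly one number per replicate.

```python
import random


def bootstrap_diff_proportions(x, y, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile interval for mean(x) - mean(y), with x and y coded as 0/1."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group independently, with replacement.
        xs = rng.choices(x, k=len(x))
        ys = rng.choices(y, k=len(y))
        # Store a single scalar per replicate.
        diffs.append(sum(xs) / len(xs) - sum(ys) / len(ys))
    diffs.sort()
    lower = diffs[int(alpha / 2 * n_resamples)]
    upper = diffs[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical data: 30 successes out of 100 versus 45 out of 100.
group_a = [1] * 30 + [0] * 70
group_b = [1] * 45 + [0] * 55
observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
low, high = bootstrap_diff_proportions(group_a, group_b)
print("observed difference:", observed)
print("95% percentile interval:", (low, high))
```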
0 votes
1 answer

Bootstrapped standard errors and p-value from weighted Mann-Whitney test

I would like to bootstrap the p-value and standard errors from a weighted Mann-Whitney U test. I can run the test as: weighted_mannwhitney(c12hour ~ c161sex + weight, efc) which works fine, but I am not entirely sure how I can run a bootstrapped version…
microbe
0 votes
1 answer

Computing proportions in R and plotting them, correcting code mistakes

df$age = as.numeric(df$age) library(dplyr) df%>% group_by(level) %>% summarise(mean_age = mean(age), na.rm = TRUE) As I got an NA on N7 that I couldn't seem to get rid of, I tried something else: df_mean <- aggregate(x = df$age, …
0 votes
2 answers

Generate sample data according to a supposed proportion

I am working on a project where products in production have a defect, but only in very rare cases. For example, 1 in 1,000,000 products has a defect. How could I generate data, in R, Python, or Excel, that would represent samples from this distribution?
Tryzis
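One simple way to generate such data in Python is to treat each product as an independent Bernoulli trial with the stated defect rate. The sketch below assumes independence between products; the function name `simulate_defects` and the batch size of 2,000,000 are arbitrary choices for illustration.

```python
import random


def simulate_defects(n_items, defect_rate, seed=0):
    """Return a list of 0/1 flags, where 1 marks a defective item."""
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so the comparison is true
    # with probability defect_rate.
    return [1 if rng.random() < defect_rate else 0 for _ in range(n_items)]


# Assumed batch size; the rate of 1 in 1,000,000 comes from the question.
sample = simulate_defects(n_items=2_000_000, defect_rate=1e-6)
print("items simulated:", len(sample))
print("defects found:", sum(sample))
```

With a rate this small, most simulated batches contain few or no defects, so a large `n_items` is needed before any appear.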
0 votes
1 answer

Obtain bootstrap correlation interval between one variable and all others in data frame

I would like to get bootstrap correlation parameters (interval and rho) between one variable (v) and all others (x,y,z). If I was only interested in the correlation itself (not bootstrap) I would use the following formula. df[,-1] %>%…
Garn_R
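The general pattern for this kind of task is to resample whole rows (so that pairs stay together), recompute the correlation on each resample, and repeat for every other column. The following is a Python sketch of that pattern under made-up data, not a translation of the asker's truncated R pipeline; `pearson`, `bootstrap_corr_ci`, and the simulated columns are assumptions for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    # Fails if a resample has zero variance; ignored here for brevity.
    return cov / math.sqrt(var_a * var_b)


def bootstrap_corr_ci(a, b, n_resamples=2000, alpha=0.05, seed=0):
    """Return (rho, lower, upper) using a percentile bootstrap over rows."""
    rng = random.Random(seed)
    n = len(a)
    reps = []
    for _ in range(n_resamples):
        # Resample row indices so each (a, b) pair stays together.
        idx = [rng.randrange(n) for _ in range(n)]
        reps.append(pearson([a[i] for i in idx], [b[i] for i in idx]))
    reps.sort()
    lower = reps[int(alpha / 2 * n_resamples)]
    upper = reps[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return pearson(a, b), lower, upper


# Hypothetical data: x tracks v, y tracks -v, z is unrelated noise.
rng = random.Random(1)
v = [rng.gauss(0, 1) for _ in range(40)]
others = {
    "x": [vi + rng.gauss(0, 0.5) for vi in v],
    "y": [-vi + rng.gauss(0, 0.5) for vi in v],
    "z": [rng.gauss(0, 1) for _ in v],
}
results = {name: bootstrap_corr_ci(v, col) for name, col in others.items()}
for name, (rho, lower, upper) in results.items():
    print(name, round(rho, 3), (round(lower, 3), round(upper, 3)))
```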
0 votes
1 answer

Covid-19 Growth rate (Bootstrapping/Time Series)

I am trying to write R code to obtain the growth rate for COVID-19. The equation can be found in the inserted image, where i(t) is the number of infected individuals at time t. I think I could code it if the equation were a simple growth rate i(t) =…
0 votes
1 answer

Why am I getting similar CIs with such different sample sizes?

I just learned how to do bootstrap in R, and I'm excited. I was playing with some data, and found that, no matter how many bootstrap samples I take, the CIs always seem to be about the same. I believe that, the more samples, the more narrow…
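If "samples" here means the number of bootstrap resamples, the behaviour described is what one would generally expect: more resamples reduce the Monte Carlo noise in the interval end points, while the interval width is driven mainly by the size of the original data set. The sketch below illustrates this in Python with simulated data; the function name `percentile_ci_width` and the sample sizes are assumptions for the example.

```python
import random
import statistics


def percentile_ci_width(data, n_resamples, rng, alpha=0.05):
    """Width of a percentile bootstrap interval for the mean."""
    n = len(data)
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lower = means[int(alpha / 2 * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return upper - lower


rng = random.Random(42)
# Simulated standard normal data of two different sizes.
small = [rng.gauss(0, 1) for _ in range(50)]
large = [rng.gauss(0, 1) for _ in range(5000)]

# Same data, different numbers of resamples: widths should be similar.
width_small_few = percentile_ci_width(small, 200, rng)
width_small_many = percentile_ci_width(small, 2000, rng)
# More original observations: the interval should be clearly narrower.
width_large = percentile_ci_width(large, 200, rng)

print("n=50,   200 resamples:", round(width_small_few, 3))
print("n=50,  2000 resamples:", round(width_small_many, 3))
print("n=5000, 200 resamples:", round(width_large, 3))
```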