1

I am trying to create sample of 200 linear model coefficients using a loop in R. As an end result, I want a vector containing the coefficients.

for (i in 1:200) {        
  smpl_5 <- population[sample(1:1000, 5), ]
  model_5 <- summary(lm(y~x, data=smpl_5))         
}

I can extract the coefficients easy enough, but I am having trouble outputting them into a vector within the loop. Any Suggestions?

jbaums
  • 27,115
  • 5
  • 79
  • 119
Kampfsca
  • 13
  • 1
  • 3
  • See the help file `?coef` for extracting coefficients. Assuming you are going to be storing more than one coefficient from each model, you will either need to store them in a `list` (recommended) or multiple vectors (one for each coefficient). You will also want to initialize your object to the correct length before entering the loop (e.g. `coefList <- list(NULL); length(coefList) <- 200`). – nrussell Oct 23 '14 at 15:11
  • @nrussell - since all models have the same structure (i.e. an intercept and the coefficient for `x` are estimated), the coefficients matrices will all be of identical dimension, so an array is also suitable. – jbaums Oct 23 '14 at 15:28
  • @jbaums Agreed, I was just throwing out the first couple of suggestions that came to mind which would require little modification to the OP's code. – nrussell Oct 23 '14 at 15:34

1 Answers1

1

You can use replicate for this if you like. In your case, because the number of coefficients is identical for all models, it'll return an array as shown in the example below:

d <- data.frame(x=runif(1000))
d$y <- d$x * 0.123 + rnorm(1000, 0, 0.01)
coefs <- replicate(3, {
  xy <- d[sample(nrow(d), 100), ]
  coef(summary(lm(y~x, data=xy)))
})

coefs

# , , 1
# 
#                Estimate  Std. Error    t value     Pr(>|t|)
# (Intercept) 0.001361961 0.002091297  0.6512516 5.164083e-01
# x           0.121142447 0.003624717 33.4212114 2.235307e-55
# 
# , , 2
# 
#                Estimate  Std. Error  t value     Pr(>|t|)
# (Intercept) 0.003213314 0.001967050  1.63357 1.055579e-01
# x           0.118026828 0.003332906 35.41259 1.182027e-57
# 
# , , 3
# 
#                Estimate  Std. Error   t value     Pr(>|t|)
# (Intercept) 0.003366678 0.001990226  1.691606 9.389883e-02
# x           0.119408470 0.003370190 35.430783 1.128070e-57

Access particular elements with normal array indexing, e.g.:

coefs[, , 1] # return the coefs for the first model

#                Estimate  Std. Error    t value     Pr(>|t|)
# (Intercept) 0.001361961 0.002091297  0.6512516 5.164083e-01
# x           0.121142447 0.003624717 33.4212114 2.235307e-55

So, for your problem, you could use:

replicate(200, { 
  smpl_5 <- population[sample(1:1000, 5), ]
  coef(summary(lm(y~x, data=smpl_5))) 
})
jbaums
  • 27,115
  • 5
  • 79
  • 119