Suppose I want to assess the goodness of fit of a linear model before and after leaving out a covariate, and I want to do so with some kind of bootstrapping.

I bootstrapped the sum of the residuals of both models and then applied the Kolmogorov-Smirnov test to check whether the two bootstrap distributions are the same.

The minimal working code:

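library(boot)

# Bootstrap statistic: sum of the residuals of an OLS fit on the resampled rows
lm.statistic.resid <- function(data, i) {
    d <- data[i, ]

    # first column is the response, the remaining columns are the covariates
    r.gressor <- colnames(data)[1]
    c.variates <- colnames(data)[-1]

    lm.boot <- lm(reformulate(c.variates, response = r.gressor), data = d)

    out <- sum(resid(lm.boot))

    return(out)
}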

df.restricted <- mtcars[ , names(mtcars) != c("wt")]

classical.lm  <- lm(mtcars)
restricted.lm  <- lm(df.restricted)

boot.regression.full = boot(mtcars,
                        statistic=lm.statistic.resid,
                        R=1000)

boot.regression.restricted = boot(df.restricted,
                        statistic=lm.statistic.resid,
                        R=1000)
x <- boot.regression.restricted$t
y <- boot.regression.full$t

ks.test(x,y)

However, I get nearly the same result whether I remove wt (which is statistically significant) or am (which is not).

I would expect a smaller p-value when I remove wt.

Marco Repetto
  • Why would you expect an overall smaller p-value? The R² is higher with all the variables included. You can compare models using an anova, e.g. `anova(restricted.lm, classical.lm)`. The distributions you are comparing aren't significantly different from 0, as all those numbers are below floating point precision levels – Rorschach Jun 15 '19 at 22:17
  • If I use an anova on the models as you suggested, I'm not bootstrapping anything, whereas what I want to do is use bootstrap techniques to assess the goodness of fit of both models – Marco Repetto Jun 16 '19 at 18:07
  • Well, the sum of the residuals in your model will always be 0 (you are fitting with an intercept), so you need to use a different statistic, e.g. the (root) mean squared error or whatever – Rorschach Jun 16 '19 at 18:14
  • You are right! I'm trying now with the Residual Sum of Squares instead of the sum of the residuals – Marco Repetto Jun 16 '19 at 18:25
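
Following the suggestion in the comments, here is a minimal sketch of the same comparison with the root mean squared error (RMSE) as the bootstrapped statistic. The names `rmse.statistic`, `no.wt`, `no.am` and `sum.resid.full` are made up for illustration, the seed is arbitrary, and the response is assumed to be the first column of the data frame, as in the question. The RMSE here is computed in-sample on each resample, so it is probably optimistic, and the Kolmogorov-Smirnov p-values will depend on `R`.

```r
library(boot)

# Hypothetical statistic: in-sample RMSE of an OLS fit on the resampled rows.
# Assumes the response is the first column and all other columns are covariates.
rmse.statistic <- function(data, i) {
    d <- data[i, ]
    fit <- lm(reformulate(colnames(d)[-1], response = colnames(d)[1]), data = d)
    sqrt(mean(resid(fit)^2))
}

set.seed(1)  # arbitrary seed, only for reproducibility

full  <- mtcars
no.wt <- mtcars[, names(mtcars) != "wt"]
no.am <- mtcars[, names(mtcars) != "am"]

# With an intercept the residuals sum to zero up to rounding error,
# so their sum says nothing about how well the model fits.
sum.resid.full <- sum(resid(lm(full)))

boot.full  <- boot(full,  statistic = rmse.statistic, R = 1000)
boot.no.wt <- boot(no.wt, statistic = rmse.statistic, R = 1000)
boot.no.am <- boot(no.am, statistic = rmse.statistic, R = 1000)

# Compare each restricted model's bootstrap distribution with the full model's
ks.test(as.vector(boot.no.wt$t), as.vector(boot.full$t))
ks.test(as.vector(boot.no.am$t), as.vector(boot.full$t))
```

Unlike the sum of the residuals, the RMSE changes when a covariate is dropped, so the two `ks.test` calls are no longer comparing two distributions of numerical noise around 0.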
