I just started to learn R and need some help on finding the mean and median of residuals for my data. I calculated the lm and in the summary I get residuals like follows:

Min       1Q       Median   3Q      Max
-111.86   -34.90   -7.6     33.46   182.58

Question: the median of the residuals is -7.6, but what is the mean? Or is there a calculation for finding the mean and median of residuals? I was going to do mean(resid(trees.lm)), or should it be entered as mean(trees.lm$resid)?

Please clarify because my classmates all get different responses for the same data set.

Ben Bolker

1 Answer

The answer to the one specific question here is:

mean(resid(trees.lm))
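As a minimal sketch of that call in context: the asker's `trees.lm` and its formula are not shown, so the example below assumes R's built-in `trees` data set and a hypothetical model `Volume ~ Girth`, stored under the illustrative name `fit_demo`. The numbers will therefore not match the summary in the question.

```r
# Assumption: built-in `trees` data and a made-up formula; the original
# trees.lm from the question is not available.
fit_demo <- lm(Volume ~ Girth, data = trees)

res <- resid(fit_demo)   # extractor function, preferred over fit_demo$residuals

mean(res)      # mean of the residuals
median(res)    # median of the residuals
quantile(res)  # five-number summary: min, 1Q, median, 3Q, max
```

For an unweighted `lm()` fit like this one, `median(res)` should agree with the Median entry in the residual row printed by `summary(fit_demo)`.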

You shouldn't delve into fitted model objects like this and strip out arbitrary components. Doing so on something a bit more complicated like a GLM will bite your hand off when you realise you just extracted the working residuals via:

glm.mod.obj$residuals

which are unlikely to be useful to you.
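To illustrate the difference, here is a small sketch on simulated data (the Poisson model, the seed, and the names `glm_demo`, `x`, and `y` are all assumptions made for this example). It compares the `$residuals` component with what `resid()` returns by default.

```r
# Simulated Poisson data; everything here is illustrative only.
set.seed(1)
x <- runif(50)
y <- rpois(50, lambda = exp(1 + x))
glm_demo <- glm(y ~ x, family = poisson)

working_res  <- glm_demo$residuals   # working residuals from the final IWLS step
deviance_res <- resid(glm_demo)      # default for a glm: type = "deviance"

# The component matches resid(..., type = "working"), not the default
same_as_working  <- isTRUE(all.equal(unname(working_res),
                                     unname(resid(glm_demo, type = "working"))))
same_as_deviance <- isTRUE(all.equal(unname(working_res),
                                     unname(deviance_res)))
same_as_working
same_as_deviance
```

Asking for the residual type explicitly through `resid(glm_demo, type = ...)` makes it clear which kind of residual is being summarised.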

Even for simple things like lm() objects, what you get from resid() and what you get from accessing $residuals can differ depending on how the model was fitted (what was the setting of the na.action argument, for example?).
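A short sketch of the `na.action` case, using a tiny made-up data frame `d` with one missing response and the illustrative model name `fit_na`: with `na.action = na.exclude`, `resid()` pads the result back to the length of the original data, while the `$residuals` component only holds the rows that were actually used in the fit.

```r
# Made-up data with one missing response value
d <- data.frame(x = 1:6,
                y = c(1.1, NA, 2.9, 4.2, 4.8, 6.1))

fit_na <- lm(y ~ x, data = d, na.action = na.exclude)

n_component <- length(fit_na$residuals)  # only the rows used in the fit
n_extractor <- length(resid(fit_na))     # padded with NA to match nrow(d)

mean(resid(fit_na))                # NA, because of the padded entry
mean(resid(fit_na), na.rm = TRUE)  # mean over the fitted rows
```

So under this setting a plain `mean(resid(...))` returns `NA` unless `na.rm = TRUE` is supplied, which is one way classmates can end up with different answers from the same data.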

Also, the linear model assumes that the residuals are i.i.d. Gaussian (or normal) random variables with mean 0 and variance $\hat{\sigma}^2$, so the mean should be very close to 0 (i.e. very, very, very close to 0 but not exactly, because this is a computer and floating point arithmetic is in play).
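The floating point caveat can be checked with a tolerance rather than an exact comparison. The sketch below again assumes the built-in `trees` data and a hypothetical formula (the name `fit_zero` is illustrative), and it relies on the model including an intercept, which is the default in `lm()`.

```r
# Assumption: built-in `trees` data and a made-up formula with an intercept
fit_zero <- lm(Volume ~ Girth, data = trees)

m <- mean(resid(fit_zero))
m                          # typically a tiny number, not a clean 0

# Compare against zero with a tolerance instead of using ==
is_zero <- isTRUE(all.equal(m, 0))
is_zero
```

An exact test such as `m == 0` may well return `FALSE` even though the mean is zero mathematically, so `all.equal()` or an explicit threshold like `abs(m) < 1e-8` is the safer check.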

Gavin Simpson
  • Regardless of how the model fits, the mean of the residuals will be close to zero. This should never be surprising in OLS. – assumednormal Sep 17 '12 at 15:35
  • Oops, had something else in my mind when I was writing that and you are quite right. Editing out my stupidity. – Gavin Simpson Sep 17 '12 at 15:39
  • In OLS the sum of the residuals is exactly equal to zero. It is a property of minimizing the squared residuals. – Michael R. Chernick Sep 17 '12 at 16:24
  • @MichaelChernick, You're right, except that `R` typically reports a mean of `1e-14` or something similar. That's why I left my comment as "close to zero" rather than "exactly zero". – assumednormal Sep 17 '12 at 18:38
  • Indeed @Max, and hence why I left in the close to zero bit when I made my edit. Mathematically it should be 0, but the computer will report it as something almost zero, which might confuse people if they are not aware of this. – Gavin Simpson Sep 17 '12 at 19:23