I seem to misunderstand how the dwtest function from the lmtest package in R works.

> r1 <- runif(1000, 0.0, 1.0)
> r2 <- runif(1000, 0.0, 1.0)
> dwtest(lm(r2 ~ r1 ))

    Durbin-Watson test

data:  lm(r2 ~ r1)
DW = 1.9806, p-value = 0.3789
alternative hypothesis: true autocorrelation is greater than 0

> 
> r1 <- seq(0, 1000, by=1)
> r2 <- seq(0, 1000, by=1)
> dwtest(lm(r2 ~ r1 ))

    Durbin-Watson test

data:  lm(r2 ~ r1)
DW = 2.2123, p-value = 0.8352

If I understand everything correctly, I first test two sets of random numbers against each other (which do not correlate - correct).

Then I test the numbers from 0 to 1000, incrementing, against themselves (which supposedly do not correlate - uhm... what?).

Can someone point me to the obvious error I am making?

BadPractice
  • I cannot reproduce your second example: pasting the commands as you've typed them gives me p-value < 2.2e-16. – PSzczesny Mar 18 '18 at 17:37
  • In your second example, I actually get a higher p-value than the OP: 0.9657. OP, could you please be specific about which library you are getting the dwtest function from? – user54038 Mar 18 '18 at 19:04
  • Actually, I think @merv's suggestion is more likely: we're getting weird results in this case because the residuals should really be zero, but thanks to slight rounding errors, they are not. So the test is really being performed on floating-point artifacts, which may be machine-dependent. Maybe we should really be adding some small amount of random noise: r2 <- r1 + rnorm(1001, sd=.01), or something. – user54038 Mar 18 '18 at 19:21
  • @user54038 I think more to the point is that one always needs to keep both effect size and statistical significance in mind when performing hypothesis tests. Here the effect size (of the residuals) is effectively zero, so one wouldn't even bother testing for structure in the (artifactual) residuals because it's meaningless. If the OP had also run `anova()` or `summary()` on the model, a warning would have come up that the model fits perfectly, i.e., the residuals don't matter. This also usually means something is totally wrong / artificial about the model. – merv Mar 18 '18 at 19:28
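
A quick sketch of the point raised in the last two comments (illustrative only; the exact residual magnitudes are machine-dependent):

library(lmtest)

r1 <- seq(0, 1000, by=1)
r2 <- seq(0, 1000, by=1)
fit <- lm(r2 ~ r1)

## The fit is exact, so the "residuals" are pure floating-point rounding
## noise (typically on the order of 1e-13), not real structure:
range(residuals(fit))

## summary() flags the degenerate fit with a warning along the lines of
## "essentially perfect fit: summary may be unreliable":
summary(fit)

## With a little genuine noise added, as user54038 suggests, dwtest() is
## again testing something meaningful:
r2 <- r1 + rnorm(length(r1), sd=0.01)
dwtest(lm(r2 ~ r1))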

1 Answer


Looking at Wikipedia, it seems the Durbin-Watson test is for autocorrelation of the residuals, not for correlation between the variables. So, if I define r2 <- r1 + sin(r1), then I get a significant result from the DW test:

> r1 <- seq(0, 1000, by=1)
> r2 <- r1 + sin(r1)
> dwtest(lm(r2 ~ r1))

    Durbin-Watson test

data:  lm(r2 ~ r1)
DW = 0.91956, p-value < 2.2e-16
alternative hypothesis: true autocorrelation is greater than 0

Here's the reason. The value of r2[i] predicted by the linear model is approximately r1[i] (the fitted line has slope very close to 1 and intercept close to 0). The residual, which is the difference between the actual and predicted values, is therefore roughly r2[i] - r1[i] = sin(r1[i]). If this is above zero, then r2[i+1] - r1[i+1] is probably also above zero, since they are neighboring values of the sine function. Therefore, there's "autocorrelation" in the residuals, meaning correlation between neighboring values.
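
One way to verify this explanation is to compute the lag-1 correlation of the residuals directly (a quick sketch; for this model the correlation comes out near cos(1) ≈ 0.54, which matches the DW = 0.91956 above via the approximation DW ≈ 2(1 - correlation)):

library(lmtest)

r1 <- seq(0, 1000, by=1)
r2 <- r1 + sin(r1)
res <- residuals(lm(r2 ~ r1))

## Correlation between each residual and the next one: clearly positive,
## since neighboring values of sin() move together.
cor(res[-length(res)], res[-1])

## The autocorrelation function tells the same story.
acf(res, lag.max=5, plot=FALSE)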

user54038