
I have a question concerning the function coeftest(). I am trying to figure out where I could get the R-squared that goes with its output. I fitted a standard multiple linear regression as follows:

Wetterstation.lm <- lm(temp~t+temp_auto+dum.jan+dum.feb+dum.mar+dum.apr+dum.may+dum.jun+dum.aug+dum.sep+dum.oct+dum.nov+dum.dec+
                      dum.jan*t+dum.feb*t+dum.mar*t+dum.apr*t+dum.may*t+dum.jun*t+dum.aug*t+dum.sep*t+dum.oct*t+dum.nov*t+dum.dec*t)

Beforehand I defined each of these variables separately, and my results were the following:


> summary(Wetterstation.lm)

Call:
lm(formula = temp ~ t + temp_auto + dum.jan + dum.feb + dum.mar + 
    dum.apr + dum.may + dum.jun + dum.aug + dum.sep + dum.oct + 
    dum.nov + dum.dec + dum.jan * t + dum.feb * t + dum.mar * 
    t + dum.apr * t + dum.may * t + dum.jun * t + dum.aug * t + 
    dum.sep * t + dum.oct * t + dum.nov * t + dum.dec * t)

Residuals:
     Min       1Q   Median       3Q      Max 
-10.9564  -1.3214   0.0731   1.3621   9.9312 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.236e+00  9.597e-02  33.714  < 2e-16 ***
t            1.206e-05  3.744e-06   3.221  0.00128 ** 
temp_auto    8.333e-01  2.929e-03 284.503  < 2e-16 ***
dum.jan     -3.550e+00  1.252e-01 -28.360  < 2e-16 ***
dum.feb     -3.191e+00  1.258e-01 -25.367  < 2e-16 ***
dum.mar     -2.374e+00  1.181e-01 -20.105  < 2e-16 ***
dum.apr     -1.582e+00  1.142e-01 -13.851  < 2e-16 ***
dum.may     -7.528e-01  1.106e-01  -6.809 9.99e-12 ***
dum.jun     -3.283e-01  1.106e-01  -2.968  0.00300 ** 
dum.aug     -2.144e-01  1.094e-01  -1.960  0.05005 .  
dum.sep     -8.009e-01  1.103e-01  -7.260 3.96e-13 ***
dum.oct     -1.752e+00  1.123e-01 -15.596  < 2e-16 ***
dum.nov     -2.622e+00  1.181e-01 -22.198  < 2e-16 ***
dum.dec     -3.287e+00  1.226e-01 -26.808  < 2e-16 ***
t:dum.jan    2.626e-06  5.277e-06   0.498  0.61877    
t:dum.feb    2.479e-06  5.404e-06   0.459  0.64642    
t:dum.mar    1.671e-06  5.277e-06   0.317  0.75145    
t:dum.apr    1.357e-06  5.320e-06   0.255  0.79872    
t:dum.may   -3.173e-06  5.276e-06  -0.601  0.54756    
t:dum.jun    2.481e-06  5.320e-06   0.466  0.64098    
t:dum.aug    5.998e-07  5.298e-06   0.113  0.90985    
t:dum.sep   -5.997e-06  5.321e-06  -1.127  0.25968    
t:dum.oct   -5.860e-06  5.277e-06  -1.110  0.26683    
t:dum.nov   -4.215e-06  5.320e-06  -0.792  0.42820    
t:dum.dec   -2.526e-06  5.277e-06  -0.479  0.63217    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.12 on 35744 degrees of freedom
Multiple R-squared:  0.9348,    Adjusted R-squared:  0.9348 
F-statistic: 2.136e+04 on 24 and 35744 DF,  p-value: < 2.2e-16

Now I am trying to adjust for heteroskedasticity and autocorrelation using the function coeftest() with vcovHAC as follows:

library("lmtest")
library("sandwich")
Wetterstation.lm.HAC <- coeftest(Wetterstation.lm, vcov = vcovHAC)

The results of this are the following:

> Wetterstation.lm.HAC

t test of coefficients:

               Estimate  Std. Error  t value  Pr(>|t|)    
(Intercept)  3.2356e+00  7.8816e-02  41.0529 < 2.2e-16 ***
t            1.2059e-05  2.7864e-06   4.3280 1.509e-05 ***
temp_auto    8.3334e-01  2.9798e-03 279.6659 < 2.2e-16 ***
dum.jan     -3.5505e+00  1.1843e-01 -29.9789 < 2.2e-16 ***
dum.feb     -3.1909e+00  1.2296e-01 -25.9507 < 2.2e-16 ***
dum.mar     -2.3741e+00  1.0890e-01 -21.8002 < 2.2e-16 ***
dum.apr     -1.5821e+00  9.5952e-02 -16.4881 < 2.2e-16 ***
dum.may     -7.5282e-01  8.8987e-02  -8.4599 < 2.2e-16 ***
dum.jun     -3.2826e-01  8.2271e-02  -3.9899 6.622e-05 ***
dum.aug     -2.1440e-01  7.7966e-02  -2.7499  0.005964 ** 
dum.sep     -8.0094e-01  8.4456e-02  -9.4835 < 2.2e-16 ***
dum.oct     -1.7519e+00  9.2919e-02 -18.8538 < 2.2e-16 ***
dum.nov     -2.6224e+00  1.0028e-01 -26.1504 < 2.2e-16 ***
dum.dec     -3.2873e+00  1.1393e-01 -28.8546 < 2.2e-16 ***
t:dum.jan    2.6256e-06  5.2429e-06   0.5008  0.616517    
t:dum.feb    2.4790e-06  5.5284e-06   0.4484  0.653850    
t:dum.mar    1.6713e-06  4.8632e-06   0.3437  0.731107    
t:dum.apr    1.3567e-06  4.5670e-06   0.2971  0.766423    
t:dum.may   -3.1734e-06  4.2970e-06  -0.7385  0.460209    
t:dum.jun    2.4809e-06  4.1490e-06   0.5979  0.549880    
t:dum.aug    5.9983e-07  4.0379e-06   0.1485  0.881910    
t:dum.sep   -5.9975e-06  4.1675e-06  -1.4391  0.150125    
t:dum.oct   -5.8595e-06  4.4635e-06  -1.3128  0.189265    
t:dum.nov   -4.2151e-06  4.6555e-06  -0.9054  0.365263    
t:dum.dec   -2.5257e-06  4.9871e-06  -0.5065  0.612539    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

But as I want to add the R-squared to a table that summarizes my results, I cannot figure out how to get it. I was wondering if anyone could help with this issue and tell me where I could get this information from. Maybe I am just too dumb, but as I am quite new to R I would be happy for any help I can get.

Thank you very much in advance.

    `coeftest()` is from the lmtest package? – mischva11 Nov 25 '19 at 23:18
  • As I further dealt with the topic, I figured out that HAC only adjusts the standard errors and does not change the way the coefficients are estimated. Is that correct, or am I just misled? If that is true, the R^2 would not change, as the residuals stay the same. Wouldn't that be true? And if so, isn't it strange that coeftest shows different parameter estimates? I am totally confused now. – Leon Nov 25 '19 at 23:19
  • @mischva11 yes it is from the package "lmtest" – Leon Nov 25 '19 at 23:21

1 Answer


Simple answer: no, there is not. And there is also no reason for it. The coeftest() function uses the values of your given model: it takes the coefficients of the model via coef().

It would only be possible to extract the R^2 value if the function were designed to return it. Also, lmtest's coeftest() only returns the coefficient table, so there is no R^2 in its output to extract.

To sum this up: lmtest::coeftest() uses the values from lm, so the R^2 does not change; you can take it from the lm fit (or its summary) directly.
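
If you just need the R^2 for your summary table, a minimal sketch is to pull it straight from the original lm fit (assuming the Wetterstation.lm object from the question):

r2     <- summary(Wetterstation.lm)$r.squared      # multiple R-squared
adj_r2 <- summary(Wetterstation.lm)$adj.r.squared  # adjusted R-squared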


Background about the standard error: lm uses a slightly different method to calculate the standard error. In the source code you can find:

  R <- chol2inv(Qr$qr[p1, p1, drop = FALSE])   # (X'X)^-1 from the QR factor
  se <- sqrt(diag(R) * resvar)                 # resvar is the residual variance

So this brings us to the following: lm obtains (X'X)^-1 through chol2inv() applied to the triangular factor of its QR decomposition, i.e. a Cholesky-style inversion (I also think R uses this route by default).
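
As a rough sketch (not the exact lm internals), the classical standard errors from the summary above can be reproduced like this, assuming the Wetterstation.lm fit from the question:

X      <- model.matrix(Wetterstation.lm)
resvar <- sum(residuals(Wetterstation.lm)^2) / df.residual(Wetterstation.lm)
XtXinv <- chol2inv(chol(crossprod(X)))   # (X'X)^-1 via a Cholesky factor
se_ols <- sqrt(diag(XtXinv) * resvar)    # classical OLS standard errors
# should be TRUE (up to numerical tolerance)
all.equal(unname(se_ols), unname(coef(summary(Wetterstation.lm))[, "Std. Error"]))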

coeftest(), meanwhile, uses the sqrt() of the diagonal of the variance-covariance matrix of the coefficient estimates (see here); this is where the autocorrelation adjustment enters. vcov extracts the variance-covariance matrix of a given model (like lm):

se <- vcov.(x)        # the supplied covariance matrix, here vcovHAC applied to the lm fit
se <- sqrt(diag(se))  # standard errors are the square roots of its diagonal
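
In other words, the Std. Error column of the coeftest() output above should simply be the square root of the diagonal of whatever covariance matrix you pass in. A small sketch, assuming the objects from the question:

se_hac <- sqrt(diag(vcovHAC(Wetterstation.lm)))  # HAC standard errors
se_ols <- sqrt(diag(vcov(Wetterstation.lm)))     # classical standard errors
head(cbind(coeftest = Wetterstation.lm.HAC[, "Std. Error"], HAC = se_hac, OLS = se_ols))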

I personally think the authors of lm use the Cholesky-style inversion for speed reasons, since you do not have to invert the full matrix explicitly. But the writers of the lmtest package were not aware of this. This is just a guess, though.

Since the t values are calculated in both packages from the estimated coefficients and the standard errors like this:

tval <- as.vector(est)/se

and the p-value is based on the tval:

2*pt(abs(tval), rdf, lower.tail = FALSE)

all the differences come from the differently estimated standard errors. Please be aware that the coefficient estimates are identical, because coeftest() just inherits them from the lm fit.
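
Putting the pieces together, here is a sketch of how the coeftest() table for the HAC case could be rebuilt by hand (again assuming the Wetterstation.lm fit and the sandwich package from the question):

est  <- coef(Wetterstation.lm)                 # same coefficients as in the lm fit
se   <- sqrt(diag(vcovHAC(Wetterstation.lm)))  # HAC standard errors
tval <- as.vector(est) / se
rdf  <- df.residual(Wetterstation.lm)
pval <- 2 * pt(abs(tval), rdf, lower.tail = FALSE)
head(cbind(Estimate = est, "Std. Error" = se, "t value" = tval, "Pr(>|t|)" = pval))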

  • When you are referring to the coef() function, do you mean coeftest()? So to sum this up does it mean that the only thing that changes is the standard error estimation? And if so why does the coeftest() function return other parameter estimates? – Leon Nov 26 '19 at 09:22
  • @Leon updated the answer, sorry for the inconvenience; yes, I meant `coeftest`. – mischva11 Nov 26 '19 at 12:50