I am fitting a linear model in R with three variables like so

cube_mod <- lm(y ~ x + x_2 + x_3)

I then use the anova function to display the analysis of variance and get the following table

anova(cube_mod)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq  F value   Pr(>F)    
x          1     21      21   0.0083 0.928881    
x_2        1 658209  658209 254.2771 2.26e-10 ***
x_3        1  64967   64967  25.0977 0.000191 ***
Residuals 14  36240    2589                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The table shows the F-test for each variable separately, but I want the following table which shows only the F-test for the full model.

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq  F value   Pr(>F)    
Model      3 723197  241066    93.13        0 
Residuals 14  36240    2589                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Is there a simple way to get this table from a linear model object?

Darren Tsai
spencergw
  • You are doing analysis of variance, so it will break it down as shown above. The F-test for the full model is given in the summary of the model, i.e. run `summary(cube_mod)` and you will have all the values you need to create the table. – Onyambu Jul 24 '23 at 14:25
  • There is also the `Anova` function from the [`car`](https://cran.r-project.org/web/packages/car/index.html) package : `car::Anova(cube_mod)` – user20650 Jul 24 '23 at 18:12

4 Answers

1) Using the built-in anscombe data.frame

Model <- as.matrix(anscombe[6:8])
anova(lm(y1 ~ Model, anscombe))

giving:

Analysis of Variance Table

Response: y1
          Df Sum Sq Mean Sq F value  Pr(>F)  
Model      3 24.285  8.0948  3.3355 0.08577 .
Residuals  7 16.988  2.4269                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

2) or in terms of the lm object, fm, as discussed in the comments

fm <- lm(y1 ~ y2 + y3 + y4, anscombe)

Model <- model.matrix(fm)
anova(update(fm, . ~ Model))
# same output as above

3) Another approach is to use aov1 from sasLM:

library(sasLM)
aov1(y1 ~ y2 + y3 + y4, anscombe)[c(1, 5), ]
##           Df   Sum Sq  Mean Sq  F value     Pr(>F)
## MODEL      3 24.28449 8.094828 3.335479 0.08576858
## RESIDUALS  7 16.98821 2.426887       NA         NA
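
Approaches 1) and 2) appear to work because a matrix on the right-hand side of a formula is treated as a single term, so anova() pools its columns into one row. A minimal sketch to check this, assuming the same anscombe model as above (the names X, pooled and sequential are only illustrative):

```r
# Fit the model once with separate terms and once with a single matrix term.
fm <- lm(y1 ~ y2 + y3 + y4, anscombe)
X <- as.matrix(anscombe[c("y2", "y3", "y4")])

pooled <- anova(lm(y1 ~ X, anscombe))   # one row for the matrix term X
sequential <- anova(fm)                 # one row per predictor

# The pooled sum of squares should equal the sum of the sequential ones.
n_terms <- nrow(sequential) - 1         # last row is Residuals
pooled_ss <- pooled["X", "Sum Sq"]
summed_ss <- sum(sequential[seq_len(n_terms), "Sum Sq"])
all.equal(pooled_ss, summed_ss)
```

The row is labelled with the name of the matrix object, which is why the approaches above call the matrix Model.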

Update

Added approach using fm, simplified it a bit and switched model to use y2, y3 and y4 as independent variables since x1, x2 and x3 are all the same in anscombe. Also added solution using sasLM package.

anscombe
##    x1 x2 x3 x4    y1   y2    y3    y4
## 1  10 10 10  8  8.04 9.14  7.46  6.58
## 2   8  8  8  8  6.95 8.14  6.77  5.76
## 3  13 13 13  8  7.58 8.74 12.74  7.71
## 4   9  9  9  8  8.81 8.77  7.11  8.84
## 5  11 11 11  8  8.33 9.26  7.81  8.47
## 6  14 14 14  8  9.96 8.10  8.84  7.04
## 7   6  6  6  8  7.24 6.13  6.08  5.25
## 8   4  4  4 19  4.26 3.10  5.39 12.50
## 9  12 12 12  8 10.84 9.13  8.15  5.56
## 10  7  7  7  8  4.82 7.26  6.42  7.91
## 11  5  5  5  8  5.68 4.74  5.73  6.89
G. Grothendieck
  • +1 This is really neat! `Model <- model.matrix(formula, data)` might be better since it can easily incorporate factors. – one Jul 24 '23 at 14:40
  • This is great. Though note that OP states *Is there a simple way to get this table from a linear model object?* Meaning they already have the linear model object. How to get the ANOVA table from the object they already have is the issue. – Onyambu Jul 24 '23 at 14:40

I use the example data in @G.Grothendieck's answer.


You can compare the model with an intercept-only null model within anova().

Model <- lm(y1 ~ y2 + y3 + y4, anscombe)
anova(update(Model, . ~ 1), Model)

# Analysis of Variance Table
# 
# Model 1: y1 ~ 1
# Model 2: y1 ~ y2 + y3 + y4
#   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
# 1     10 41.273                              
# 2      7 16.988  3    24.285 3.3355 0.08577 .

It shows the same overall F-statistic and p-value as summary(Model), and the sequential sums of squares in anova(Model) add up to the same model sum of squares (24.285).

summary(Model)
# ...skip
# F-statistic: 3.335 on 3 and 7 DF,  p-value: 0.08577

anova(Model)
# Response: y1
#           Df  Sum Sq Mean Sq F value  Pr(>F)  
# y2         1 23.2162 23.2162  9.5663 0.01749 *
# y3         1  0.0487  0.0487  0.0200 0.89139  
# y4         1  1.0196  1.0196  0.4201 0.53755  
# Residuals  7 16.9882  2.4269
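
If only the numbers are needed rather than a printed table, the overall F-statistic is also stored in the summary object. A minimal sketch, assuming the same anscombe model (f_stat and p_value are illustrative names; the p-value is not stored, so it is recomputed with pf()):

```r
Model <- lm(y1 ~ y2 + y3 + y4, anscombe)

# summary.lm() keeps the overall F-test as a named vector:
# value (the F-statistic), numdf and dendf (its degrees of freedom).
f_stat <- summary(Model)$fstatistic

# Upper-tail probability of the F distribution gives the p-value.
p_value <- pf(f_stat[["value"]], f_stat[["numdf"]], f_stat[["dendf"]],
              lower.tail = FALSE)

f_stat
p_value
```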
Darren Tsai

Try the supernova function from the R package supernova, something like:

library(supernova)
supernova(lm(mpg ~ disp + cyl, data = mtcars))
Analysis of Variance Table (Type III SS)
 Model: mpg ~ disp + cyl

                               SS df      MS      F   PRE     p
 ----- --------------- | -------- -- ------- ------ ----- -----
 Model (error reduced) |  855.307  2 427.653 45.808 .7596 .0000
  disp                 |   37.594  1  37.594  4.027 .1219 .0542
   cyl                 |   46.418  1  46.418  4.972 .1464 .0337
 Error (from model)    |  270.740 29   9.336                   
 ----- --------------- | -------- -- ------- ------ ----- -----
 Total (empty model)   | 1126.047 31  36.324  
MYaseen208

You can manually calculate it:


fit <- lm(mpg ~ wt + qsec + as.factor(cyl), mtcars)
temp <- anova(fit)

out <- temp
n <- nrow(temp)
out$Df <- with(temp,c(sum(Df[1:(n-1)]),Df[n],rep(NA_real_,n-2)))
out$`Sum Sq` <- with(temp,c(sum(`Sum Sq`[1:(n-1)]),`Sum Sq`[n],rep(NA_real_,n-2)))
out$`Mean Sq` <- with(out,out$`Sum Sq`/out$Df)
out$`F value` <- c(out$`Mean Sq`[1]/out$`Mean Sq`[2],rep(NA_real_,n-1))
out$`Pr(>F)` <- c(pf(out$`F value`[1],out$Df[1],out$Df[2],lower.tail = FALSE),rep(NA_real_,n-1))
out <- out[1:2,]
rownames(out) <- c("Model","Residuals")
out

Analysis of Variance Table

Response: mpg
          Df Sum Sq Mean Sq F value    Pr(>F)    
Model      4 953.94 238.484  37.413 1.208e-10 ***
Residuals 27 172.11   6.374                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
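
The same calculation could be wrapped in a small function so it can be reused on any lm object. This is only a sketch: overall_anova is a hypothetical helper, not part of base R or any package, and it assumes anova(fit) has at least one term row followed by a Residuals row.

```r
# Hypothetical helper: collapse the term rows of anova(fit) into one "Model" row.
overall_anova <- function(fit) {
  tab <- anova(fit)
  k <- nrow(tab)                                  # last row is Residuals
  df <- c(sum(tab$Df[-k]), tab$Df[k])             # model df, residual df
  ss <- c(sum(tab$`Sum Sq`[-k]), tab$`Sum Sq`[k]) # model SS, residual SS
  ms <- ss / df
  f <- ms[1] / ms[2]
  out <- data.frame(
    Df = df, `Sum Sq` = ss, `Mean Sq` = ms,
    `F value` = c(f, NA),
    `Pr(>F)` = c(pf(f, df[1], df[2], lower.tail = FALSE), NA),
    row.names = c("Model", "Residuals"), check.names = FALSE
  )
  # Reuse the original heading and the "anova" class so it prints like anova().
  structure(out, heading = attr(tab, "heading"),
            class = c("anova", "data.frame"))
}

fit <- lm(mpg ~ wt + qsec + as.factor(cyl), mtcars)
tab <- overall_anova(fit)
tab
```

Summing the sequential sums of squares is safe here because, for a single lm fit, they always add up to the total model sum of squares regardless of term order.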
                
one