NLME Package documentation bug: anova.gls() and anova.lme() sums of squares

Question

The documentation for nlme's functions anova.lme() and anova.gls() clearly states:

"When only one fitted model object is present, a data frame with the sums of squares,..." is being returned.

When running the example below, no sums of squares are printed:

library(nlme)
fm1 <- lme(distance ~ age, Orthodont, random = ~ 1 | Subject)
anova(fm1)

#              numDF denDF   F-value p-value
#  (Intercept)     1    80 3126.1443  <.0001
#  age             1    80  114.8383  <.0001

This looks like a bug to me. I believe it should be reported on the R bugzilla page but I don't seem to have access there.

Running the same model using aov() results n the following:

summary(aov(distance ~ age + Error(Subject), Orthodont))

# Error: Subject
#           Df Sum Sq Mean Sq F value Pr(>F)
# Residuals 26  518.4   19.94               

# Error: Within
#           Df Sum Sq Mean Sq F value Pr(>F)    
# age        1  235.4  235.36   114.8 <2e-16 ***
# Residuals 80  164.0    2.05

This is a workaround for calculating the sums of squares for fixed effects and the residuals manually.

Based on @Drey 's comment, I tried to retrieve the sums of squares directly from the anova.lme() function. The source code can be seen by executing anova.lme. The relevant lines appear to be the one belows (for sequential and marginal sums of squares)

for (i in 1:nTerms) {
    nDF[i] <- length(assign[[i]])
    if (type == "sequential") {
      c0i <- c0[assign[[i]]]
    }
    else {
      c0i <- c(qr.qty(qr(vFix[, assign[[i]], drop = FALSE]), 
                      c0))[1:nDF[i]]
    }
    Fval[i] <- sum(c0i^2)/nDF[i]
    Pval[i] <- 1 - pf(Fval[i], nDF[i], dDF[i])

For the above model, c0i = 55.9 for the first iteration and 10.7 for the second.

sum(c0i^2) = 3126.1 for the first iteration (F-value of Intercept) and 114.8 for the second iteration (F-value for age).

The F-value for age can be calculated by dividing the Mean Sq of age (235.36) by the Mean Sq of the residuals (2.05), as shown in the table from aov(). anova.lme is doing something different here but I don't know what. Does someone know what's going on here and how to make it more consistent with the aov() implementation?

If you read carefully, the first sentence in the documentation is missing a verb. It is not *"being returned"*. It actually just shows the summary and I think the missing verb should be about what is printed in the console output. Btw, you can access the residuals via `fm1$residuals` or `residuals(fm1)`. — Drey, Oct 12 '16 at 11:00
@ Drey :The verb is added in the next sentence: "Otherwhise,... is returned". Whether it is printed to the console or assigned to a data.frame: all the values from the documentation, except the mentioned sums of squares, are there. How does extracting the residuals help here? The above mentioned functions can use sequential or marginal sums of squares (type argument) and I need to extract or recalculate them. — Konn, Oct 12 '16 at 11:18
Maybe I misunderstood you - I was thinking about sum of squared residuals. But what are you trying to extract? — Drey, Oct 12 '16 at 12:44
Anyhow, looking at the code of `anova.lme` "sum of squares" may be found around line 33 (for your example) where `Fval[i] <- sum(c0i^2)/nDF[i]` it is stored in the returned F-Value. — Drey, Oct 12 '16 at 13:03
@Drey , That's a very helpful comment. I looked into it and modified the original question to account for it. I now found a workaround, using the aov() function but this does not appear to be a good solution since more complex random effect structures can be specified using the nlme package. — Konn, Oct 12 '16 at 14:57
@Drey I hope the question now shows what I am trying to extract. The sums of squares ("Sum Sq") that are usually reported by the anova function for linear models. They are the only thing that is missing in the anova table reported by the nlme package and happen to be the values of interest for me. — Konn, Oct 12 '16 at 15:19
Dear Konn, help me to understand which sums of sq. you are looking for. I don't understand what the "usual SS" are. In `aov`/`anova` the "Sum Sq" refers to the the squared effects which in `lm` for example are computed from the QR decomposion $t(Q) %*% y$ i.e. `qr.qty(qr(cbind(1, Orthodont$age)), Orthodont$distance)[2]^2` ([2] refering to the age effect). From `lme` you can extract fixed and random effects by `random.effects(...)` and `fixed.effects`- but these are just coefs. of your model. However, `lme` computes `c0` using the same QR decomposition code. Hence "Sum Sq" are in `c0`, imho — Drey, Oct 12 '16 at 17:00
@Drey, Thanks a lot. I try to find the `lm` "Sum Sq" in `anova.lme`. Your QR decomposition code computes them correctly. Adding `print(c0)` at the end of the `anova.lme` function returns 55.9 and 10.7 for the above model. This is not the same as "Sum Sq" from `lm`. I don't understand how these values relate to "Sum Sq". I could not find `C0` or `c0i` values in the `anova.lme` function that match the `lm` "Sum Sq". If you can demonstrate where a `print()` statement in `anova.lme` returns `c0` values that match the `lm` "Sum Sq" for the above model then I am very happy to accept it as an answer. — Konn, Oct 13 '16 at 08:45
why should be "Sum Sq" the same if you use different models ? `c0` contains the effects computed by the QR decomposition (similar as to `lm`) and if you square them then you will get the desired value - still I don't understand what sum of squares you are trying to compute. Anyhow, to get the `c0^2` simply multiply the `F-value` by `numDF` from the `aov` result. That is what i wrote in the 4th comment: "sum of squares" are in the F-values. — Drey, Oct 13 '16 at 09:09
I try to find the exact "Sum Sq" that are reported by `aov()`: 235.4 for age and 164 for residuals. I am using only 1 model: `lme(distance ~ age, Orthodont, random = ~ 1 | Subject)` is the same as `aov(distance ~ age + Error(Subject), Orthodont)`. I showed in the original question that the results are identical except the missing "Sum Sq" from `nlme`. Squaring `c0` results in `F-value`, not the desired "SumSq", as shown in the question. Multiplying `F-value` by `numDF` does not compute the desired "SumSq"neither . F-value for age = 114. numDF = 1. --> 114 * 1 is not 325 ("SumSq" for age). — Konn, Oct 13 '16 at 09:40

NLME Package documentation bug: anova.gls() and anova.lme() sums of squares

0 Answers0