2

I am trying to develop a mixed effects model on a data set with repeated measures.

Met is measured on a series of randomly selected days on 24 samples submitted to 3 treatments (Treat, with levels c, uc and ga).

The levels of Met change due to differences in weather conditions during the days (Date). Date thus becomes a second random effect of the model (along with the items sampled (ID)).

My main interest is to see whether Treat has a significant effect on Met across days.

some sample data:

# create example data frame 
ID     <-  factor(rep(c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x"), 6))
Treat  <-  factor(rep(c(rep("c",8), rep("uc",8), rep("ga",8)), 6))
Date   <-  factor(rep(c(rep("10/06/2007",24), rep("19/06/2007",24), rep("12/07/2007",24), rep("21/07/2007",24), rep("11/08/2007",24), rep("12/08/2007",24)), 1))
Met    <-  as.numeric(c(rnorm(8,5,2),   rnorm(8,7,2),   rnorm(8,9,2), 
                        rnorm(8,15,2),  rnorm(8,17,2),  rnorm(8,19,2),
                        rnorm(8,9,2),   rnorm(8,11,2),  rnorm(8,13,2),
                        rnorm(8,8,2),   rnorm(8,10,2),  rnorm(8,12,2),
                        rnorm(8,2,2),   rnorm(8,4,2),   rnorm(8,6,2),
                        rnorm(8,3,2),   rnorm(8,5,2),   rnorm(8,7,2)))
ww     <-  gl(1,1,144)

lys.data  <-  data.frame(ID, Treat, Date, Met, ww)
head(lys.data)

# set contrasts of data frame
lys.data$Treat   <-  factor(lys.data$Treat,     levels=c("c", "uc", "ga"))

Then the analysis:

library(nlme)
lme.001  <-  lme(Met ~ Treat, data = lys.data,
                 random=list(ww=pdBlocked(list(pdIdent(~Date-1),
                             pdIdent(~ID-1)))))
summary(lme.001)

From the results I get it seems that I am not doing what I assume I am doing as the degrees of freedom seem incorrect (way too high). Is it correct that the number of denominator degrees of freedom increase with the number of repetitions (dates) that the experiment has been performed?

Who can help me out here or point me into the right direction? Am I going wrong with the way the I am representing the nesting of the data? (I assume there is none).

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
thijs van den bergh
  • 608
  • 1
  • 9
  • 17
  • This question may be more at home on stackoverflow, since it is only a question about how to use `lme`. – Macro Jan 09 '13 at 15:38
  • 2
    I think it is special enough that nobody at SO will have a clue except possibly whuber if he is there :). – StasK Jan 09 '13 at 16:49
  • 1
    The website http://stats.stackexchange.com was a better place for this question. – Sven Hohenstein Jan 10 '13 at 09:36
  • 1
    thats where my question got moved from ... – thijs van den bergh Jan 10 '13 at 10:04
  • Can you explain what led you to use that random structure instead of `random=~Date|ID`? Was there something in the residuals or the `lmList` plot that suggested this particular expression for the random effects? – bokov Jul 29 '13 at 13:48

1 Answers1

3

The rules that lme uses to compute denominator degrees of freedom are described on p. 91 of Pinheiro and Bates (2000) -- this page happens to be available on Google Books. (That link is also available on the GLMM faq page.)

update: since this no longer seems to be available in a useful form on Google Books, here's the text of the critical paragraphs:

These conditional tests for fixed-effects terms require denominator degrees of freedom. In the case of the conditional $F$-tests, the numerator degrees of freedom are also required, being determined by the term itself. The denominator degrees of freedom are determined by the grouping level at which the term is estimated. A term is called inner relative to a factor if its value can change within a given level of the grouping factor. A term is outer to a grouping factor if its value does not changes within levels of the grouping factor. A term is said to be estimated at level $i$, if it is inner to the $i-1$st grouping factor and outer to the $i$th grouping factor. For example, the term Machine in the fm2Machine model is outer to Machine %in% Worker and iner to Worker, so it is estimated at level 2 (Machine %in% Worker). If a term is inner to all $Q$ grouping factors in a model, it is estimated at the level of the within-group errors, which we denote as the $Q+1$st level.

The intercept, which is the parameter corresponding to the column of all 1's in the model matrices $X_i$, is treated differently from all the other parameters, when it is present. As a parameter it is regarded as being estimated at level 0 because it is outer to all the grouping factors. However, its denominator degrees of freedom are calculated as if it were estimated at level $Q+1$. This is because the intercept is the one parameter that pools information from all the observations at a level even when the corresponding column in $X_i$ doesn't change with the level.

Letting $m_i$ denote the total number of groups in level $i$ (with the convention that $m_0=1$ when the fixed effects model includes an intercept and 0 otherwise, and $m_{Q+1}=N$) and $p_i$ denote the sum of the degrees of freedom corresponding to the terms estimated at level $i$, the $i$th level denominator degrees of freedom is defined as

$$ denDF_i = m_i - (m_{i-1} + p_i), i = 1, \dots, Q $$

This definition coincides with the classical decomposition of degrees of freedom in balanced, multilevel ANOVA designs and gives a reasonable approximation for more general mixed-effects models.

I haven't checked out your example in great detail, but I strongly suspect that the issue is that you have a randomized block design rather than a strictly nested design, so that your degrees of freedom are higher than you think. In general the residual/denominator df are (number of blocks-1)*(number per block-1), rather than (as you may have been expecting) the (number of blocks-1) typical of a nested design: see here, for example.

On the other hand, it's possible that lme got it wrong, if the design is sufficiently complex -- in which case you may have to work it out for yourself, or a simple solution may not exist. Again, see the GLMM faq for advice.

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453