I want to get the equation of the linear model for the following Latin square experiment (stored in mat):

data <- c(12.5,11,13,11.4)
row <- factor(rep(1:2,2))
col <- factor(rep(1:2,each=2))
car <- c("B","A","A","B")
mat <- data.frame(row,col,car,data)
mat
#  row col car data
# 1   1   1   B 12.5
# 2   2   1   A 11.0
# 3   1   2   A 13.0
# 4   2   2   B 11.4
user_012314112

2 Answers

fit <- lm(data~row+col+car,mat)
coef(fit)
# (Intercept)        row2        col2        carB 
#       12.55       -1.55        0.45       -0.05 

So the estimated effect of row level 2 (relative to level 1) is -1.55, the effect of col level 2 is 0.45, and the effect of car B (relative to car A) is -0.05. The intercept is the expected value of data when all the factors are at their first level (row=1, col=1, car=A).
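
As a check, here is a small sketch (reusing the fit object above) that writes out the fitted equation and reproduces the data from the coefficients; because the model is saturated, the reproduction is exact:

# Fitted equation under R's default treatment coding:
# data = 12.55 - 1.55*[row==2] + 0.45*[col==2] - 0.05*[car=="B"]
b <- coef(fit)
X <- model.matrix(fit)   # dummy-coded design matrix
drop(X %*% b)            # 12.5 11.0 13.0 11.4 -- identical to the observed data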

Notice that your design is over-specified: you have only 4 observations, which is just enough to estimate the effects of two two-level factors and their interaction, but you have set it up so that car is aliased with the row:col interaction. So there are no degrees of freedom left for error.
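
You can see this directly (a small sketch, again reusing fit from above):

df.residual(fit)                # 0
nobs(fit) - length(coef(fit))   # 4 observations - 4 coefficients = 0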

jlhoward
  • If I increase the size of the square, will row2, row3, ..., rowN all appear as coefficients in the model? And how would you interpret them? Thanks. – user_012314112 Nov 01 '14 at 22:51
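
(With more levels, lm() estimates one coefficient per non-reference level of each factor, each measuring the difference from the first level. A minimal sketch with a hypothetical 3x3 Latin square and made-up data:)

set.seed(1)                              # hypothetical example data
mat3 <- data.frame(row = factor(rep(1:3, each = 3)),
                   col = factor(rep(1:3, times = 3)),
                   car = c("A","B","C", "B","C","A", "C","A","B"),
                   data = rnorm(9, mean = 12))
fit3 <- lm(data ~ row + col + car, mat3)
names(coef(fit3))
# "(Intercept)" "row2" "row3" "col2" "col3" "carB" "carC"
# each rowK / colK / carX coefficient is the difference from the first level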

I might recommend using a mixed model approach to this.

mat <- data.frame(data=c(12.5,11,13,11.4),
                  row=factor(rep(1:2,2)),
                  col=factor(rep(1:2,each=2)),
                  car=c("B","A","A","B"))

I'm using lmerTest because it will more easily provide you with (approximate) p-values.

By default anova() uses the Satterthwaite approximation, or you can tell it to use the more accurate Kenward-Roger approximation. In either case you can see that the denominator df are exactly or nearly zero, and the p-value is either missing or very close to 1, indicating that your model doesn't make sense (i.e., even using the mixed model it's overparameterized).

library("lmerTest")
anova(m1 <- lmer(data~car+(1|row)+(1|col),data=mat))
anova(m1,ddf="Kenward-Roger")
##     Sum Sq Mean Sq NumDF      DenDF F.value Pr(>F)
## car 0.0025  0.0025     1 9.6578e-06  2.0019 0.9999

Try for a bigger design:

set.seed(101)
mat2 <- data.frame(data=rnorm(36),
                   row=gl(6,6),
                   col=gl(6,1,36),
                   car=sample(LETTERS[1:2],size=36,replace=TRUE))
m2A <- lm(data~car+row+col,data=mat2)
anova(m2A)
## (excerpt)
##           Df  Sum Sq Mean Sq F value Pr(>F)
## car        1  1.2571 1.25709  1.6515  0.211

m2B <- lmer(data~car+(1|row)+(1|col),data=mat2)
anova(m2B)
##     Sum Sq Mean Sq NumDF  DenDF F.value Pr(>F)
## car  1.178   1.178     1 17.098    1.56 0.2285
anova(m2B,ddf="Kenward-Roger")
##     Sum Sq Mean Sq NumDF  DenDF F.value Pr(>F)
## car  1.178   1.178     1 17.005  1.1029 0.3083

It surprises me a little bit that the lm and lmerTest answers are so far apart here -- I would have thought this was an example where there was a well-formulated "classic" answer -- but I'm not sure. Might be worth following up on CrossValidated or Google.

Ben Bolker