lmPerm::lmp(y~x*f,center=TRUE) vs lm(y~x*f): very different coefficients

Question

While

lmp(y~x, center=TRUE,perm="Prob")
lm(y~x)

give similar results when x and y are both quantitative variables,

lmp(y~x*f, center=TRUE,perm="Prob")
lm(y~x*f)

give different coefficients when f is a factor variable.

require(lmPerm)
## Test data
x <- 1:1000
set.seed(1000)
y1 <- x*2+runif(1000,-100,100)
y1 <- y1+min(y1)
y2 <- 0.75*y1 + abs(rnorm(1000,50,10))
datos <- data.frame(x =c(x,x),y=c(y1,y2),tipo=factor(c(rep("A",1000),rep("B",1000))))

Then as expected,

coefficients(lmp(y~x,perm="Prob",data=datos,center=FALSE))
# [1] "Settings:  unique SS "
# (Intercept)           x 
#   -37.69542     1.74498 

coefficients(lm(y~x,data=datos))
# (Intercept)           x 
#   -37.69542     1.74498

But

fit.lmp <- lmp(y~x*tipo,perm="Prob",data=datos,center=FALSE)
fit.lm  <- lm(y~x*tipo, data=datos)

coefficients(fit.lm)
# (Intercept)           x       tipoB     x:tipoB 
# -71.1696395   1.9933827  66.9484438  -0.4968049 

coefficients(fit.lmp)
# (Intercept)           x       tipo1     x:tipo1 
# -37.6954176   1.7449803 -33.4742219   0.2484024

I understand the coefficients from lm():

coefficients(fit.lm)[1:2] # coefficients for Level A
# (Intercept)           x 
# -71.169640    1.993383 

coefficients(fit.lm)[1:2] + coefficients(fit.lm)[3:4] # coefficients for Level B
# (Intercept)           x 
#   -4.221196    1.496578

Which corresponds to

contrasts(datos$tipo)
#  B
#A 0
#B 1
#attributes(fit.lm$qr$qr)$contrasts
#$tipo
#[1] "contr.treatment"

but not those from lmp(), where I have to add and subtract the last two coefficients to recover the same per-level values:

coefficients(fit.lmp)[1:2] + coefficients(fit.lmp)[3:4] # coefficients for Level A
# (Intercept)           x 
# -71.169640    1.993383 

coefficients(fit.lmp)[1:2] - coefficients(fit.lmp)[3:4] # coefficients for Level B
# (Intercept)           x 
#  -4.221196    1.496578

Why?

score 3 · Accepted Answer · answered Sep 02 '16 at 11:41

lmp applies contr.sum rather than contr.treatment to the factor. You can reproduce its coefficients with lm by specifying the same contrasts:

lm(y~x*tipo, data=datos, contrasts = list(tipo = "contr.sum"))
#Coefficients:
#(Intercept)            x        tipo1      x:tipo1  
#   -37.6954       1.7450     -33.4742       0.2484
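
To see why the two sets of coefficients describe the same fit, here is a minimal sketch using base R only (it does not call lmPerm, so it only illustrates the contrast coding, not what lmp does internally). The names b_trt, b_sum, lines_trt and lines_sum are made up for this illustration. Under contr.sum a two-level factor is coded +1 for the first level and -1 for the second, so level A is recovered by adding the tipo1 terms and level B by subtracting them.

```r
# Rebuild the test data from the question (same seed, same steps)
x <- 1:1000
set.seed(1000)
y1 <- x*2 + runif(1000, -100, 100)
y1 <- y1 + min(y1)
y2 <- 0.75*y1 + abs(rnorm(1000, 50, 10))
datos <- data.frame(x = c(x, x), y = c(y1, y2),
                    tipo = factor(rep(c("A", "B"), each = 1000)))

# Coding matrices for a two-level factor
contr.treatment(2)  # first level coded 0, second level coded 1
contr.sum(2)        # first level coded 1, second level coded -1

# Same model, two parameterizations
b_trt <- coef(lm(y ~ x * tipo, data = datos))
b_sum <- coef(lm(y ~ x * tipo, data = datos,
                 contrasts = list(tipo = "contr.sum")))

# Intercept and slope per level, derived from each parameterization
lines_trt <- rbind(A = b_trt[1:2],              B = b_trt[1:2] + b_trt[3:4])
lines_sum <- rbind(A = b_sum[1:2] + b_sum[3:4], B = b_sum[1:2] - b_sum[3:4])

# Compare the two sets of per-level lines
all.equal(lines_trt, lines_sum)
```

With sum contrasts the intercept and the x coefficient are the averages of the per-level intercepts and slopes, which is why they match the pooled lm(y~x) fit in this balanced data set.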

Zheyuan Li

Thanks again. But I specified center=FALSE. I think that, because center=TRUE is the default, they keep contr.sum even when the user sets center=FALSE. I find both things unfortunate and confusing, since the help page more or less suggests that lmp() behaves like lm() except for the randomization. Also, is there any reason why the p-value for R-squared is not obtained by randomization in lmp? The help page states "Either permutation test p-values or the usual F-test p-values will be output", but I always get the F-test for R-squared (perhaps this should be a separate question). – user2955884 Sep 05 '16 at 07:05
