Post hoc for binary GLMM (lme4) and plot

Question

So I'm an R novice attempting a GLMM and post hoc analysis... help! I've collected binary data on 9 damselflys under 6 light levels, 1=response to movement of optomotor drum, 0=no response. My data was imported into R with the headings 'Animal_ID, light_intensity, response'. Animal ID (1-9) repeated for each light intensity (3.36-0.61) (see below)

Using the following code (lme4 package), I've performed a GLMM and found a light level to have a significant effect on response:

d = data.frame(id = data[,1], var = data$Light_Intensity, Response = data$Response)
model <- glmer(Response~var+(1|id),family="binomial",data=d)
summary(model)

Returns

Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
Family: binomial  ( logit )
Formula: Response ~ var + (1 | Animal_ID)
Data: d

 AIC      BIC   logLik deviance df.resid 
  66       72      -30       60       51 

Scaled residuals: 
Min      1Q  Median      3Q     Max 
-3.7704 -0.6050  0.3276  0.5195  1.2463 

Random effects:
Groups    Name        Variance Std.Dev.
Animal_ID (Intercept) 1.645    1.283   
Number of obs: 54, groups:  Animal_ID, 9

Fixed effects:
        Estimate Std. Error z value Pr(>|z|)  
(Intercept)  -1.7406     1.0507  -1.657   0.0976 .
var           1.1114     0.4339   2.561   0.0104 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
(Intr)
var -0.846

Then running:

m1 <- update(model, ~.-var)
anova(model, m1, test = 'Chisq')

Returns

Data: d
Models:
m1: Response ~ (1 | Animal_ID)
model: Response ~ var + (1 | Animal_ID)
      Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)   
m1     2 72.555 76.533 -34.278   68.555                            
model  3 66.017 71.983 -30.008   60.017 8.5388      1   0.003477 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

I've installed the multcomp and lsmeans packages in an attempt at performing a Tukey post hoc to see where the difference is, but have run into difficulties with both.

Running:

summary(glht(m1,linfct=mcp("Animal_ID"="Tukey")))

Returns: "Error in mcp2matrix(model, linfct = linfct) : Variable(s) ‘Animal_ID’ have been specified in ‘linfct’ but cannot be found in ‘model’! "

Running:

lsmeans(model,pairwise~Animal_ID,adjust="tukey")

Returns: "Error in lsmeans.character.ref.grid(object = new("ref.grid", model.info = list( : No variable named Animal_ID in the reference grid"

I'm aware that I'm probably being very stupid here, but any help would be very much appreciated. My confusion is snowballing.

Also, does anyone have any suggestions as to how I might best visualize my results (and how to do this)?

Thank you very much in advance!

UPDATE:

New code-

Light <- c("3.36","3.36","3.36","3.36","3.36","3.36","3.36","3.36","3.36","2.98","2.98","2.98","2.98","2.98","2.98","2.98","2.98","2.98","2.73","2.73","2.73","2.73","2.73","2.73","2.73","2.73","2.73","2.15","2.15","2.15","2.15","2.15","2.15","2.15","2.15","2.15","1.72","1.72","1.72","1.72","1.72","1.72","1.72","1.72","1.72","0.61","0.61","0.61","0.61","0.61","0.61","0.61","0.61","0.61")
Subject <- c("1","2","3","4","5","6","7","8","9","1","2","3","4","5","6","7","8","9","1","2","3","4","5","6","7","8","9","1","2","3","4","5","6","7","8","9","1","2","3","4","5","6","7","8","9","1","2","3","4","5","6","7","8","9")
Value <- c("1","0","1","0","1","1","1","0","1","1","0","1","1","1","1","1","1","1","0","1","1","1","1","1","1","0","1","0","0","1","1","1","1","1","1","1","0","0","0","1","0","0","1","0","1","0","0","0","1","1","0","1","0","0")

data <- data.frame(Light, Subject, Value)

library(lme4)

model <- glmer(Value~Light+(1|Subject),family="binomial",data=data)
summary(model)

Returns:

Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [
glmerMod]
 Family: binomial  ( logit )
Formula: Value ~ Light + (1 | Subject)
   Data: data

     AIC      BIC   logLik deviance df.resid 
    67.5     81.4    -26.7     53.5       47 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.6564 -0.4884  0.2193  0.3836  1.2418 

Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 2.687    1.639   
Number of obs: 54, groups:  Subject, 9

Fixed effects:
              Estimate Std. Error z value Pr(>|z|)  
(Intercept) -1.070e+00  1.053e+00  -1.016   0.3096  
Light1.72   -7.934e-06  1.227e+00   0.000   1.0000  
Light2.15    2.931e+00  1.438e+00   2.038   0.0416 *
Light2.73    2.931e+00  1.438e+00   2.038   0.0416 *
Light2.98    4.049e+00  1.699e+00   2.383   0.0172 *
Light3.36    2.111e+00  1.308e+00   1.613   0.1067  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
          (Intr) Lg1.72 Lg2.15 Lg2.73 Lg2.98
Light1.72 -0.582                            
Light2.15 -0.595  0.426                     
Light2.73 -0.595  0.426  0.555              
Light2.98 -0.534  0.361  0.523  0.523       
Light3.36 -0.623  0.469  0.553  0.553  0.508

Then running:

m1 <- update(model, ~.-Light)
anova(model, m1, test= 'Chisq')

Returns:

Data: data
Models:
m1: Value ~ (1 | Subject)
model: Value ~ Light + (1 | Subject)
      Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)  
m1     2 72.555 76.533 -34.278   68.555                           
model  7 67.470 81.393 -26.735   53.470 15.086      5       0.01 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Finally, running:

library(lsmeans)
lsmeans(model,list(pairwise~Light),adjust="tukey")

Returns (it actually works now!):

$`lsmeans of Light`
 Light    lsmean       SE df  asymp.LCL asymp.UCL
 0.61  -1.070208 1.053277 NA -3.1345922 0.9941771
 1.72  -1.070216 1.053277 NA -3.1345997 0.9941687
 2.15   1.860339 1.172361 NA -0.4374459 4.1581244
 2.73   1.860332 1.172360 NA -0.4374511 4.1581149
 2.98   2.978658 1.443987 NA  0.1484964 5.8088196
 3.36   1.040537 1.050317 NA -1.0180467 3.0991215

Results are given on the logit (not the response) scale. 
Confidence level used: 0.95 

$`pairwise differences of contrast`
 contrast         estimate       SE df z.ratio p.value
 0.61 - 1.72  7.933829e-06 1.226607 NA   0.000  1.0000
 0.61 - 2.15 -2.930547e+00 1.438239 NA  -2.038  0.3209
 0.61 - 2.73 -2.930539e+00 1.438237 NA  -2.038  0.3209
 0.61 - 2.98 -4.048866e+00 1.699175 NA  -2.383  0.1622
 0.61 - 3.36 -2.110745e+00 1.308395 NA  -1.613  0.5897
 1.72 - 2.15 -2.930555e+00 1.438239 NA  -2.038  0.3209
 1.72 - 2.73 -2.930547e+00 1.438238 NA  -2.038  0.3209
 1.72 - 2.98 -4.048874e+00 1.699175 NA  -2.383  0.1622
 1.72 - 3.36 -2.110753e+00 1.308395 NA  -1.613  0.5897
 2.15 - 2.73  7.347728e-06 1.357365 NA   0.000  1.0000
 2.15 - 2.98 -1.118319e+00 1.548539 NA  -0.722  0.9793
 2.15 - 3.36  8.198019e-01 1.302947 NA   0.629  0.9889
 2.73 - 2.98 -1.118326e+00 1.548538 NA  -0.722  0.9793
 2.73 - 3.36  8.197945e-01 1.302947 NA   0.629  0.9889
 2.98 - 3.36  1.938121e+00 1.529202 NA   1.267  0.8029

Results are given on the log odds ratio (not the response) scale. P value adjustment: tukey method for comparing a family of 6 estimates

You should be capitalization of the columns consistent. I think it would be necessary to have the actual data (for `data`) that could be used to run some tests. You would expect that `d` would not retain the column headings that you passed by value. — IRTFM, Sep 26 '18 at 20:24
You could name them the same as their source column names or you could just select the needed columns from the object with "[". — IRTFM, Sep 26 '18 at 20:43
@42- in `d` the column heading is 'id'. I have changed my code accordingly. However I still get the following errors: "Error in lsmeans.character.ref.grid(object = new("ref.grid", model.info = list( : No variable named id in the reference grid" and "Error in mcp2matrix(model, linfct = linfct) : Variable(s) ‘id’ have been specified in ‘linfct’ but cannot be found in ‘model’! " how can I resolve this? Thank you! — Lois Flounders, Sep 26 '18 at 20:47
Just use `data`. Don't transfer it to a separate dataset with different column names. Then you won't be confusing yourself. — IRTFM, Sep 26 '18 at 20:49
@42- The model won't run without using `d`. However I've edited the code so that 'd' now uses the same column names. I'm still getting the same errors, that there is no variable named 'Animal_ID' in the model / reference grid. I've edited my post to include `edit(model)`, can you see any reason I'd get these errors for the posthoc? Again, thanks so much with this - as you can tell I'm absolutely useless (but determined for a successful tukey!) — Lois Flounders, Sep 26 '18 at 21:01
I've lost track of what you have done. Please don't use comments to modify code or data. Only use [edit]. Warning: Some typical operations do not translate well to mixed models. — IRTFM, Sep 26 '18 at 21:06
Is this answer on CrossValidated useful? https://stats.stackexchange.com/questions/237512/how-to-perform-post-hoc-test-on-lmer-model — IRTFM, Sep 26 '18 at 21:08
Before the edit: you want differences by the light intensity, but your code is asking for differences by id, which is a random effect, and you can't compute lsmeans for a random effect. After the edit: that code all works; is there still a question?? — Aaron left Stack Overflow, Sep 26 '18 at 21:50
@Aaron I've now redone my code, which seems to work a lot better. However the outputs after running the Chisq are completely different from my old and new code. I'm thinking that my second lot of code is correct, and that the effect of light level on response is significant (P=0.01) (less significant than I'd previously thought based on the old code). I'd really appreciate if you could check over the new code and confirm I haven't done anything else silly. Again, thanks so much for your help. — Lois Flounders, Sep 26 '18 at 22:07
Well, the old code treated light as continuous and the new code treats it as categorical; which did you want? You gain power when you treat as continuous, assuming the effect really is increasing, so the smaller p-value isn't surprising. — Aaron left Stack Overflow, Sep 27 '18 at 21:57

score 2 · Answer 1 · answered Sep 29 '18 at 01:42

2

Your model specifies Animal_ID as a random effect. The glht and lsmeans functions work only for fixed-effect comparisons.

answered Sep 29 '18 at 01:42

Russ Lenth

5,922
2
13
21

Post hoc for binary GLMM (lme4) and plot

1 Answers1