
I'm trying to build a model to identify the effects of question type and proficiency level on giving a correct answer. My data look like this:

> head(logistic_df, 20)
   ID proficiency    q_types answer
1   1          B2 +def,+spec      1
2   1          B2 +def,+spec      1
3   1          B2 +def,+spec      1
4   1          B2 +def,+spec      1
5   1          B2 +def,+spec      1
6   1          B2 +def,+spec      0
7   1          B2 +def,+spec      1
8   1          B2 +def,+spec      1
9   1          B2 +def,+spec      1
10  1          B2 +def,+spec      0
11  1          B2 +def,+spec      0
12  1          B2 +def,-spec      1
13  1          B2 +def,-spec      1
14  1          B2 +def,-spec      1
15  1          B2 +def,-spec      1
16  1          B2 +def,-spec      1
17  1          B2 +def,-spec      1
18  1          B2 +def,-spec      1
19  1          B2 +def,-spec      1
20  1          B2 -def,+spec      0

I want to understand whether question type and proficiency level have an effect on giving correct answers among the groups. For this I built the model

melr_int <- glmer(answer ~ (1 | ID) + proficiency * q_types,
                  family = binomial,
                  control = glmerControl(optimizer = "bobyqa"),
                  data = logistic_df)

I also built a simpler model without the interaction and did a likelihood ratio test. The interaction is not significant, so I should use the simpler model.

Code for the simpler model and the likelihood ratio test:

melr <- glmer(answer ~ (1 | ID) + proficiency + q_types,
              family = binomial,
              control = glmerControl(optimizer = "bobyqa"),
              data = logistic_df)

> lrtest(melr, melr_int)
Likelihood ratio test

Model 1: answer ~ (1 | ID) + proficiency + q_types
Model 2: answer ~ (1 | ID) + proficiency * q_types
  #Df  LogLik Df  Chisq Pr(>Chisq)
1  10 -1892.2                     
2  25 -1883.5 15 17.381     0.2966

But I want to test whether there is a difference between question types within each group. I did a post hoc test with the lsmeans function. Unfortunately, it gave all possible pairwise combinations, such as:

 contrast                                 estimate     SE  df z.ratio p.value
 (L1 -def,-spec) - (A1 -def,-spec)        1.80e+01  8.633 Inf   2.081  0.9216
 (L1 -def,-spec) - (A2 -def,-spec)        1.72e+01  8.630 Inf   1.998  0.9477
 (L1 -def,-spec) - (B1 -def,-spec)        1.63e+01  8.631 Inf   1.886  0.9720
 (L1 -def,-spec) - (B2 -def,-spec)        1.59e+01  8.638 Inf   1.837  0.9793
 (L1 -def,-spec) - (C1-C2 -def,-spec)     1.38e+01  8.663 Inf   1.588  0.9969
 (L1 -def,-spec) - (L1 -def,+spec)        1.41e+01  8.664 Inf   1.623  0.9958
 (L1 -def,-spec) - (A1 -def,+spec)        1.74e+01  8.636 Inf   2.020  0.9415
 (L1 -def,-spec) - (A2 -def,+spec)        1.72e+01  8.630 Inf   1.998  0.9476
 (L1 -def,-spec) - (B1 -def,+spec)        1.69e+01  8.632 Inf   1.955  0.9583
 (L1 -def,-spec) - (B2 -def,+spec)        1.61e+01  8.640 Inf   1.864  0.9754

I just want comparisons within the same proficiency level, like this:

(L1 -def,-spec) - (L1 -def,+spec)
(L1 -def,-spec) - (L1 +def,-spec) 
(A1 -def,-spec) - (A1 -def,+spec)  
(A1 -def,-spec) - (A1 +def,-spec)
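The closest I have found is conditioning the comparisons on `proficiency` with the `|` operator in emmeans (the successor to lsmeans) — a sketch, assuming the `melr_int` object fitted above; I'm not sure this is the right approach:

```r
library(emmeans)

# "pairwise ~ q_types | proficiency" compares q_types levels only
# within each proficiency level, instead of all cross-combinations.
emm <- emmeans(melr_int, pairwise ~ q_types | proficiency)

# The $contrasts component then holds only within-level comparisons,
# e.g. (-def,-spec) - (-def,+spec) within L1, within A1, and so on.
summary(emm$contrasts)
```
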

My question is: what should I do to get only these comparisons? Should I build a separate model for each proficiency level? Also, the model comparison tells me to use the simpler model, but if I do that, how should I report the interactions?
