I've been racking my (non-R-savvy) brains for a way to get R to produce the percentage of correct predictions for a binomial glmer model. I know this is not very informative statistically, but it is often reported, so I would like to report it as well.

DATA:

Dependent variable: Tipo, which has 2 values: 's' or 'p'. The predictors are all factors; there is not a single continuous variable. There are 2 random intercepts: the test subject and the nouns he or she responded to.

code used for the model:

model <- glmer(Tipo ~ agency + tense + 
               co2pr + pr2pr + socialclass + 
               (1|muestra) + (1|nouns), 
               data=datafile, family="binomial",
               control=glmerControl(optimizer="bobyqa"), 
               contrasts=c("sum", "poly"))

I know there is a function predict() that takes a model object and generates predictions from that model, but I can't seem to make it work for me. I would appreciate it if you could share the code.

Thanks in advance.

LyzandeR
Jeroen Claes

1 Answer

To turn predicted probabilities into predicted categories you need a threshold (there is a whole literature on this topic; search for "ROC curve" or "AUC"). Naively picking a cutoff of 0.5 is a reasonable default if you don't know, or don't want to assume, anything about the relative cost of false positives vs. false negatives (equivalently, the relative value of sensitivity vs. specificity). With that cutoff,

p <- as.numeric(predict(model, type="response")>0.5)

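computes the predicted probabilities and codes them as 1 (above the cutoff) or 0 (otherwise). For a factor response, the binomial family treats the first level as "failure", so with the default alphabetical level ordering 1 corresponds to 's' and 0 to 'p'. Then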

# recode Tipo as 0/1 (first level = 0) so it is comparable with p
mean(p == as.numeric(factor(datafile$Tipo)) - 1)

should give you the proportion correct.

table(p,datafile$Tipo)

should give you a predicted-vs-observed table.
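
The steps above can be wrapped in one small function. This is only a sketch in base R: `prop_correct` is a hypothetical helper name (it is not part of lme4), the probabilities and responses below are made up for illustration, and it assumes the fitted probability refers to the second factor level, as it does for a factor response in a binomial model. The `baseline` element (the share of the most frequent category) is an addition of this sketch, included only to put the accuracy in context.

```r
# Hypothetical helper: proportion of correct predictions from fitted
# probabilities and an observed two-level response.
prop_correct <- function(prob, observed, cutoff = 0.5) {
  observed <- factor(observed)
  stopifnot(nlevels(observed) == 2, length(prob) == length(observed))
  lev <- levels(observed)
  # Binomial models treat the first level as "failure", so a probability
  # above the cutoff is read as a prediction of the second level.
  predicted <- factor(ifelse(prob > cutoff, lev[2], lev[1]), levels = lev)
  list(
    accuracy  = mean(predicted == observed),
    # share of the most frequent observed category, for comparison
    baseline  = max(prop.table(table(observed))),
    confusion = table(predicted = predicted, observed = observed)
  )
}

# Made-up probabilities and responses, for illustration only
prob <- c(0.9, 0.2, 0.7, 0.4, 0.6, 0.1)
tipo <- factor(c("s", "p", "s", "s", "p", "p"))

res <- prop_correct(prob, tipo)
res$accuracy
res$confusion
```

With the model from the question, a call along the lines of `prop_correct(predict(model, type="response"), model.frame(model)$Tipo)` should work. Taking the response from the model frame rather than from `datafile` keeps the two vectors the same length if rows with missing values were dropped during fitting.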

Ben Bolker