1

I regularly use emmeans to calculate custom contrasts scross a wide range of statistical models. One of its strengths is its versatility: it is compatible with a huge range of packages. I have recently discovered that emmeans is compatible with the brms package, but am having trouble getting it to work. I will conduct an example multinomial logistic regression analysis use a dataset provided here. I will also conduct the same analysis in another package (nnet) to demonstrate what I need.

library(brms)
library(nnet)
library(emmeans)

# read in data
ml <- read.dta("https://stats.idre.ucla.edu/stat/data/hsbdemo.dta")

The data set contains variables on 200 students. The outcome variable is prog, program type, a three-level categorical variable (general, academic, vocation). The predictor variable is social economic status, ses, a three-level categorical variable. Now to conduct the analysis via the nnet package nnet

# first relevel so 'academic' is the reference level
ml$prog2 <- relevel(ml$prog, ref = "academic")

# run test in nnet
test_nnet <- multinom(prog2 ~ ses, 
                      data = ml)

Now run the same test in brms

# run test in brms (note: will take 30 - 60 seconds)
test_brm <- brm(prog2 ~ ses,
                data = ml,
                family = "categorical")

I will not print the output of the two models but the coefficients are roughly equivalent in both

Now to create an emmeans object that will allow us to conduct pariwise tests

# pass into emmeans
rg_nnet <- ref_grid(test_nnet)
em_nnet <- emmeans(rg_nnet,
                   specs = ~prog2|ses)

# regrid to get coefficients as logit
em_nnet_logit <- regrid(em_nnet,
                        transform = "logit")

em_nnet_logit

# output
# ses = low:
#   prog2      prob    SE df lower.CL upper.CL
# academic -0.388 0.297  6   -1.115   0.3395
# general  -0.661 0.308  6   -1.415   0.0918
# vocation -1.070 0.335  6   -1.889  -0.2519
# 
# ses = middle:
#   prog2      prob    SE df lower.CL upper.CL
# academic -0.148 0.206  6   -0.651   0.3558
# general  -1.322 0.252  6   -1.938  -0.7060
# vocation -0.725 0.219  6   -1.260  -0.1895
# 
# ses = high:
#   prog2      prob    SE df lower.CL upper.CL
# academic  0.965 0.294  6    0.246   1.6839
# general  -1.695 0.363  6   -2.582  -0.8072
# vocation -1.986 0.403  6   -2.972  -0.9997
# 
# Results are given on the logit (not the response) scale. 
# Confidence level used: 0.95 

So now we have our lovely emmeans() object that we can use to perform a vast array of different comparisons.

However, when I try to do the same thing with the brms object, I don't even get past the first step of converting the brms object into a reference grid before I get an error message

# do the same for brm
rg_brm <- ref_grid(test_brm)

Error : The select parameter is not predicted by a linear formula. Use the 'dpar' and 'nlpar' arguments to select the parameter for which marginal means should be computed.
Predicted distributional parameters are: 'mugeneral', 'muvocation'
Predicted non-linear parameters are: ''
Error in ref_grid(test_brm) : 
  Perhaps a 'data' or 'params' argument is needed

Obviously, and unsurprisingly, there are some steps I am not aware of to get the Bayesian software to play nicely with emmeans. Clearly there are some extra parameters I need to specify at some stage of the process but I'm not sure if these need to be specified in brms or in emmeans. I've searched around the web but am having trouble finding a simple but thorough guide.

Can anyone who knows how, help me to get the brms model into an emmeans object?

llewmills
  • 2,959
  • 3
  • 31
  • 58
  • Did you try, for example, ref_grid(test_brm, select = "mugeneral")? That's what the error message seems to be suggesting. Multinomial models require some special sauce, and are probably not fully supported unless the brms developer wants to add provisions for it. – Russ Lenth Oct 07 '21 at 13:03
  • Thanks @Russ Lenth. I'm so glad you weighed in. I got `rg_brm <- ref_grid(test_brm, dpar = c("mugeneral"))` to at least run but, after passing it through `emmeans(rg_brm, specs = ~ses)` it threw the warning message: `In model.matrix.default(terms, data, ...) : variable 'prog2' is absent, its contrast will be ignored` yielding coefs for `low`, `middle`, and `high` of -0.171, -0.786, -1.56 but I'm not sure what these represent. When i ran `emmeans(rg_brm, specs = ~prog2|ses)` I got `Error in emmeans(rg_brm, specs = ~prog2 | ses) : No variable named prog2 in the reference grid`. Any thoughts? – llewmills Oct 07 '21 at 14:02
  • Ignore the warning and leave out prog2. That warning happens when there is unnecessary contrast information. – Russ Lenth Oct 07 '21 at 14:57
  • It's just that is the contrast I specified in the `nnet` version and it gave me that lovely matrix of logit coefficients for each level of the outcome at each level of the predictor. At the moment running `rg_brm <- ref_grid(test_brm, dpar = c("mugeneral"))` and `emmeans(rg_brm, specs = ~ses)` gives me coefficients only for "mugeneral", whereas I would like to get coefficients for "academic" and "vocational" as well. – llewmills Oct 07 '21 at 20:48
  • As I said earlier, you likely can't do that with what is provided. – Russ Lenth Oct 07 '21 at 20:50
  • Ok good to know. Thanks so much for helping me out :) So do you know what the coefficients I listed two comments back represent? Log-odds of general vs academic (reference) at each level of SES maybe? – llewmills Oct 07 '21 at 21:00
  • If it's like multi or, it is log prob[i] - log prob]1] for i=1,2,...k, k being the number of multi nominal responses. – Russ Lenth Oct 07 '21 at 21:43
  • So we don't know if it's log odds or log probability? Maybe I should take this to the stan message boards. – llewmills Oct 07 '21 at 22:02
  • I imagine it is documented in brms. That should be the first place to look before posting more questions. – Russ Lenth Oct 08 '21 at 14:12

0 Answers0