
I have a multilevel negative binomial model fit with brms (library(brms)):

fit1 <- brm(
  TOTAL_VIOLATIONS ~ LN_POP + Source_binary + Source_purchased +
    (1 | TYPE_consolidated) + (1 | COUNTY),
  data = Data,
  family = negbinomial()
)
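
For comparison, the reduced model I have in mind keeps only the COUNTY intercept (fit2 is just a placeholder name for that second fit, not something I've shown output for):

fit2 <- brm(
  TOTAL_VIOLATIONS ~ LN_POP + Source_binary + Source_purchased +
    (1 | COUNTY),
  data = Data,
  family = negbinomial()
)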

This is what the data looks like:

> dput(droplevels(Data[1:20, c(3, 9, 20, 21, 22, 23)]))
structure(list(COUNTY = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L), .Label = c("ALAMEDA", 
"ALPINE", "AMADOR"), class = "factor"), TYPE_consolidated = structure(c(9L, 
6L, 3L, 2L, 5L, 7L, 1L, 1L, 4L, 12L, 1L, 1L, 1L, 1L, 8L, 10L, 
6L, 5L, 11L, 2L), .Label = c("City", "County Water District", 
"CSA", "CSD", "IOU", "MHP", "MUD", "MWC", "PA", "Private", "PUD", 
"Special Act District"), class = "factor"), TOTAL_VIOLATIONS = c(0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 
8L, 0L, 0L), LN_POP = c(3.91202300542815, 6.29710931993394, 6.21260609575152, 
12.7367008965923, 10.9852927228879, 14.1374128813017, 11.9290007521904, 
11.1991321074213, 11.2374881189349, 12.332000202128, 10.2255710517052, 
6.10924758276437, 6.62007320653036, 6.21460809842219, 3.91202300542815, 
3.2188758248682, 4.24849524204936, 7.88231491898027, 8.96839619119826, 
4.91265488573605), Source_binary = structure(c(1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L
), .Label = c("GW", "SW"), class = "factor"), Source_purchased = structure(c(1L, 
1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L), .Label = c("No", "Purchased"), class = "factor")), row.names = c(NA, 
20L), class = "data.frame")

I want to explore the value of including the first random intercept (TYPE_consolidated) versus the same model with only the second random intercept (COUNTY; fit2 above), but I can't for the life of me figure out how to get icc() to report this information by group using by_group. The output is exactly the same with or without this argument (see below). I suspect this is because it is a brms object, since according to the help the output is different for these models, but I have yet to find another way to get this. Does anyone know a way to get the variance ratio for the individual random effects (or, alternatively, can you see what I'm doing wrong with by_group)? If not, is there a standard way to compare ICC between nested models? If so, perhaps I could just calculate it for two different versions of my model and compare those instead (see the sketch after the output below).

> performance::icc(fit1, by_group = TRUE)
# Random Effect Variances and ICC

Conditioned on: all random effects

## Variance Ratio (comparable to ICC)
Ratio: 0.94  CI 95%: [0.80 0.99]

## Variances of Posterior Predicted Distribution
Conditioned on fixed effects:   7.39  CI 95%: [ 2.50  20.03]
Conditioned on rand. effects: 117.57  CI 95%: [59.15 331.86]

## Difference in Variances
Difference: 109.10  CI 95%: [50.23 320.86]

> performance::icc(fit1)
# Random Effect Variances and ICC

Conditioned on: all random effects

## Variance Ratio (comparable to ICC)
Ratio: 0.94  CI 95%: [0.79 0.99]

## Variances of Posterior Predicted Distribution
Conditioned on fixed effects:   7.42  CI 95%: [ 2.48  20.19]
Conditioned on rand. effects: 117.90  CI 95%: [59.53 349.90]

## Difference in Variances
Difference: 109.71  CI 95%: [51.20 340.30]
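
If calculating the variance ratio separately for nested versions of the model is a sensible route, this is roughly what I mean (untested sketch; fit2 is the COUNTY-only model shown above):

performance::icc(fit1)  # both random intercepts
performance::icc(fit2)  # COUNTY intercept only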
  • Hi KDobb, welcome to Stack Overflow. Not many of us are experts in binary regression models. However, a lot of us are very good with basic R issues and can inspect source code. We will be much more likely to be able to apply our general knowledge to your problem if we can execute your code. Therefore, I recommend 1) providing at least a sample of your data with `dput(Data)`; provide the output by [**editing your question**](https://stackoverflow.com/posts/62052764/edit). 2) It is always best to provide the library you're using for non-base packages, in this case `library(brms)`? – Ian Campbell May 27 '20 at 21:12
  • Hi KDobb, as Ian said, we need a bit more information so that we can run the code ourselves and try some things out, mainly some version of the data. I use `brms` for my own research, but I have never compared random effects structures using Intraclass Correlation Coefficients. Is there a reason you don't wish to compare the models using `waic()` or `loo()` to see if the additional random intercept improves out-of-sample prediction? – sjp May 27 '20 at 21:53
  • Hi Ian and sjp - obviously I'm new to Stack Overflow. I actually posted a long time ago and got similar feedback, but in a much less friendly manner and with zero resources as to how I could improve, so I ended up deleting it and not coming back. I appreciate your patience and pointers, and I did my best to improve my question. sjp, I have compared the models with both, and maybe I should just leave it at that, but from my reading ICC seemed like the way to test a random effect specifically - though maybe that is me trying to translate frequentist into Bayesian. – KDobb May 27 '20 at 22:15
  • Alright, so I've fit your model with the data you provided and I'm not getting the same, or even remotely similar, output when I run `icc()`. I have, however, read this in the function's documentation: "It is then possible to compare variances across models, also by specifying different group-level terms via the re.form-argument." Unfortunately, like many R packages, they don't give any additional information or example about what that is supposed to look like, so I'm unsure how that formula should look. This gives you something to look into anyway. Best of luck! – sjp May 28 '20 at 01:00

0 Answers