I am trying to fit some data to a negative binomial model and run a pairwise comparison using emmeans
. The data has two different sample sizes, 15 and 20 (num_sample
in the example below).
I have set up two data frames: good.data
which produces the expected result of offset()
using random sample sizes between 15 and 20, and bad.data
using a sample size of either 15 or 20, which seems to produce a factor of either 15 or 20. The bad.data
pairwise comparison produces way too many comparisons compared to the good.data
, even though they should produce the same number?
set.seed(1)
library(dplyr)
library(emmeans)
library(MASS)
# make data that works
data.frame(site=c(rep("A",24),
rep("B",24),
rep("C",24),
rep("D",24),
rep("E",24)),
trt_time=rep(rep(c(10,20,30),8),5),
pre_trt=rep(rep(c(rep("N",3),rep("Y",3)),4),5),
storage_time=rep(c(rep(0,6),rep(30,6),rep(60,6),rep(90,6)),5),
num_sample=sample(c(15,17,20),24*5,T),# more than 2 sample sizes...
bad=sample(c(1:7),24*5,T,c(0.6,0.1,0.1,0.05,0.05,0.05,0.05)))->good.data
# make data that doesn't work
data.frame(site=c(rep("A",24),
rep("B",24),
rep("C",24),
rep("D",24),
rep("E",24)),
trt_time=rep(rep(c(10,20,30),8),5),
pre_trt=rep(rep(c(rep("N",3),rep("Y",3)),4),5),
storage_time=rep(c(rep(0,6),rep(30,6),rep(60,6),rep(90,6)),5),
num_sample=sample(c(15,20),24*5,T),# only 2 sample sizes...
bad=sample(c(1:7),24*5,T,c(0.6,0.1,0.1,0.05,0.05,0.05,0.05)))->bad.data
# fit models
good.data%>%
mutate(trt_time=factor(trt_time),
pre_trt=factor(pre_trt),
storage_time=factor(storage_time))%>%
MASS::glm.nb(bad~trt_time:pre_trt:storage_time+offset(log(num_sample)),
data=.)->mod.good
bad.data%>%
mutate(trt_time=factor(trt_time),
pre_trt=factor(pre_trt),
storage_time=factor(storage_time))%>%
MASS::glm.nb(bad~trt_time:pre_trt:storage_time+offset(log(num_sample)),
data=.)->mod.bad
# pairwise comparison
emmeans::emmeans(mod.good,pairwise~trt_time:pre_trt:storage_time+offset(log(num_sample)))$contrasts%>%as.data.frame()
emmeans::emmeans(mod.bad,pairwise~trt_time:pre_trt:storage_time+offset(log(num_sample)))$contrasts%>%as.data.frame()