
I am running a logistic regression using glm() and want to calculate standard errors using cluster.bs.glm() from clusterSEs.

The first bit of code throws an error:

mod1 <- glm(lfp ~ age + I(age^2) + genstat + married +
            isced + factor(syear) + 
            I(factor(syear):married), 
            data = subw, 
            family=binomial(link='logit'))

library(clusterSEs)
head(subw)
se <- cluster.bs.glm(mod=mod1, dat=subw, cluster= ~pid ,  boot.reps = 10)

Error in cl(dat, mod, clust)[ind.variables, 2] : subscript out of bounds

When I remove the interaction term there is no problem:

mod1 <- glm(lfp ~ age + I(age^2) + genstat + married +
            isced + factor(syear), 
            data = subw, 
            family=binomial(link='logit'))


se <- cluster.bs.glm(mod=mod1, dat=subw, cluster= ~pid ,  boot.reps = 10)

Is there a programming reason why this should not work? glm() reports all coefficients of the interaction term (some of them are NA), so I'd expect the code above to work nevertheless.

StupidWolf
yoland

1 Answer


This is hard to troubleshoot without a reproducible example. However, one potential solution is to build the interaction term as its own column before fitting, rather than inside the model formula, as Esarey does in his example on GitHub.

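library(dplyr)  # provides %>% and mutate()

# Build the interaction as a separate column of subw first.
# var_1 and var_2 are placeholders for the two (numeric) variables to interact.
subw <- subw %>% mutate(your_interaction = var_1 * var_2)

mod1 <- glm(lfp ~ age + I(age^2) + genstat + married +
            isced + factor(syear) + your_interaction, 
            data = subw, 
            family=binomial(link='logit')) 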
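
Since syear enters the model as a factor, a plain product will not work for it directly. Here is a rough sketch of how the year-by-married interaction could be built as explicit dummy columns in base R. The data frame d is simulated (it is not subw), and the married_x_* column names are made up for illustration. I have not verified that this resolves the error in cluster.bs.glm(); the last lines only check for a mismatch between coef() and the summary table, which is one plausible cause of a "subscript out of bounds" error.

```r
# Simulated stand-in for subw; every value here is invented for illustration.
set.seed(1)
n <- 400
d <- data.frame(
  pid     = rep(seq_len(n / 4), each = 4),
  syear   = rep(2001:2004, times = n / 4),
  married = rbinom(n, 1, 0.5),
  age     = sample(20:60, n, replace = TRUE)
)
d$lfp <- rbinom(n, 1, plogis(-1 + 0.03 * d$age + 0.4 * d$married))

# One dummy column per year (reference year dropped), each multiplied by married.
year_dummies <- model.matrix(~ factor(syear), data = d)[, -1, drop = FALSE]
inter <- year_dummies * d$married
colnames(inter) <- paste0("married_x_", levels(factor(d$syear))[-1])
d <- cbind(d, inter)

# Add the hand-built interaction columns to the formula as ordinary regressors.
f <- reformulate(
  c("age", "I(age^2)", "married", "factor(syear)", colnames(inter)),
  response = "lfp"
)
mod <- glm(f, data = d, family = binomial(link = "logit"))

# Coefficients glm() cannot estimate are NA in coef() but are dropped from
# the summary table, so indexing that table by all coefficient names can fail.
aliased <- names(coef(mod))[is.na(coef(mod))]
missing_in_table <- setdiff(names(coef(mod)), rownames(coef(summary(mod))))

# Then, as in the question (requires clusterSEs):
# se <- cluster.bs.glm(mod = mod, dat = d, cluster = ~pid, boot.reps = 10)
```

If aliased is non-empty for your real data, dropping those columns from the formula before calling cluster.bs.glm() may be worth a try.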
greg_s