I have individual-level data and want to analyze the effect of state-level educational expenditures on individual students' performance. Performance is a binary variable (0 if the student does not pass the test, 1 if the student passes). I fit the following logit model with glm, clustering the standard errors at the state level:
```r
library(miceadds)

df_logit <- data.frame(
  performance = c(0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0),
  state = c("MA", "MA", "MB", "MC", "MB", "MD", "MA", "MC", "MB", "MD", "MB", "MC", "MA", "MA", "MA", "MA", "MD", "MA", "MB", "MA", "MA", "MD", "MC", "MA", "MA", "MC", "MB", "MB", "MD", "MB"),
  expenditure = c(123000, 123000, 654000, 785000, 654000, 468000, 123000, 785000, 654000, 468000, 654000, 785000, 123000, 123000, 123000, 123000, 468000, 123000, 654000, 123000, 123000, 468000, 785000, 123000, 123000, 785000, 654000, 654000, 468000, 654000),
  population = c(0.25, 0.25, 0.12, 0.45, 0.12, 0.31, 0.25, 0.45, 0.12, 0.31, 0.12, 0.45, 0.25, 0.25, 0.25, 0.25, 0.31, 0.25, 0.12, 0.25, 0.25, 0.31, 0.45, 0.25, 0.25, 0.45, 0.12, 0.12, 0.31, 0.1),
  left_wing = c(0.10, 0.10, 0.12, 0.18, 0.12, 0.36, 0.10, 0.18, 0.12, 0.36, 0.12, 0.18, 0.10, 0.10, 0.10, 0.10, 0.36, 0.10, 0.12, 0.10, 0.10, 0.36, 0.18, 0.10, 0.10, 0.18, 0.12, 0.12, 0.36, 0.12)
)
df_logit$performance <- as.factor(df_logit$performance)

glm_clust_1 <- miceadds::glm.cluster(data = df_logit, formula = performance ~ expenditure + population,
                                     cluster = "state", family = binomial(link = "logit"))
summary(glm_clust_1)
```
Since I cannot rule out the possibility that expenditures are endogenous, I would like to use the share of left-wing parties at the state level as an instrument for education expenditures.
However, I have not found a command in ivtools or other packages that runs a two-stage (instrumental-variable) estimation for a logistic regression with control variables and standard errors clustered at the state level.
Which commands can I use to extend my logit model with the instrument "left_wing" (also included in the example dataset) and, at the same time, report the usual diagnostics such as the Wu-Hausman test or the weak-instrument test (as ivreg does for OLS)?
Ideally, I could adapt the following command to a binary dependent variable and cluster the standard errors at the state level:
```r
iv_1 <- ivreg(performance ~ population + expenditure | left_wing + population, data = df_logit)
summary(iv_1, cluster = "state", diagnostics = TRUE)
```
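To make the target concrete, here is a minimal sketch of one direction that might come close: a control-function (two-stage residual inclusion) logit, with cluster-robust standard errors from sandwich::vcovCL(). It is not a vetted solution. It assumes the sandwich and lmtest packages are installed, and the names df_cf, stage1, stage2, v_hat and vc_cl are illustrative only. The second-stage standard errors ignore that v_hat is itself estimated, and with only four states any cluster-robust inference is unreliable.

```r
library(sandwich)  # vcovCL(): cluster-robust covariance matrices
library(lmtest)    # coeftest(): coefficient tables with a user-supplied vcov

# Rebuild the example data only if it is not already in the workspace.
if (!exists("df_logit")) {
  state <- c("MA", "MA", "MB", "MC", "MB", "MD", "MA", "MC", "MB", "MD",
             "MB", "MC", "MA", "MA", "MA", "MA", "MD", "MA", "MB", "MA",
             "MA", "MD", "MC", "MA", "MA", "MC", "MB", "MB", "MD", "MB")
  lookup <- data.frame(row.names = c("MA", "MB", "MC", "MD"),
                       expenditure = c(123000, 654000, 785000, 468000),
                       population = c(0.25, 0.12, 0.45, 0.31),
                       left_wing = c(0.10, 0.12, 0.18, 0.36))
  df_logit <- data.frame(
    performance = c(0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0,
                    0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0),
    state = state,
    lookup[state, ]
  )
  df_logit$population[30] <- 0.1  # last row as given in the example data
}

# Work on a copy with a numeric 0/1 outcome (handles factor or numeric input).
df_cf <- df_logit
df_cf$performance <- as.numeric(as.character(df_cf$performance))

# Stage 1: linear regression of the endogenous regressor on the
# instrument and the exogenous control.
stage1 <- lm(expenditure ~ left_wing + population, data = df_cf)
df_cf$v_hat <- residuals(stage1)  # first-stage residuals (control function)

# Stage 2: logit that adds the first-stage residuals as a regressor.
stage2 <- glm(performance ~ expenditure + population + v_hat,
              family = binomial(link = "logit"), data = df_cf)

# Cluster-robust covariance at the state level for the second stage.
vc_cl <- vcovCL(stage2, cluster = df_cf$state)
coeftest(stage2, vcov. = vc_cl)

# Instrument relevance: clustered test of left_wing in the first stage.
coeftest(stage1, vcov. = vcovCL(stage1, cluster = df_cf$state))
```

In this setup the test on v_hat in the second stage is often read as an endogeneity check in the spirit of Wu-Hausman, and the first-stage test on left_wing as a rough weak-instrument check, but neither reproduces the diagnostics that ivreg prints for OLS.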