I need to create an interaction / marginal effects plot for a fixed effects model including clustered standard errors generated using the lfe "felm" command.
I have already created a function that achieves this. However, before I start using it, I wanted to double-check whether this function is correctly specified. Please find the function and a reproducible example below.
library(lfe)
### defining function
felm_marginal_effects <- function(regression_model, data, treatment, moderator, treatment_translation, moderator_translation, dependent_variable_translation, alpha = 0.05, se = NULL) {
library(ggplot2)
library(ggthemes)
library(gridExtra)
### defining function to get average marginal effects
getmfx <- function(betas, data, treatment, moderator) {
betas[treatment] + betas[paste0(treatment, ":", moderator)] * data[, moderator]
}
### defining function to get marginal effects for specific levels of the treatment variable
getmfx_high_low <- function(betas, data, treatment, moderator, treatment_val) {
betas[treatment] * treatment_val + betas[paste0(treatment, ":", moderator)] * data[, moderator] * treatment_val
}
### Defining function to analytically derive standard error for marginal effects
getvarmfx <- function(my_vcov, data, treatment, moderator) {
my_vcov[treatment, treatment] + data[, moderator]^2 * my_vcov[paste0(treatment, ":", moderator), paste0(treatment, ":", moderator)] + 2 * data[, moderator] * my_vcov[treatment, paste0(treatment, ":", moderator)]
}
### constraining data to relevant variables
data <- data[, c(treatment, moderator)]
### getting marginal effects
data[, "marginal_effects"] <- getmfx(coef(regression_model), data, treatment, moderator)
### getting marginal effects for high and low cases of treatment variable
data[, "marginal_effects_treatment_low"] <- getmfx_high_low(coef(regression_model), data, treatment, moderator, quantile(data[,treatment], 0.05))
data[, "marginal_effects_treatment_high"] <- getmfx_high_low(coef(regression_model), data, treatment, moderator, quantile(data[,treatment], 0.95))
### getting robust SEs
if (is.null(se)) {
data$se <- getvarmfx(regression_model$vcv, data, treatment, moderator)
} else if (se == "clustered") {
data$se <- getvarmfx(regression_model$clustervcv, data, treatment, moderator)
} else if (se == "robust") {
data$se <- getvarmfx(regression_model$robustvcv, data, treatment, moderator)
}
### Getting CI bounds
data[, "ci_lower"] <- data[, "marginal_effects"] - abs(qt(alpha/2, regression_model$df, lower.tail = TRUE)) * sqrt(data$se)
data[, "ci_upper"] <- data[, "marginal_effects"] + abs(qt(alpha/2, regression_model$df, lower.tail = TRUE)) * sqrt(data$se)
### plotting marginal effects plot
p_1 <- ggplot(data, aes_string(x = moderator)) +
geom_ribbon(aes(ymin = ci_lower, ymax = ci_upper), fill = "grey70", alpha = 0.4) +
geom_line(aes(y = marginal_effects)) +
theme_fivethirtyeight() +
theme(plot.title = element_text(size = 11.5, hjust = 0.5), axis.title = element_text(size = 10)) +
geom_rug() +
xlab(moderator_translation) +
ylab(paste("Marginal effect of",treatment_translation,"on",dependent_variable_translation)) +
ggtitle("Average marginal effects")
p_2 <- ggplot(data, aes_string(x = moderator)) +
geom_line(aes(y = marginal_effects_treatment_high, color = paste0("High ",treatment_translation))) +
geom_line(aes(y = marginal_effects_treatment_low, color = paste0("Low ",treatment_translation))) +
theme_fivethirtyeight() +
theme(plot.title = element_text(size = 11.5, hjust = 0.5), axis.title = element_text(size = 10), axis.title.y = element_blank(), legend.justification = c(0.95, 0.95), legend.position = c(1, 1), legend.direction = "vertical") +
geom_rug() +
xlab(moderator_translation) +
ylab(paste("Marginal effect of",treatment_translation,"on",dependent_variable_translation)) +
ggtitle("Marginal effects at high / low levels of treatment") +
scale_color_manual(name = NULL, values = c(rgb(229, 93, 89, maxColorValue = 255), rgb(75, 180, 184, maxColorValue = 255)), labels=c(paste0("High ",treatment_translation), paste0("Low ",treatment_translation)))
### exporting plots as combined grob
return(grid.arrange(p_1, p_2, ncol = 2))
}
### example:
# example model (just for demonstration, fixed effects and cluster variables make little sense here)
model <- felm(mpg ~ cyl + am + cyl:am | carb | 0 | cyl, data = mtcars)
# creating marginal effects plot
felm_marginal_effects(regression_model = model, data = mtcars, treatment = "cyl", moderator = "am", treatment_translation = "Number of cylinders", moderator_translation = "Transmission", dependent_variable_translation = "Miles per (US) gallon")
The example output looks like this:
Happy for any advice on how to make this a better, "well-coded", fast function so that it's more useful for others afterwards. However, I'm mostly looking to confirm whether it's "correct" in the first place.
Additionally, I wanted to check back with the community regarding some remaining questions, particularly:
Can I use the standard errors I generated for the average marginal effects for the "high" and "low" treatment cases as well or do I need to generate different standard errors for these cases? If so how?
Instead of using the analytically derived standard errors, I could also calculate bootstrapped standard errors by creating many coefficient estimates based on repeated sub-samples of the data. How would I generate bootstrapped standard errors for the high / low case?
Is there something about fixed effects models or fixed effects models with clustered standard errors that make marginal effects plots or anything else I did in the code fundamentally inadmissible?
PS.: The above function and questions are kind of an extension of How to plot marginal effect of an interaction after felm() function