1

I need to create an interaction / marginal effects plot for a fixed effects model including clustered standard errors generated using the lfe "felm" command.

I have already created a function that achieves this. However, before I start using it, I wanted to double-check whether this function is correctly specified. Please find the function and a reproducible example below.

library(lfe)

### defining function
felm_marginal_effects <- function(regression_model, data, treatment, moderator, treatment_translation, moderator_translation, dependent_variable_translation, alpha = 0.05, se = NULL) {
        library(ggplot2)
        library(ggthemes)
        library(gridExtra)

        ### defining function to get average marginal effects
        getmfx <- function(betas, data, treatment, moderator) {
                betas[treatment] + betas[paste0(treatment, ":", moderator)] * data[, moderator]
        }

        ### defining function to get marginal effects for specific levels of the treatment variable
        getmfx_high_low <- function(betas, data, treatment, moderator, treatment_val) {
                betas[treatment] * treatment_val + betas[paste0(treatment, ":", moderator)] * data[, moderator] * treatment_val
        }

        ### Defining function to analytically derive standard error for marginal effects
        getvarmfx <- function(my_vcov, data, treatment, moderator) {
                my_vcov[treatment, treatment] + data[, moderator]^2 * my_vcov[paste0(treatment, ":", moderator), paste0(treatment, ":", moderator)] + 2 * data[, moderator] * my_vcov[treatment, paste0(treatment, ":", moderator)]
        }

        ### constraining data to relevant variables
        data <- data[, c(treatment, moderator)]

        ### getting marginal effects
        data[, "marginal_effects"] <- getmfx(coef(regression_model), data, treatment, moderator)

        ### getting  marginal effects for high and low cases of treatment variable
        data[, "marginal_effects_treatment_low"] <- getmfx_high_low(coef(regression_model), data, treatment, moderator, quantile(data[,treatment], 0.05))
        data[, "marginal_effects_treatment_high"] <- getmfx_high_low(coef(regression_model), data, treatment, moderator, quantile(data[,treatment], 0.95))

        ### getting robust SEs
        if (is.null(se)) {
                data$se <- getvarmfx(regression_model$vcv, data, treatment, moderator)
        } else if (se == "clustered") {
                data$se <- getvarmfx(regression_model$clustervcv, data, treatment, moderator)
        } else if (se == "robust") {
                data$se <- getvarmfx(regression_model$robustvcv, data, treatment, moderator)
        }

        ### Getting CI bounds
        data[, "ci_lower"] <- data[, "marginal_effects"] - abs(qt(alpha/2, regression_model$df, lower.tail = TRUE)) * sqrt(data$se)
        data[, "ci_upper"] <- data[, "marginal_effects"] + abs(qt(alpha/2, regression_model$df, lower.tail = TRUE)) * sqrt(data$se)

        ### plotting marginal effects plot
        p_1 <- ggplot(data, aes_string(x = moderator)) +
                geom_ribbon(aes(ymin = ci_lower, ymax = ci_upper), fill = "grey70", alpha = 0.4) +
                geom_line(aes(y = marginal_effects)) +
                theme_fivethirtyeight() +
                theme(plot.title = element_text(size = 11.5, hjust = 0.5), axis.title = element_text(size = 10)) +
                geom_rug() + 
                xlab(moderator_translation) +
                ylab(paste("Marginal effect of",treatment_translation,"on",dependent_variable_translation)) +
                ggtitle("Average marginal effects")

        p_2 <- ggplot(data, aes_string(x = moderator)) +
                geom_line(aes(y = marginal_effects_treatment_high, color = paste0("High ",treatment_translation))) +
                geom_line(aes(y = marginal_effects_treatment_low, color = paste0("Low ",treatment_translation))) +
                theme_fivethirtyeight() + 
                theme(plot.title = element_text(size = 11.5, hjust = 0.5), axis.title = element_text(size = 10), axis.title.y = element_blank(), legend.justification = c(0.95, 0.95), legend.position = c(1, 1), legend.direction = "vertical") +
                geom_rug() + 
                xlab(moderator_translation) +
                ylab(paste("Marginal effect of",treatment_translation,"on",dependent_variable_translation)) +
                ggtitle("Marginal effects at high / low levels of treatment") +
                scale_color_manual(name = NULL, values = c(rgb(229, 93, 89, maxColorValue = 255), rgb(75, 180, 184, maxColorValue = 255)), labels=c(paste0("High ",treatment_translation), paste0("Low ",treatment_translation)))

        ### exporting plots as combined grob
        return(grid.arrange(p_1, p_2, ncol = 2))
}

### example:
# example model (just for demonstration, fixed effects and cluster variables make little sense here)
model <- felm(mpg ~ cyl + am + cyl:am | carb | 0 | cyl, data = mtcars)

# creating marginal effects plot
felm_marginal_effects(regression_model = model, data = mtcars, treatment = "cyl", moderator = "am", treatment_translation = "Number of cylinders", moderator_translation = "Transmission", dependent_variable_translation = "Miles per (US) gallon")


The example output looks like this: enter image description here

Happy for any advice on how to make this a better, "well-coded", fast function so that it's more useful for others afterwards. However, I'm mostly looking to confirm whether it's "correct" in the first place.

Additionally, I wanted to check back with the community regarding some remaining questions, particularly:

  • Can I use the standard errors I generated for the average marginal effects for the "high" and "low" treatment cases as well or do I need to generate different standard errors for these cases? If so how?

  • Instead of using the analytically derived standard errors, I could also calculate bootstrapped standard errors by creating many coefficient estimates based on repeated sub-samples of the data. How would I generate bootstrapped standard errors for the high / low case?

  • Is there something about fixed effects models or fixed effects models with clustered standard errors that make marginal effects plots or anything else I did in the code fundamentally inadmissible?

PS.: The above function and questions are kind of an extension of How to plot marginal effect of an interaction after felm() function

Phil
  • 954
  • 1
  • 8
  • 22
  • As a general rule, standard errors, whether robust clustered bootstrapped whatsoever depend on the data. There's a large proportion of statistical issues in your question that should be better asked at [Cross Validated](https://stats.stackexchange.com/help/on-topic) or ask your local statistician. After you have clarified this, and since your code appears to be working, the code part of your question might be more on-topic on [Code Review](https://codereview.stackexchange.com/help/on-topic). – jay.sf Mar 23 '20 at 05:27
  • Hi@Phil I am trying to use your function for marginal effects in `felm` function. but I got the following error - `Must use a vector in `[`, not an object of class matrix.` Do you have any idea how I can fix it? – Sharif May 23 '20 at 21:02
  • Hi @Sharif. It's a bit difficult to judge by the error alone. But it looks as if you pass something in another data type than what the function expects. The function expects "regression_model" to be output of a felm function, "data" to be a data.frame, and most other attributes to be single character type strings. Are you perhaps passing a matrix to "data"? Since the whole function is posted above, you could try to go through the function step by step and see where it breaks exactly. Note that I am still not sure whether all of the operations are statistically sound! – Phil Jun 05 '20 at 22:14
  • Hi Phil, very useful function. I was just going through it to check whether it is "correct". Can you perhaps point me to where you got the formula for the "Defining function to analytically derive standard error for marginal effects" from? – broti Dec 06 '21 at 12:23

0 Answers0