I want to create a regression table with modelsummary (amazing package!!!) for multinomial logistic models run with nnet::multinom that includes clustered standard errors, as well as corresponding "significance" stars and summary statistics.

Unfortunately, I cannot do this automatically with the vcov parameter within modelsummary because the sandwich package that modelsummary uses does not support nnet objects.

I was able to calculate robust standard errors with a customized function originally developed by Daina Chiba and modified by Davenport, Soule, Armstrong (available from: https://journals.sagepub.com/doi/suppl/10.1177/0003122410395370/suppl_file/Davenport_online_supplement.pdf).

I was also able to include these standard errors in the modelsummary table instead of the original ones. Yet, neither the "significance" stars nor the model summary statistics adapt to these new standard errors. I think this is because they are calculated via broom::tidy automatically by modelsummary.

I would be thankful for any advice on how to include stars and summary statistics that correspond to the clustered standard errors and their respective p-values.

Another, smaller question: is there an easy way of "spreading" the model statistics (e.g. number of observations or R2) so that they are centered below all response levels of the dependent variable, not just the first level? I am thinking of a multicolumn solution in LaTeX.
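To illustrate what I mean, a hand-written LaTeX sketch (layout and numbers illustrative only, not actual modelsummary output) might look like:

```latex
% sketch: center Num.Obs. under both response-level columns
\begin{tabular}{lcc}
         & Cyl: 6  & Cyl: 8  \\
\hline
mpg      & -38.699 & -46.849 \\
Num.Obs. & \multicolumn{2}{c}{32} \\
\end{tabular}
```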

Here is some example code, including how I calculate the standard errors. (Note that the calculated clustered SEs are extremely small because they do not make sense with the example mtcars data. The only take-away is that the respective stars should correspond to the new SEs, and they don't.)

# load data
dat_multinom <- mtcars
dat_multinom$cyl <- sprintf("Cyl: %s", dat_multinom$cyl)

# run multinomial logit model
mod <- nnet::multinom(cyl ~ mpg + wt + hp, data = dat_multinom, trace = FALSE)

# function to calculate clustered standard errors
mlogit.clust <- function(model, data, variable) {
  beta <- c(t(coef(model)))
  vcov <- vcov(model)
  k <- length(beta)
  n <- nrow(data)
  max_lev <- length(model$lev)
  xmat <- model.matrix(model)
  # score contributions: response residuals times the model matrix
  # (the first response level is the baseline and is dropped)
  u <- lapply(2:max_lev, function(x)
    residuals(model, type = "response")[, x] * xmat)
  u <- do.call(cbind, u)
  # number of clusters
  m <- dim(table(data[, variable]))
  u.clust <- matrix(NA, nrow = m, ncol = k)
  fc <- factor(data[, variable])
  # sum the score contributions within each cluster
  for (i in 1:k) {
    u.clust[, i] <- tapply(u[, i], fc, sum)
  }
  # sandwich: bread %*% meat %*% bread, with a small-sample correction
  cl.vcov <- vcov %*% ((m / (m - 1)) * t(u.clust) %*% u.clust) %*% vcov
  return(cl.vcov)
}
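For reference, the matrix algebra this function implements is the standard clustered sandwich estimator. Writing V-hat for vcov(model) (the bread), u_g for the 1 x k row of score contributions summed within cluster g, and m for the number of clusters:

```latex
\widehat{V}_{\mathrm{cl}}
  = \widehat{V}\,
    \underbrace{\left[\frac{m}{m-1}\sum_{g=1}^{m} u_g^{\top} u_g\right]}_{\text{meat}}
    \,\widehat{V}
```

This corresponds term by term to the line cl.vcov <- vcov %*% ((m / (m - 1)) * t(u.clust) %*% u.clust) %*% vcov.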

# get coefficients, variance, clustered standard errors, and p values
b <- c(t(coef(mod)))
var <- mlogit.clust(mod, dat_multinom, "am")
se <- sqrt(diag(var))
p <- (1 - pnorm(abs(b / se))) * 2
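As a sanity check on the p-value arithmetic (not part of the R workflow): the line above is a plain two-sided normal-approximation test, p = 2(1 - Phi(|b/se|)). A stdlib-only Python sketch of the same computation, with made-up inputs:

```python
import math

def norm_cdf(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sided_p(estimate, se):
    # mirrors the R line: p <- (1 - pnorm(abs(b / se))) * 2
    z = abs(estimate / se)
    return 2.0 * (1.0 - norm_cdf(z))

# made-up coefficient and clustered standard error, for illustration only
print(round(two_sided_p(1.96, 1.0), 3))  # ≈ 0.05
```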

# modelsummary table with clustered standard errors and respective p-values
modelsummary(
  mod,
  statistic = "({round(se,3)}),[{round(p,3)}]",
  shape = statistic ~ response,
  stars = c('*' = .1, '**' = .05, '***' = .01)
)

# modelsummary table with original standard errors and respective p-values
modelsummary(
  models = list(mod),
  statistic = "({std.error}),[{p.value}]",
  shape = statistic ~ response,
  stars = c('*' = .1, '**' = .05, '***' = .01)
)

This code produces the following tables:

|             | Model 1 / Cyl: 6 | Model 1 / Cyl: 8 |
|-------------|------------------|------------------|
| (Intercept) | 22.759*          | -6.096***        |
|             | (0.286),[0]      | (0.007),[0]      |
| mpg         | -38.699          | -46.849          |
|             | (5.169),[0]      | (6.101),[0]      |
| wt          | 23.196           | 39.327           |
|             | (3.18),[0]       | (4.434),[0]      |
| hp          | 6.722            | 7.493            |
|             | (0.967),[0]      | (1.039),[0]      |
| Num.Obs.    | 32               |                  |
| R2          | 1.000            |                  |
| R2 Adj.     | 0.971            |                  |
| AIC         | 16.0             |                  |
| BIC         | 27.7             |                  |
| RMSE        | 0.00             |                  |

Note: * p < 0.1, ** p < 0.05, *** p < 0.01

|             | Model 1 / Cyl: 6  | Model 1 / Cyl: 8  |
|-------------|-------------------|-------------------|
| (Intercept) | 22.759*           | -6.096***         |
|             | (11.652),[0.063]  | (0.371),[0.000]   |
| mpg         | -38.699           | -46.849           |
|             | (279.421),[0.891] | (448.578),[0.918] |
| wt          | 23.196            | 39.327            |
|             | (210.902),[0.913] | (521.865),[0.941] |
| hp          | 6.722             | 7.493             |
|             | (55.739),[0.905]  | (72.367),[0.918]  |
| Num.Obs.    | 32                |                   |
| R2          | 1.000             |                   |
| R2 Adj.     | 0.971             |                   |
| AIC         | 16.0              |                   |
| BIC         | 27.7              |                   |
| RMSE        | 0.00              |                   |

Note: * p < 0.1, ** p < 0.05, *** p < 0.01

1 Answer


This is not super easy at the moment; I just opened a GitHub issue to track progress. It should be easy to improve, however, so I expect changes to be published in the next release of the package.

In the meantime, you can install the dev version of modelsummary:

library(remotes)
install_github("vincentarelbundock/modelsummary")

Then, you can use the tidy_custom mechanism described here to override the standard errors and p-values manually:

library(modelsummary)

# modelsummary detects and applies this method automatically for multinom models
tidy_custom.multinom <- function(x, ...) {
    b <- coef(x)
    var <- mlogit.clust(x, dat_multinom, "am")
    out <- data.frame(
        term = rep(colnames(b), times = nrow(b)),
        response = rep(row.names(b), each = ncol(b)),
        estimate = c(t(b)),
        std.error = sqrt(diag(var))
    )
    out$p.value <- (1 - pnorm(abs(out$estimate / out$std.error))) * 2
    row.names(out) <- NULL
    return(out)
}


modelsummary(
    mod,
    output = "markdown",
    shape = term ~ model + response,
    stars = TRUE)

|             | Model 1 / Cyl: 6 | Model 1 / Cyl: 8 |
|-------------|------------------|------------------|
| (Intercept) | 22.759***        | -6.096***        |
|             | (0.286)          | (0.007)          |
| mpg         | -38.699***       | -46.849***       |
|             | (5.169)          | (6.101)          |
| wt          | 23.196***        | 39.327***        |
|             | (3.180)          | (4.434)          |
| hp          | 6.722***         | 7.493***         |
|             | (0.967)          | (1.039)          |
| Num.Obs.    | 32               |                  |
| R2          | 1.000            |                  |
| R2 Adj.     | 0.971            |                  |
| AIC         | 16.0             |                  |
| BIC         | 27.7             |                  |
| RMSE        | 0.00             |                  |

Vincent