I was wondering if there's a way to create a decent summary statistics table by multiple categories (groups) using stargazer(). I started with the following code but I am not sure how to advance from here.

library(stargazer)  # mtcars is a built-in dataset, not a package
mtcars2 <- within(mtcars, {
  vs <- factor(vs, labels = c("V", "S"))
  am <- factor(am, labels = c("automatic", "manual"))
  cyl  <- ordered(cyl)
  gear <- ordered(gear)
  carb <- ordered(carb)
})
summary1 = summary(mtcars2)
stargazer(summary1)

This gives me the following error:

Error in names(x) <- value : 'names' attribute [11] must be the same length as the vector [3]

Using stargazer() or other comparable packages, I want to make a summary statistics table categorized by transmission (am) and engine (vs) to be presented in the following way.

[image: desired layout of the summary table]

Vincent
jck21
  • What have you tried using `stargazer()`? Could you include the code you have tried to highlight the stargazer coding problems you have? – Peter Dec 10 '21 at 09:28
  • @Peter I tried `stargazer(summary)` and got `Error in names(x) <- value : 'names' attribute [11] must be the same length as the vector [3]`. But even if I resolve this issue, my code still does not produce the table above. I want to know what code I need before applying `stargazer()`. – jck21 Dec 10 '21 at 12:21
  • Please include this code and error in the question, as this is your coding problem. Where is `summary` defined? State how you resolved this issue and show what you get, so it is easy to see what the problem is. – Peter Dec 10 '21 at 12:29
  • @Peter, I just updated it. – jck21 Dec 10 '21 at 12:31

1 Answer

You can achieve something like this with the datasummary function from the modelsummary package (disclaimer: I am the maintainer). You will find a detailed description and many examples on the package's website.

Load the library and define a custom function that formats the mean ± SD as a string. We use backticks because the name of our function includes spaces. Note that because of the Unicode character, this may not work on Windows.

library(modelsummary)

`Mean ± SD` <- function(x) { 
    sprintf("%.0f ± %.0f", round(mean(x)), round(sd(x)))
}
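
As a quick sanity check, the helper can be called directly on a vector. The sketch below redefines it under an ASCII name, mean_sd (a hypothetical name, not part of the answer or of modelsummary), and writes ± as the escape \u00b1. Whether this sidesteps the Windows encoding issue is an assumption and has not been tested here.

```r
# Hypothetical ASCII-named variant of the helper; "\u00b1" is the escape for the ± sign
mean_sd <- function(x) {
    sprintf("%.0f \u00b1 %.0f", mean(x), sd(x))
}

# Apply it to one column of the built-in mtcars data
res <- mean_sd(mtcars$mpg)
print(res)  # "20 ± 6"
```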

Clean up variable labels:

dat <- mtcars
dat$am = ifelse(dat$am == 0, "automatic", "manual")
dat$vs = ifelse(dat$vs == 0, "v-shaped", "straight")

Finally, we use datasummary to create the table. A few things to note:

  • Rows go on the left side of the formula
  • Columns go on the right side of the formula
  • Statistics and variables joined by a + will be displayed one after the other.
  • Statistics and variables joined by a * will be "nested" inside one another.
  • Parentheses can be used to nest several variables/statistics
  • 1 is a shortcut for "all".
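
To make the formula rules above concrete, here is a minimal sketch on a reduced formula. It assumes the modelsummary package is installed and uses its built-in Mean and SD helpers; the choice of variables is only illustrative.

```r
library(modelsummary)

dat <- mtcars
dat$am <- ifelse(dat$am == 0, "automatic", "manual")

# Rows (left side): mpg and hp, listed one after the other with +
# Columns (right side): Mean and SD nested inside each level of am with *,
# grouped by parentheses
tab <- datasummary(mpg + hp ~ am * (Mean + SD),
                   data = dat,
                   output = "data.frame")
print(tab)
```

Using output = "data.frame" here just makes the result easy to inspect in the console.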

The call below builds the table. It can be exported to HTML, LaTeX, Word, and more using the output argument:

datasummary(
    mpg * (Min + Max + `Mean ± SD`) +
    hp * (Min + Max + `Mean ± SD`) +
    wt * (Min + Max + `Mean ± SD`) ~
    am * vs + 1,
    data = dat)
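
The output argument can be sketched as follows, using a shorter formula to keep the example small. The file name summary_table.tex is a hypothetical placeholder, and some formats (Word, for example) may need additional packages installed.

```r
library(modelsummary)

dat <- mtcars
dat$am <- ifelse(dat$am == 0, "automatic", "manual")

# Return the table as LaTeX code instead of the default display
datasummary(mpg + hp ~ am * (Mean + SD), data = dat, output = "latex")

# Passing a file path writes the table to disk; the format follows the extension
out_file <- file.path(tempdir(), "summary_table.tex")  # hypothetical file name
datasummary(mpg + hp ~ am * (Mean + SD), data = dat, output = out_file)
```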


An alternative approach is to reshape the data beforehand. With this, you don't need to specify each variable individually in the formula:

library(tidyverse)

dat <- mtcars %>%
    select(mpg, wt, hp, am, vs) %>%
    mutate(am = ifelse(am == 0, "automatic", "manual"),
           vs = ifelse(vs == 0, "v-shaped", "straight")) %>%
    pivot_longer(cols = c("mpg", "wt", "hp"),
                 names_to = "variables")

datasummary(
    variables * (Min + Max + `Mean ± SD`) ~ value * am * vs + Heading(`All`) * value,
    data = dat)
Vincent