
I want to automate the generation of descriptive tables with headings for groups of variables, using knitr and (ideally) stargazer. Since I need weighted descriptives, I do not use stargazer's built-in summary functions; instead I generate a data frame containing the statistics and print it with the summary = FALSE argument.
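For illustration, a data frame of weighted statistics could be built roughly as follows. This is only a sketch: the variables x1 and x2, the weights w, and the helper weighted_stats are made up for the example, and the weighted standard deviation uses a simple population-style formula.

```r
# Hypothetical example data: two variables and a vector of weights.
d <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(10, 20, 30, 40))
w <- c(1, 1, 2, 4)

# Weighted mean and weighted standard deviation, plus min and max, for one variable.
weighted_stats <- function(x, w) {
  m <- weighted.mean(x, w)
  v <- sum(w * (x - m)^2) / sum(w)  # population-style weighted variance (an assumption)
  c(mean = m, sd = sqrt(v), min = min(x), max = max(x))
}

# One row per variable, one column per statistic.
stats <- as.data.frame(t(sapply(d, weighted_stats, w = w)))
stats
```

The resulting data frame has the variables as rows and the statistics as columns, which is the layout referred to in Issue 1.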

Issue 1: A data frame with the variables and headings as rows and the summary statistics as columns does not work, because stargazer turns the NAs on the heading rows into $$, which breaks the knitting process.
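The layout meant in Issue 1 looks roughly like the sketch below. The values and the names df_rows and is_heading are made up; the point is only that heading rows carry NA in every statistic column.

```r
# Variables and headings as rows, statistics as columns.
# The heading row ("Group 1") has NA in all statistic columns.
df_rows <- data.frame(
  name = c("Group 1", "var1", "var2"),
  mean = c(NA, 1.5, 2.5),
  sd   = c(NA, 0.5, 0.7),
  stringsAsFactors = FALSE
)

# Heading rows are exactly the rows without any statistics.
is_heading <- rowSums(!is.na(df_rows[-1])) == 0
df_rows[is_heading, "name"]
```

Passing such a data frame to stargazer with summary = FALSE is what triggers the problem described above.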

Issue 2: As a workaround, I generated a data frame with variables and headings as columns and the summary statistics as rows, and used the flip = TRUE argument to swap rows and columns in the stargazer output. This lets me use empty character vectors for the headings and numeric vectors for the variables, but stargazer does not typeset the numeric vectors in math mode; it appears to treat them as character.

Example:

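library(stargazer)

# create example df: an empty "heading" column plus two numeric variables
df <- data.frame(heading=c(" "," "," "),var1=c(1,2,3),var2=c(4,5,6))
df$heading <- as.character(df$heading)

# output using stargazer, with rows and columns flipped
stargazer(df, summary = FALSE, flip = TRUE)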

% Table created by stargazer v.5.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu
% Date and time: Fri, Aug 12, 2016 - 10:39:01
\begin{table}[!htbp] \centering 
  \caption{} 
  \label{} 
\begin{tabular}{@{\extracolsep{5pt}} cccc} 
\\[-1.8ex]\hline 
\hline \\[-1.8ex] 
 & 1 & 2 & 3 \\ 
\hline \\[-1.8ex] 
heading &   &   &   \\ 
var1 & 1 & 2 & 3 \\ 
var2 & 4 & 5 & 6 \\ 
\hline \\[-1.8ex] 
\end{tabular} 
\end{table} 

Question: How do I add headings (empty rows) in the descriptive table and still get math mode output for the variable statistics?

fmerhout
  • Stargazer simply generates atrocious LaTeX and is uncustomisable: in particular, there’s no way of telling Stargazer to do what you want, you’d need to modify the resulting LaTeX code. Don’t use this package — use another table generator, for instance Pander. – Konrad Rudolph Aug 12 '16 at 14:58

1 Answer


As mentioned by Konrad Rudolph, stargazer probably cannot do this. The following solution uses xtable instead:

\documentclass{article}
\usepackage{array}

\begin{document}

<<results = "asis", echo = FALSE>>=
library(xtable)

group1 <- data.frame(
  name = c("v1", "v2"),
  mean = 1:2, min = 3:4, max = 5:6,
  stringsAsFactors = FALSE)
group2 <- data.frame(
  name = c("v3", "v4"),
  mean = -(1:2), min = -(3:4), max = -(5:6),
  stringsAsFactors = FALSE)

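# Stack the groups, inserting one heading row before each group.
# The statistic cells of a heading row are NA.
dat <- rbind(
  c("\\textbf{Group 1}", rep(NA, ncol(group1) - 1)),
  group1,
  c("\\textbf{Group 2}", rep(NA, ncol(group1) - 1)),
  group2)

# Wrap the column names in \multicolumn so they are set as text, not math.
colnames(dat) <- sprintf("\\multicolumn{1}{c}{%s}", colnames(dat))

# align: l (hidden row names), l (variable names), then three math-mode columns.
# The sanitizers are switched off so the LaTeX markup is passed through as is.
print.xtable(
  xtable(dat,
         caption = "Summary of Groups 1 and 2.",
         align = c("l", "l", rep(">{$}r<{$}", 3))),
  include.rownames = FALSE,
  sanitize.text.function = identity,
  sanitize.colnames.function = identity)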
@
\end{document}

The concept is quite simple, but there are a few quirks to take into account:

  • First, I generate sample data, assuming 2 groups with 2 variables each and 3 descriptives per variable.
  • When rbinding the groups, simply insert rows for the headings, setting empty columns to NA. Don't forget to double backslashes when using LaTeX in strings. (Use \multicolumn if the headings are too wide.)
  • As columns 2 to 4 will be set in math mode, we have to make sure that the column names are printed as normal text. A \multicolumn spanning a single column lets you change the column type for just that one cell.
  • Use the align argument of xtable to specify the column types. We need one normal left-justified column and three right-justified columns in math mode. To force math mode, use >{$}r<{$}, which requires the array package. (The additional leading l is for the row names, which we hide, so it is ignored.)
  • As we have LaTeX markup in the data, we need to turn off xtable's sanitizer. Therefore, set sanitize.text.function and sanitize.colnames.function to identity.
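With more than two groups, the heading rows could be inserted by a small helper instead of by hand. The following is a minimal sketch in base R, not part of the original solution: add_headings is a hypothetical name, and it assumes all groups share the same columns with the variable names in the first (character) column. Unlike the rbind call above, it keeps the statistic columns numeric and leaves NA on the heading rows.

```r
# Hypothetical helper: stack groups and put a bold heading row before each.
# `groups` is a named list of data frames with identical columns; the first
# column is assumed to be a character column holding the variable names.
add_headings <- function(groups) {
  blocks <- lapply(names(groups), function(label) {
    g <- groups[[label]]
    heading <- g[NA_integer_, , drop = FALSE]       # one all-NA row, same column types
    rownames(heading) <- NULL
    heading[[1]] <- sprintf("\\textbf{%s}", label)  # LaTeX markup, backslash doubled
    rbind(heading, g)
  })
  out <- do.call(rbind, blocks)
  rownames(out) <- NULL
  out
}

group1 <- data.frame(name = c("v1", "v2"), mean = 1:2, min = 3:4, max = 5:6,
                     stringsAsFactors = FALSE)
group2 <- data.frame(name = c("v3", "v4"), mean = -(1:2), min = -(3:4), max = -(5:6),
                     stringsAsFactors = FALSE)

dat <- add_headings(list("Group 1" = group1, "Group 2" = group2))
dat
```

The result can then be handed to xtable and print.xtable in the same way as dat in the document above, with the column names wrapped in \multicolumn first.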

Result: [image of the rendered table]

CL.
  • I love how the example values cannot possibly be real ^^ – AlexR Aug 12 '16 at 21:55
  • @AlexR Yeah … I made up the numbers first, and then I decided to put (arbitrary) labels on them. Not so clever. ;-) – CL. Aug 12 '16 at 22:19
  • Well, it made me smile at least :-) – AlexR Aug 12 '16 at 22:19
  • @AlexR Very good – now I have at least 1 reason for not correcting it (besides being lazy). On the other hand, if the data follows a [paranormal distribution](http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2465539/figure/fig1/), my numbers could actually make sense. – CL. Aug 12 '16 at 22:22