
I am working with RMarkdown and trying to use the kable function. I have a data frame with three variables: gender (factor), age_group (factor), and test_score (scale). I want to create two-way tables with the factor variables (gender and age_group) as table rows and columns, and summary statistics of test_score as cell contents. These summary statistics are the mean, standard deviation, and percentiles (median, 1st decile, 9th decile, and 99th percentile). Is there an easy way of building those tables in a nicely formatted way (as kable does), without needing to enter all those values into a matrix first? I searched the kable help file but could not find how to do it.

# What my data looks like:

gender <- rep(rep(c("M", "F"), each = 3), times = 3)
age <- as.factor(rep(seq(10, 12, 1), each = 6))
score <- c(4, 6, 8, 4, 8, 9, 6, 6, 9, 7, 10, 13, 8, 9, 13, 12, 14, 16)
testdata <- data.frame(gender, age, score)


| gender | age | score |
|--------|-----|-------|
| M      | 10  | 4     |
| M      | 10  | 6     |
| M      | 10  | 8     |
| F      | 10  | 4     |
| F      | 10  | 8     |
| F      | 10  | 9     |
| M      | 11  | 6     |
| M      | 11  | 6     |
| M      | 11  | 9     |
| F      | 11  | 7     |
| F      | 11  | 10    |
| F      | 11  | 13    |
| M      | 12  | 8     |
| M      | 12  | 9     |
| M      | 12  | 13    |
| F      | 12  | 12    |
| F      | 12  | 14    |
| F      | 12  | 16    |

I would like a table that looks like below (but calculated directly from my dataset and with a beautiful publishing format):

      Mean score by gender & age
|        | 10yo | 11yo | 12yo | Total |
|--------|:----:|:----:|:----:|:-----:|
| Male   |   6  |   7  |  10  |  7.7  |
| Female |   7  |  10  |  14  |  10.3 |
| Total  |  6.5 |  8.5 |  12  |   9   |

I tried kable, which does produce nicely formatted tables, but I can only build frequency tables with it. I cannot find any argument for requesting summaries of a variable. If anyone can suggest a better package for building a table like the one above, I would appreciate it a lot.

library(knitr)
library(kableExtra)
kable(data, "latex", booktabs = TRUE) %>%  # `data` is the table to render
   kable_styling(latex_options = "striped")
LuizZ
  • Do you want one table for each summary statistic? Or all of those stats in every cell? That second option seems like it would be a cluttered table. – Thomas Rosa May 16 '20 at 19:27
  • Hello LuizZ, welcome to Stack Overflow. Please read [how to create a minimal reproducible example](https://stackoverflow.com/help) and update your question. You'll get better help, faster, if you provide a reproducible example that includes a subset of your data, the code you've already tried, and an explanation of the desired results. – Len Greski May 16 '20 at 20:42
  • Hello, Thomas and Len, thank you for your feedback. I will try to update my question with a reproducible example, including a fictitious subset of my data. – LuizZ May 17 '20 at 15:09

1 Answer


Absent a reproducible example, multi-way tables including a variety of statistics can be created with the tables::tabular() function.

Here is an example from the tables documentation (page 38). It combines multiple variables in one table and prints counts, means, and standard deviations.

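library(tables)

set.seed(1206)

# Simulated data: two grouping variables (p, a) and two numeric variables (b, c)
q <- data.frame(p = rep(c("A", "B"), each = 10, length.out = 30),
                a = rep(c(1, 2, 3), each = 10),
                id = seq(30),
                b = round(runif(30, 10, 20)),
                c = round(runif(30, 40, 70)))

# Rows: p crossed with a, plus an overall row; columns: N, then mean and sd of b and c
tab <- tabular((Factor(p) * Factor(a) + 1) ~ (N = 1) + (b + c) * (mean + sd),
               data = q)

# Keep only the rows with a nonzero count (drops empty p/a combinations)
tab[tab[, 1] > 0, ]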

A Stack Overflow-friendly version of the output is:

          b           c          
 p a   N  mean  sd    mean  sd   
 A 1   10 14.40 3.026 55.70 6.447
   3   10 14.50 2.877 52.80 8.954
 B 2   10 14.40 2.836 56.30 7.889
   All 30 14.43 2.812 54.93 7.714

The table can be rendered to HTML with the html() function. Viewed in a browser, the output of the following code looks like the illustration below.

html(tab[ tab[,1] > 0, ])

[Image: the table above rendered as HTML]

The tables package can also calculate other statistics, including quantiles. For details on quantile calculations, see pp. 29-30 of the tables package manual.
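For comparison, the same kind of percentile summary can be sketched in base R on the question's `testdata`. This is only an illustration under stated assumptions, not the mechanism described in the tables manual: it uses `tapply()` and `quantile()` (with R's default quantile type), and the helper name `decile9` is made up for this example.

```r
# Rebuild the example data from the question.
testdata <- data.frame(
  gender = rep(rep(c("M", "F"), each = 3), times = 3),
  age    = factor(rep(10:12, each = 6)),
  score  = c(4, 6, 8, 4, 8, 9, 6, 6, 9, 7, 10, 13, 8, 9, 13, 12, 14, 16)
)

# Hypothetical helper: 9th decile (90th percentile) of a numeric vector.
decile9 <- function(x) unname(quantile(x, probs = 0.9))

# One cell per gender x age combination, each holding the chosen statistic.
median_tab <- with(testdata, tapply(score, list(gender, age), median))
decile_tab <- with(testdata, tapply(score, list(gender, age), decile9))

print(median_tab)
print(decile_tab)
```

Any function that returns a single number can be swapped in for `median` or `decile9`, so the 1st decile or 99th percentile only needs a different `probs` value.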

The package also works with knitr, kable, and kableExtra.
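As a rough sketch of how a summary table could reach kable, the base R code below builds the "mean score by gender and age" matrix from the question, including the Total margins, and passes it to `knitr::kable()`. The helper `two_way` is hypothetical (it is not part of tables, knitr, or kableExtra), and it assumes the columns are named `gender`, `age`, and `score` as in the question.

```r
# Rebuild the example data from the question.
testdata <- data.frame(
  gender = rep(rep(c("M", "F"), each = 3), times = 3),
  age    = factor(rep(10:12, each = 6)),
  score  = c(4, 6, 8, 4, 8, 9, 6, 6, 9, 7, 10, 13, 8, 9, 13, 12, 14, 16)
)

# Hypothetical helper: two-way table of `stat` with "Total" margins.
# Margins are computed from the raw scores, not by averaging the cells.
two_way <- function(data, stat) {
  cells <- tapply(data$score, list(data$gender, data$age), stat)
  rows  <- as.vector(tapply(data$score, data$gender, stat))
  cols  <- as.vector(tapply(data$score, data$age, stat))
  out   <- cbind(cells, Total = rows)
  rbind(out, Total = c(cols, stat(data$score)))
}

mean_tab <- two_way(testdata, mean)
print(round(mean_tab, 1))

# Hand the matrix to kable when knitr is installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(mean_tab, digits = 1,
                     caption = "Mean score by gender & age"))
}
```

Calling `two_way(testdata, sd)` gives the matching table of standard deviations. The resulting `kable()` object can then be piped into `kable_styling()` in the same way as the frequency table in the question.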

Len Greski
  • Thanks a lot for your answer, Len! It is exactly this type of table that I am looking for. I already voted your answer as useful, but because of my "low reputation" here, my vote does not appear! – LuizZ May 17 '20 at 15:13