I want to knit the following table, which mixes numeric and factor variables and is built with `datasummary`, to PDF from R Markdown:
---
title: "R Notebook"
output:
  html_document:
    df_print: paged
  html_notebook: default
  pdf_document: default
---
Table 1 example:
```{r, warning=FALSE, message=FALSE, echo=FALSE}
library(tidyverse)
library(modelsummary)
library(kableExtra)
tmp <- mtcars[, c("mpg", "hp")]
tmp$class <- 0
tmp$class[15:32] <- 1
tmp$class <- as.factor(tmp$class)
tmp$region <- "A"
tmp$region[15:20] <- "B"
tmp$region[21:32] <- "C"
tmp$region <- as.factor(tmp$region)
## change position of variables
tmp <- tmp[,c("mpg","class","region","hp")]
# create a list with individual variables
# remove missing and rescale
tmp_scaled <- tmp
tmp_scaled$mpg <- scale(tmp_scaled$mpg)
tmp_scaled$hp <- scale(tmp_scaled$hp)
tmp_scaled_list <- lapply(tmp_scaled, na.omit)
tmp_scaled_list[2] <- list(NULL)
tmp_scaled_list[3] <- list(NULL)
N_alt <- function(x) paste0(N(x), ' (', round((as.numeric(N(x))/32)*100,digits=1), ')')
# create a table with `datasummary`
emptycol = function(x) " "
datasummary(
  mpg + class + region + hp ~
    Heading("N (%)") * N_alt + Mean + SD +
    Heading("Boxplot") * emptycol + Heading("Histogram") * emptycol,
  data = tmp
) %>%
  column_spec(column = 6, image = spec_boxplot(tmp_scaled_list[c(1,4)])) %>%
  column_spec(column = 7, image = spec_hist(tmp_scaled_list[c(1,4)]))
```
This is the current output I see when I knit to HTML:
I am facing three issues right now:

1. If I try to knit to PDF, I get the following error message:

       ! Package siunitx Error: Invalid token 'N' in numerical input.
       Error: LaTeX failed to compile test_table1.tex. See https://yihui.org/tinytex/r/#debugging for debugging tips. See test_table1.log for more info.

   Any ideas about what might be causing this?

2. The boxplots and histograms are incorrect. They are repeated because there are only 2 numeric variables. How can I make sure the correct boxplot and histogram are displayed for each numeric variable, and nothing is displayed for factor variables?

3. Do you know how I could move factor variables under numeric variables and create a 'Category' header that lists the levels of the factor variables, like this:

   |        | Category | N (%) | Mean | SD | Boxplot | Histogram |
   |--------|----------|-------|------|----|---------|-----------|
   | mpg    |          |       |      |    |         |           |
   | class  | 0        |       |      |    |         |           |
   |        | 1        |       |      |    |         |           |
   | region | A        |       |      |    |         |           |
   |        | B        |       |      |    |         |           |
   |        | C        |       |      |    |         |           |
   | hp     |          |       |      |    |         |           |

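For issue 2, here is a minimal base-R sketch of the structure I think I need: a list with one entry per table row, holding scaled values for numeric variables and `NULL` for factor rows. The row layout in `row_vars` and the name `plot_data` are my own assumptions for illustration, and I have not confirmed that `spec_boxplot()` and `spec_hist()` leave a cell empty when given a `NULL` entry.

```r
# Example data, built to match the table above.
tmp <- mtcars[, c("mpg", "hp")]
tmp$class <- factor(rep(c(0, 1), times = c(14, 18)))
tmp$region <- factor(rep(c("A", "B", "C"), times = c(14, 6, 12)))

# Assumed row layout: one label per table row, factors repeated once per level.
row_vars <- c("mpg", "class", "class", "region", "region", "region", "hp")

# One list entry per row: scaled values for numeric variables, NULL otherwise.
plot_data <- lapply(row_vars, function(v) {
  x <- tmp[[v]]
  if (is.numeric(x)) as.numeric(scale(na.omit(x))) else NULL
})
names(plot_data) <- make.unique(row_vars)

# Which rows would receive a plot.
print(!vapply(plot_data, is.null, logical(1)))
```

The idea would be to pass `plot_data` to `spec_boxplot()` and `spec_hist()` instead of `tmp_scaled_list[c(1,4)]`, provided those functions really do skip `NULL` entries.
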
Thanks very much!
-- Edit:
Regarding issue 3, I am only missing one piece. My current code is:

    library(modelsummary)
    library(kableExtra)
    tmp <- mtcars[, c("mpg", "hp")]
    tmp$class <- 0
    tmp$class[15:32] <- 1
    tmp$class <- as.factor(tmp$class)
    tmp$region <- 1
    tmp$region[15:20] <- 2
    tmp$region[21:32] <- 3
    tmp$region <- as.factor(tmp$region)
    # overwrite both factor columns with a constant numeric 0
    tmp$class <- 0
    tmp$region <- 0
    ## change position of variables
    tmp <- tmp[,c("mpg","class","region","hp")]
    # create a list with individual variables
    # remove missing and rescale
    tmp_scaled <- tmp
    tmp_scaled$mpg <- scale(tmp_scaled$mpg)
    tmp_scaled$hp <- scale(tmp_scaled$hp)
    tmp_scaled_list <- lapply(tmp_scaled, na.omit)
    tmp_scaled_list[2] <- list(NULL)
    tmp_scaled_list[3] <- list(NULL)
    N_alt = function(x) {
      if (x %in% c(tmp$class)) {
        paste0('[14 (43.8); 18 (56.3)]')
      } else if (x %in% c(tmp$region)) {
        paste0('[14 (43.8); 6 (18.8); 12 (37.5)]')
      } else {
        paste0('[32 (100)]')
      }
    }
    Mean_alt = function(x) {
      if (x %in% c(tmp$class, tmp$region)) {
        paste0("")
      } else {
        mean(x)
      }
    }
    # create a table with `datasummary`
    emptycol = function(x) " "
    datasummary(
      mpg + (`class [0,1]`= class) + (`region [A,B,C]`= region) + hp ~
        Heading("N (%)") * N_alt + Heading("Mean") * Mean_alt +
        Heading("Boxplot") * emptycol + Heading("Histogram") * emptycol,
      data = tmp
    ) %>%
      column_spec(column = 4, image = spec_boxplot(tmp_scaled_list)) %>%
      column_spec(column = 5, image = spec_hist(tmp_scaled_list))

My `N_alt` function does not work properly. Does anyone know what I am missing here?
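
A minimal base-R sketch of what I suspect is part of the problem, independent of `datasummary`: `x %in% ...` returns one logical value per element of `x`, whereas `if ()` expects a single `TRUE` or `FALSE`. The vectors `x` and `lookup` below are made up for illustration.

```r
# Made-up input: a column of values and the set to check against.
x <- c(0, 0, 1)
lookup <- c(0, 1)

# %in% is vectorised: this has one logical per element of x.
flags <- x %in% lookup
print(length(flags))

# Collapse to a single logical before using it as an if() condition.
if (all(flags)) {
  label <- "every value found"
} else {
  label <- "some values missing"
}
print(label)
```

If that is the cause, the condition would need to be collapsed with `all()` or `any()`. Separately, because `tmp$class` and `tmp$region` are both set to 0 in my code, a value-based check probably cannot tell the two columns apart.
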