As the heading describes, I am trying to find a way to knit output from sjPlot::tab_model() and other HTML tables into a PDF document generated from LaTeX using knitr.

I am therefore looking for an alternative to tab_model() that generates a data frame or LaTeX table as output instead of HTML. Alternatively, I am looking for a function that converts HTML tables to LaTeX code, or to data frames that I can in turn convert to LaTeX tables using knitr::kable() or papaja::apa_table().

Examples:

\documentclass{apa7}

\begin{document}

Either an alternative function that provides LaTeX output:

<<echo=FALSE>>=
model <- lm(carb ~ mpg, mtcars) #Linear model on mtcars dataset
desc <- "alternative2tab_model"(model) # Creates LaTeX table of descriptives
@


Or a function that directly converts HTML to LaTeX:

<<echo=FALSE>>=
model <- lm(carb ~ mpg, mtcars) #Linear model on mtcars dataset
desc <- sjPlot::tab_model(model) # Creates HTML table of descriptives

"html2latex"(desc) # Some function that will convert HTML to LaTeX
@

Or a function that converts HTML to dataframe:

<<echo=FALSE>>=
model <- lm(carb ~ mpg, mtcars) # Linear model on mtcars dataset
desc <- sjPlot::tab_model(model) # Creates HTML table of descriptives

desc_df <- "html2dataframe"(desc) # Some function that will convert HTML to data frame

papaja::apa_table(desc_df, format = "latex") # Convert data frame into APA-style LaTeX table
@

\end{document}
Pål Bjartan

2 Answers


OK, so I found a solution of sorts, using XML::readHTMLTable(). However, it requires some wrangling, as readHTMLTable() does not accept tab_model()'s output directly. Essentially, I had to extract the list items from the tab_model() output that contain HTML code and add them to a character vector, which could then be passed to readHTMLTable(). readHTMLTable() returned a list of two identical data frames, so one had to be dropped before running apa_table().

\documentclass{apa7}
\begin{document}
<<echo=FALSE, cache=TRUE>>=
m1 <- lm(carb ~ mpg, mtcars)
m1list <- sjPlot::tab_model(m1)
# Creating a character vector containing the HTML code
# (the first three list items returned by tab_model()).
m1vec <- character(3)
for (i in 1:3) {
  m1vec[i] <- m1list[[i]]
}
# Reading into data frame.
m1df <- XML::readHTMLTable(m1vec, as.data.frame = TRUE)
# The data frame is returned twice in the list. Keeping only one.
m1df <- m1df[[1]]
# Using the first row as column names, then dropping it.
# as.character(unlist(...)) also works if the columns were read as factors.
colnames(m1df) <- as.character(unlist(m1df[1, ]))
m1df <- m1df[-1, ]
# Converting data frame into LaTeX code.
papaja::apa_table(m1df)
@
\end{document}

This solution likely works for this example only. It will take some time to figure out how to create a generic function. Feel free to suggest improvements.
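
A minimal sketch of what such a generic function might look like, under stated assumptions: the name html_table_to_df and its arguments are hypothetical (they are not part of sjPlot, XML, or papaja), the input is one or more strings of HTML containing at least one table, and the header sits in the first row as ordinary cells, as it did in the example above. It has only been reasoned through against that case, not tested on other tab_model() layouts.

```r
# Hypothetical helper, not part of any package: converts HTML code that
# contains a table into a data frame. Requires the XML package.
html_table_to_df <- function(html, which = 1, first_row_is_header = TRUE) {
  stopifnot(is.character(html), length(html) >= 1)

  # Collapse several HTML fragments into one string and parse it as text.
  doc <- XML::htmlParse(paste(html, collapse = "\n"), asText = TRUE)

  # readHTMLTable() returns a list with one data frame per <table> found.
  tables <- XML::readHTMLTable(doc, stringsAsFactors = FALSE)
  if (length(tables) < which) {
    stop("Fewer than ", which, " table(s) found in the HTML input.")
  }
  df <- tables[[which]]

  # Assumption: the header was read as the first data row.
  if (first_row_is_header) {
    names(df) <- as.character(unlist(df[1, ]))
    df <- df[-1, , drop = FALSE]
    rownames(df) <- NULL
  }
  df
}
```

In the chunk above, the loop and the manual clean-up could then be replaced by something like html_table_to_df(unlist(m1list[1:3])), with the result passed to papaja::apa_table(). Since the same table appears twice in that input, the default which = 1 keeps only the first copy.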

Pål Bjartan

I've been looking for a solution to this for a while and ended up writing a function that creates .tex and .pdf versions of sjPlot::tab_model() HTML tables: https://stackoverflow.com/a/65970391/1873521. The GitHub repo is here: https://github.com/gorkang/html2latex/

So, in this case:

model <- lm(carb ~ mpg, mtcars) #Linear model on mtcars dataset
sjPlot::tab_model(model, file = "temp.html") # Creates HTML table of descriptives

source("R/html2pdf.R")
html2pdf("temp.html", page_width = 12)

This creates a .tex and a .pdf version of the table.


Gorka