14

I am trying to use DT::datatable to output a nicely formatted, interactive table in R.

...only problem is that I want a heroku job to knit the document for me, and I've learned that RStudio and rmarkdown::render() use pandoc under the hood -- but pandoc doesn't ship in the stripped down R Buildpack for heroku.

Is there any way to get the old markdown engine (knitr:knit2html or markdown:markdownToHTML) to pass the javascript that powers datatable through? Or to be more precise, to generate the sample table below without using pandoc?

Here is a minimal example:

testing.Rmd

---
title: "testing"
output: html_document
---

this is a datatable table
```{r test2, echo=FALSE}
library(DT)
DT::datatable(
  iris, 
  rownames = FALSE,
  options = list(pageLength = 12, dom = 'tip')
)
```

this is regular R output
```{r}
head(iris)

```

knit_test.R

require(knitr)
knitr::knit2html('testing.Rmd')

generates:

this is a datatable table <!–html_preserve–>

<!–/html_preserve–>
this is regular R output

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

desired behavior: have my datatable come through (not <!–html_preserve–>)

what I've tried I looked at htmltools and the htmlPreserve stuff but couldn't figure out how to apply that here. did some crazy stuff with saveWidget that was not successful and does not bear repeating.

Thanks!

Andrew
  • 9,090
  • 8
  • 46
  • 59
  • 1
    There's also [Docverter](http://www.docverter.com/), kind of pandoc as a a service... – mb21 Jul 27 '15 at 07:06

2 Answers2

8

Here's a solution that uses the packages knitr, markdown, base64enc and htmltools. It's modelled on what happens internally in rmarkdown::render, but has no dependencies on pandoc. It generates a self-contained HTML file by default, or optionally copies all of the dependencies into a folder. With the latter, it assumes that all the CSS and JS files it depends on are uniquely named (i.e. it won't import both if two htmlwidgets both decide to call their css file style.css).

library("knitr")
library("htmltools")
library("base64enc")
library("markdown")
render_with_widgets <- function(input_file,
                                output_file = sub("\\.Rmd$", ".html", input_file, ignore.case = TRUE),
                                self_contained = TRUE,
                                deps_path = file.path(dirname(output_file), "deps")) {

  # Read input and convert to Markdown
  input <- readLines(input_file)
  md <- knit(text = input)
  # Get dependencies from knitr
  deps <- knit_meta()

  # Convert script dependencies into data URIs, and stylesheet
  # dependencies into inline stylesheets

  dep_scripts <-
    lapply(deps, function(x) {
      lapply(x$script, function(script) file.path(x$src$file, script))})
  dep_stylesheets <- 
    lapply(deps, function(x) {
      lapply(x$stylesheet, function(stylesheet) file.path(x$src$file, stylesheet))})
  dep_scripts <- unique(unlist(dep_scripts))
  dep_stylesheets <- unique(unlist(dep_stylesheets))
  if (self_contained) {
    dep_html <- c(
      sapply(dep_scripts, function(script) {
        sprintf('<script type="text/javascript" src="%s"></script>',
                dataURI(file = script))
      }),
      sapply(dep_stylesheets, function(sheet) {
        sprintf('<style>%s</style>',
                paste(readLines(sheet), collapse = "\n"))
      })
    )
  } else {
    if (!dir.exists(deps_path)) {
      dir.create(deps_path)
    }
    for (fil in c(dep_scripts, dep_stylesheets)) {
      file.copy(fil, file.path(deps_path, basename(fil)))
    }
    dep_html <- c(
        sprintf('<script type="text/javascript" src="%s"></script>',
                file.path(deps_path, basename(dep_scripts))),
        sprintf('<link href="%s" type="text/css" rel="stylesheet">',
                file.path(deps_path, basename(dep_stylesheets)))
    )
  }

  # Extract the <!--html_preserve--> bits
  preserved <- extractPreserveChunks(md)

  # Render the HTML, and then restore the preserved chunks
  html <- markdownToHTML(text = preserved$value, header = dep_html)
  html <- restorePreserveChunks(html, preserved$chunks)

  # Write the output
  writeLines(html, output_file)
}

This can be called like this:

render_with_widgets("testing.Rmd")

This should work for any htmlwidgets, even in combination. Example:

TestWidgets.Rmd

---
title: "TestWidgets"
author: "Nick Kennedy"
date: "5 August 2015"
output: html_document
---

First test a dygraph
```{r}
library(dygraphs)
dygraph(nhtemp, main = "New Haven Temperatures") %>% 
  dyRangeSelector(dateWindow = c("1920-01-01", "1960-01-01"))
```

Now a datatable
```{r}
library(DT)
datatable(iris, options = list(pageLength = 5))
```

```{r}
library(d3heatmap)
d3heatmap(mtcars, scale="column", colors="Blues")
```

And then from R

render_with_widgets("TestWidgets.Rmd")
Nick Kennedy
  • 12,510
  • 2
  • 30
  • 52
4

A little bit from a category some crazy stuff with saveWidget but if you can use XML package (you'll need cedar-14 for that) something like below should do the trick:

#' http://stackoverflow.com/q/31645528/1560062
#'
#' @param dt datatables object as returned from DT::datatable
#' @param rmd_path character path to the rmd template
#' @param libdir path to the directory with datatable static files
#' @param output_path where to write output file
#'
process <- function(dt, rmd_path, libdir, output_path) {

    widget_path <- tempfile()
    template_path <- tempfile()

    # Save widget and process Rmd template
    DT::saveWidget(dt, widget_path, selfcontained=FALSE)
    knitr::knit2html(input=rmd_path, output=template_path)

    # Parse html files
    widget <- XML::htmlParse(widget_path)
    template <- XML::htmlParse(paste0(template_path, ".html"))

    # Extract elements from the body of widget file
    widget_container <- XML::getNodeSet(
        widget, "/html/body/div[@id = 'htmlwidget_container']")
    body_scripts <- XML::getNodeSet(widget, "/html/body/script")

    # Make sure we point to the correct static dir
    # Using lapply purely for side effect is kind of
    # wrong but it is cheaper than a for loop if we use ::
    correct_libdir <- function(nodeset, attr_name) {
        lapply(nodeset, function(el) {
            src <- XML::xmlAttrs(el)[[attr_name]]
            XML::xmlAttrs(el)[[attr_name]] <- file.path(
                libdir, sub("^.*?/", "", src))
        })
        nodeset
    }

    # Extract script and link tags, correct paths
    head_scripts <- correct_libdir(
        XML::getNodeSet(widget, "/html/head/script"), "src")

    head_links <- correct_libdir(
        XML::getNodeSet(widget, "/html/head/link"), "href")

    # Get template root    
    root <- XML::xmlRoot(template)

    # Append above in the right place
    root[[2]] <- XML::addChildren(root[[2]], widget_container)
    root[[2]] <- XML::addChildren(root[[2]], body_scripts)
    root[[1]] <- XML::addChildren(root[[1]], head_scripts)
    root[[1]] <- XML::addChildren(root[[1]], head_links)

    # Write output
    XML::saveXML(template, output_path)
}
zero323
  • 322,348
  • 103
  • 959
  • 935
  • This didn't work for me. The scripts don't get loaded because at least on Firefox on Windows, absolute filenames don't work as the 'src' attribute. The JSON for the widgets also gets mangled. – Nick Kennedy Aug 03 '15 at 22:47
  • It doesn't have to be absolute path but in a server environment it is most likely what you want. Dependencies are constant and there is no reason to keep and serve separate one for each generated document. I am not sure what you mean by mangled JSON. – zero323 Aug 05 '15 at 10:56
  • `markdown::markdownToHTML` replaces some of the characters with HTML entities and so the JSON is no longer valid. If you look at my answer, I've used `htmltools:extractPreserveChunks` to pull the existing HTML out first and then restore it later; this is the method used by `rmarkdown`. I agree in principle with not inlining the scripts and stylesheets to save space, although this is what is done by default in `rmarkdown`, and it makes for a more portable end result. Looking at it again, your code should work with relative paths, so the use of absolute paths was my issue. – Nick Kennedy Aug 05 '15 at 11:03
  • `DT::saveWidget` produces a valid JSON and there is no place where it is touched by `markdown` after that. – zero323 Aug 05 '15 at 11:14
  • Sorry, you're correct. I was using the .Rmd as provided and had assumed without properly working through your code that the `datatable` within the Rmd was being replaced with the one generated by `saveWidget`. Looking again, I can see that it is appended to the end, so in my html output I have one version which has the mangled JSON and one with preserved JSON. I've now looked again using relative paths, and still have the issue that the names of the required js and css files don't match the ones that are in the DT/htmlwidgets/lib folder. Where are you getting the static files? – Nick Kennedy Aug 05 '15 at 11:24
  • Full list of dependencies is generated when you call `saveWidget` (datatables, htmlwidget, jquery) and can be stored once in a cached directory on a server. I admit it is rather unjustified assumption about the requirements. It is easy to adjust it though. Your approach looks interesting. XML transformations, although extremely powerful, are rather tedious. – zero323 Aug 05 '15 at 11:41
  • 2
    Thanks. I found it interesting pulling apart `rmarkdown::render`, though the use of `output_format`s which have pre and post processing functions made it a little harder to follow. I was aiming for something as generalisable as possible. I think your technique (now that I understand how the parameters need to be set) looks good too. It's always more interesting on SO when there are two or more answers that tackle a problem with different approaches! – Nick Kennedy Aug 05 '15 at 11:48