
I am writing scientific reports in bookdown and I would like to use non-breaking spaces as thousands separators, following the SI/ISO 31-0 standard.

Actually, I would prefer the non-breaking thin space (U+202F/ ), but for simplicity let's consider U+00A0/  here.

I set up a knitr hook to do this on the fly:

knitr::knit_hooks$set(inline=function(output)
                               ifelse(is.numeric(output),
                                      prettyNum(round(output, 1),
                                                big.mark=' '),
                                      output))

This works as intended as long as I don't use any inline R-expressions returning numerical output > 999 within math expressions.

The following bookdown MWE illustrates the problem:

---
output:
  bookdown::html_document2: default
---
```{r set-output-hook, include=FALSE}
knitr::knit_hooks$set(inline=function(output)
                               ifelse(is.numeric(output),
                                      prettyNum(round(output, 1),
                                                big.mark=' '),
                                      output))
```

This works:
The product of $\pi$ and `r 1000` is `r pi*1000`.

This fails to render: 
$\pi\cdot`r 1000`=`r pi*1000`$

This renders but is cumbersome as it requires me to know *a priori* which
values might exceed 999:
$\pi\cdot1000=`r as.character(round(pi*1000, 1))`$

I tried to track it down and came up with the following rmarkdown MWE:

---
output:
  rmarkdown::html_document:
    keep_md: true
---

| Rmarkdown    | Render     | HTML                                                | Markdown     |
|--------------|------------|-----------------------------------------------------|--------------|
| `1000`       | 1000       |`1000`                                               | `1000`       |
|`$1000$`      |$1000$      |`<span class="math inline">\(1000\)</span>`          |`$1000$`      |
|              |            |                                                     |              |
|  `100,0`     | 100,0      |`100,0`                                              | `100,0`      |
|`$100,0$`     |$100,0$     |`<span class="math inline">\(100,0\)</span>`         |`$100,0$`     |
|              |            |                                                     |              |
|  `100 0`     | 100 0      |`100 0`                                              | `100 0`      |
|`$100 0$`     |$100 0$     |`<span class="math inline">\(100 0\)</span>`         |`$100 0$`     |
|              |            |                                                     |              |
|  `100&nbsp;0`| 100&nbsp;0 |`100 0`                                              | `100&nbsp;0` |
|`$100&nbsp;0$`|$100&nbsp;0$|`<span class="math inline">\(100&amp;nbsp;0\)</span>`|`$100&nbsp;0$`|

The first two columns of the table are sufficient to see the problem: each pair of rows shows the number 1000 (1 000) in text and math context, without any space, with a comma, with a simple space, and with a non-breaking space as thousands separator. The last one fails to render in math context.

To track down the problem, I inspected the resulting HTML and Markdown (keep_md: true) output and added the corresponding code as columns three and four for a better overview of what's going on.

For clarity, here is an adjusted version of the above rmarkdown MWE replacing simple spaces by _ and non-breaking spaces by - in the HTML and Markdown output columns:

---
output:
  rmarkdown::html_document:
    keep_md: true
---

| Rmarkdown    | Render     | HTML                                                | Markdown     |
|--------------|------------|-----------------------------------------------------|--------------|
| `1000`       | 1000       |`1000`                                               | `1000`       |
|`$1000$`      |$1000$      |`<span_class="math_inline">\(1000\)</span>`          |`$1000$`      |
|              |            |                                                     |              |
|  `100,0`     | 100,0      |`100,0`                                              | `100,0`      |
|`$100,0$`     |$100,0$     |`<span_class="math_inline">\(100,0\)</span>`         |`$100,0$`     |
|              |            |                                                     |              |
|  `100 0`     | 100 0      |`100_0`                                              | `100_0`      |
|`$100 0$`     |$100 0$     |`<span_class="math_inline">\(100_0\)</span>`         |`$100_0$`     |
|              |            |                                                     |              |
|  `100&nbsp;0`| 100&nbsp;0 |`100-0`                                              | `100&nbsp;0` |
|`$100&nbsp;0$`|$100&nbsp;0$|`<span_class="math_inline">\(100&amp;nbsp;0\)</span>`|`$100&nbsp;0$`|

So from what I can tell

  1. This is not a bookdown issue as it can be reproduced by plain rmarkdown.
    • I'm just mentioning bookdown as I would be happy with a bookdown-specific work-around.
  2. This is not an rmarkdown issue, as the generated Markdown looks exactly as I would expect.
    • I'm just mentioning rmarkdown as I would be happy with an rmarkdown-specific work-around.
  3. This is not a MathJax issue, as the HTML code has the plain & replaced by &amp; and I would not expect that to render properly.
    • Anyway, I would be happy with a MathJax-related work-around.
  4. I suspect it's pandoc that replaces & by &amp; in code and math context but not in text context.
    • I'm sure if there is a way to convince pandoc not to do this, it will be easy to configure this through the rmarkdown YAML header.

Any idea on how to get the &nbsp; transferred literally from Markdown to HTML in math context would probably help me to figure out the rest.


Addendum:

As pointed out by @tarleb, $100&nbsp;0$ is not valid LaTeX. However, modifying the HTML manually to contain \(100&nbsp;0\) works just fine, as MathJax treats non-breaking spaces as spaces. As I am not concerned about PDF output via LaTeX, all I need is for the Markdown-to-HTML conversion to turn $100&nbsp;0$ into \(100&nbsp;0\) rather than \(100&amp;nbsp;0\) (just as 100&nbsp;0 in text context is not converted to 100&amp;nbsp;0 either).

mschilli

1 Answer


Pandoc expects math environments to contain LaTeX math markup, not HTML. Conversion fails as pandoc tries to output $100&nbsp;000$ as LaTeX, but that gives \(100&amp;nbsp;000\) instead of what you intended.
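To see what the hook hands over to pandoc, the formatter can be called on its own. This is only a minimal sketch: it assumes the hook's `big.mark` is the entity string `&nbsp;` (as in the table rows of the question), and `fmt_entity` is a hypothetical helper name, not part of knitr.

```r
# Hypothetical helper: same formatting as the question's hook,
# assuming the thousands separator is the HTML entity "&nbsp;".
fmt_entity <- function(x) prettyNum(round(x, 1), big.mark = "&nbsp;")

# The string knitr would insert for `r 1000`.
in_text <- fmt_entity(1000)

# The same string pasted between math delimiters. Inside math, pandoc
# treats the content as LaTeX, so the "&" is escaped when writing HTML.
in_math <- paste0("$\\pi\\cdot", in_text, "$")

print(in_text)
print(in_math)
```

The formatter has no way of knowing whether its result lands in text or in math, which is why a separator that is valid in both contexts is the simpler fix.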

As a solution, you could try to use the literal narrow no-break space Unicode character (U+202F) in your hook instead of an HTML entity.
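A minimal sketch of that suggestion, assuming a literal U+202F is acceptable in every output format you target. The name `format_inline` is chosen here for illustration only, and whether MathJax spaces the narrow character as desired is not verified here.

```r
# Sketch: format numeric inline output with a literal narrow no-break
# space (U+202F) as thousands separator instead of an HTML entity.
# `format_inline` is an illustrative name, not a knitr function.
format_inline <- function(output) {
  if (is.numeric(output)) {
    prettyNum(round(output, 1), big.mark = "\u202f")
  } else {
    output  # leave non-numeric inline output untouched
  }
}

# Register the function as knitr's inline hook. The guard only matters
# when this file is run outside of a knitr session.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::knit_hooks$set(inline = format_inline)
}
```

Because the separator is an ordinary character rather than markup, the same hook output can be used in text and inside `$...$` without pandoc having to escape anything.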

Alternatively, one could use a pandoc Lua filter (or possibly an R pandoc filter) to force pandoc to pass math content through unaltered:

-- filename: force-plain-math.lua
function Math (el)
  if el.mathtype == 'DisplayMath' then
    return pandoc.RawInline('html', '\\[' .. el.text .. '\\]')
  else -- InlineMath
    return pandoc.RawInline('html', '\\(' .. el.text .. '\\)')
  end
end

Save to a file and use it by adding

output:
  bookdown::html_document2:
    pandoc_args: --lua-filter=force-plain-math.lua

to your document.

tarleb
  • Thank you for your answer. I added my thoughts regarding your first paragraph to my question. Concerning the first suggestion in the second paragraph, can you give me a code example on how to actually implement that? Would that be done through separate `knitr` hooks? Or with a single one exploiting some pandoc feature using different text depending on the target format? – mschilli Jan 13 '18 at 12:27
  • I'll refine my answer and replace it with something better later. I realized that I misunderstood some parts of your question and had forgotten how knitr hooks work. – tarleb Jan 13 '18 at 13:07
  • Thx, +1 + accepted as a literal ` ` seems to work in my case. I'll have to check if MathJax can handle the narrow version, too. The filter solution looks very promising, too, but I cannot test it as this seems to require pandoc >=2.0 which I don't have available on any of the systems at my disposal to test right now. – mschilli Jan 13 '18 at 15:11
  • Just a quick update: I installed pandoc 2.1 using [the deb file](https://github.com/jgm/pandoc/releases/download/2.1/pandoc-2.1-1-amd64.deb) and `apt-get install -f ./pandoc-2.1-1-amd64.deb` on my Ubuntu 16.04 machine. After adding a (now required) `title: test` to the YAML header of my MWE above, I could successfully use your lua filter, too. – mschilli Jan 15 '18 at 08:43