RStudio knitr markdown, German umlaut causing text to turn green in code chunks

Question

I'm having issues preparing an rmarkdown document in RStudio.
I'm importing a German data set that includes the umlaut "ü". When reading the table into RStudio I have to include the umlaut in a string.

The document is produced without any issues, except that after the ü the text takes the inverse of the color it should have. I created an MWE that reproduces the problem.
In the MWE the first chunk renders as I expect. In the second chunk, however, the string elements after the word 'lücky' are black.

Is there a way to avoid this?

---
output: pdf_document
---

## MWE
When I use a normal 'u' in lucky everything looks fine
```{r }
a <- c('dog', 'cat', 'rabbit', 'lucky', 'pig', 'sheep', 'goat')
```

When I use a German 'ü' in lucky, the green text is the inverse of as it should be
```{r }
a <- c('dog', 'cat', 'rabbit', 'lücky', 'pig', 'sheep', 'goat')
```

Update with sessionInfo() and options('encoding') :

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] compiler_3.5.1  backports_1.1.2 magrittr_1.5    rprojroot_1.3-2 htmltools_0.3.6
 [6] tools_3.5.1     yaml_2.2.0      Rcpp_0.12.18    stringi_1.2.4   rmarkdown_1.10 
[11] knitr_1.20      stringr_1.3.1   digest_0.6.16   evaluate_0.11

> options('encoding')
$`encoding`
[1] "native.enc"

I'm having a hard time reproducing this because (I believe) I don't have the same system encoding you have. I don't know precisely what is different, but can you share the output from `sessionInfo()` and `options("encoding")`? — r2evans, Nov 05 '18 at 21:27
I have tried using a few different encoding options, but it changed the umlaut to gibberish, which caused errors — Socadillo, Nov 05 '18 at 21:58
Ok, I confirmed one thing ... my copy/paste mangled the umlaut despite my intentions. So now I can reproduce it ... — r2evans, Nov 05 '18 at 22:00
Oddly enough, `ä` and `ö` work as they should. Only `ü` is causing the issue. — Socadillo, Nov 05 '18 at 23:18
Interesting question. I can reproduce this with the "knit" button in RStudio. Cannot reproduce with `rmarkdown::render()`, but `render` turns the "ü" into gibberish (file encoding: UTF-8, ISO8859-1 and Windows-1252 make no difference). — CL., Nov 06 '18 at 07:18
The following link is a little old but might add a clue, as you are on windows [Unicode with knitr and Rmarkdown](https://stackoverflow.com/questions/44153072/unicode-with-knitr-and-rmarkdown). — steveb, Nov 06 '18 at 07:24
I've tried several encoding methods. When rendering a PDF the text turns green, although it is not an issue when making HTML. The strange thing is it only occurs with `ü` — Socadillo, Nov 07 '18 at 19:21

score 0 · Answer 1 · answered Oct 02 '19 at 07:41

I used pdflatex as the LaTeX engine to reproduce this strange effect. Additionally, I enabled the option "Keep tex source used to produce PDF". The effect was reproducible, and inside the tex source I found the reason:

When I use a normal `u' in lucky everything looks fine

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{a <-}\StringTok{ }\KeywordTok{c}\NormalTok{(}\StringTok{'dog'}\NormalTok{, }\StringTok{'cat'}\NormalTok{, }\StringTok{'rabbit'}\NormalTok{, }\StringTok{'lucky'}\NormalTok{, }\StringTok{'pig'}\NormalTok{, }\StringTok{'sheep'}\NormalTok{, }\StringTok{'goat'}\NormalTok{)}
\end{Highlighting}
\end{Shaded}

When I use a German `ü' in lucky, the green text is the inverse of as it
should be

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{a <-}\StringTok{ }\KeywordTok{c}\NormalTok{(}\StringTok{'düg', '}\NormalTok{cat}\StringTok{', '}\NormalTok{rabbit}\StringTok{', '}\NormalTok{löcky}\StringTok{', '}\NormalTok{pög}\StringTok{', '}\NormalTok{sheep}\StringTok{', '}\NormalTok{goat}\StringTok{')}
\end{Highlighting}
\end{Shaded}

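The ü flips the tagging of everything that follows it: the contents of the later strings are tagged NormalTok instead of the expected StringTok, while the quote-and-comma separators between them are tagged StringTok. That's why the formatting changed.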

So from my point of view it's related to the rendering engine.
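If the literal ü in the chunk source is what confuses the highlighter, one thing that might be worth trying is to write the character as a Unicode escape, so the chunk itself stays pure ASCII. The following is only a sketch under that assumption; it is not confirmed anywhere in this thread that it restores the colors. The vector is the one from the MWE.

```r
# Assumption: keeping the chunk source ASCII-only avoids the mis-tagging.
# "\u00fc" is the Unicode escape for the German u-umlaut (U+00FC).
a <- c('dog', 'cat', 'rabbit', 'l\u00fccky', 'pig', 'sheep', 'goat')

# Check that the second character of the fourth element really is U+00FC.
umlaut_code <- utf8ToInt(substr(a[4], 2, 2))
print(umlaut_code == 0xFC)
```

To see whether this changes anything, knit again with the tex source kept and compare the `\StringTok` and `\NormalTok` macros with the listing above.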
