
Define:

df <- data.frame(name=c("México","Michoacán"),dat=c(1,2))

so that:

> df
        name dat
1    México   1
2 Michoacán   2

When I print this table to a .tex file using xtable, the accented characters get garbled, which is no surprise.

I would like to replace the accented characters with proper LaTeX markup, e.g.:

> df
     name dat
1 M\'{e}xico   1
2 Michoac\'{a}n   2

Please note that the real dataset has many different names with different accented letters, but all with the same type of accent (the acute, i.e. forward-slanting, accent), so the only thing that needs to change in \'{.} is the letter in place of the dot.

Trying one reader's suggestion, I did the following:

> df <- data.frame(name=c("México","Michoacán"),dat=c(1,2))
> df
        name dat
1    México   1
2 Michoacán   2
> df$name <- sub("é", "\\\\'{e}", df$name)
> df
         name dat
1 M\\'{e}xico   1
2  Michoacán   2
> capture.output(
+       print(xtable(df)),
+       file = "../paper/rTables.tex", append = FALSE)

When I opened rTables.tex in Notepad, it contained:

% latex table generated in R 2.13.1 by xtable 1.5-6 package
% Fri Jul 15 13:19:17 2011
\begin{table}[ht]
\begin{center}
\begin{tabular}{rlr}
  \hline
 & name & dat \\ 
  \hline
1 & M$\backslash$'\{e\}xico & 1.00 \\ 
  2 & Michoacán & 2.00 \\ 
   \hline
\end{tabular}
\end{center}
\end{table}

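This is not what is needed: xtable has escaped the backslash and the braces, so LaTeX would print them literally instead of typesetting the accent.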

Fred

2 Answers

2

Use the stringr package, and replace each type of accented character one at a time.

library(stringr)
df$name <- str_replace_all(df$name, "é", "\\\\'{e}")  
df$name <- str_replace_all(df$name, "á", "\\\\'{a}")
df$name
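Since all the accents are acute, the per-character calls can be folded into a loop. The following is only a sketch in base R: the helper name `acute_to_latex` is made up for illustration, and it assumes the strings are (or can be converted to) UTF-8. The accented letters are written as `\u` escapes so the script does not depend on the encoding of the source file.

```r
# Hypothetical helper (not part of any package): map acute-accented
# vowels to LaTeX \'{.} markup.
acute_to_latex <- function(x) {
  accented <- c("\u00e1", "\u00e9", "\u00ed", "\u00f3", "\u00fa",
                "\u00c1", "\u00c9", "\u00cd", "\u00d3", "\u00da")
  plain    <- c("a", "e", "i", "o", "u", "A", "E", "I", "O", "U")
  x <- enc2utf8(as.character(x))  # assumption: input is convertible to UTF-8
  for (k in seq_along(accented)) {
    # fixed = TRUE treats pattern and replacement literally, so "\\'"
    # is a single backslash followed by an apostrophe.
    x <- gsub(accented[k], paste0("\\'{", plain[k], "}"), x, fixed = TRUE)
  }
  x
}

df <- data.frame(name = c("M\u00e9xico", "Michoac\u00e1n"), dat = c(1, 2),
                 stringsAsFactors = FALSE)
df$name <- acute_to_latex(df$name)
cat(df$name, sep = "\n")
# M\'{e}xico
# Michoac\'{a}n
```

Using `cat()` rather than `print()` shows the strings as they will be written to the file; `print()` doubles each backslash on screen.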
Richie Cotton
  • Thanks. I could loop over the vowels a, e, i, o, u, since only vowels have accents. – Fred Jul 14 '11 at 18:58
  • What you suggest works with the example provided. However, I am using a database provided by a third party, and the accents must be encoded differently there, because nothing gets replaced :-( Any idea how to determine the encoding and deal with it? – Fred Jul 14 '11 at 20:28
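On the encoding question, one way to investigate is to inspect the strings with base R and convert them to UTF-8 before doing any replacement. The sketch below assumes the third-party data is Latin-1; that is only a guess, and the sample string is built from raw bytes purely for illustration.

```r
# Stand-in for a value read from the database: "México" as Latin-1 bytes
# (assumption: the real data uses Latin-1; check before relying on this).
x <- rawToChar(as.raw(c(0x4d, 0xe9, 0x78, 0x69, 0x63, 0x6f)))

Encoding(x)    # "unknown": R has no declared encoding for this string
validUTF8(x)   # FALSE: the bytes are not valid UTF-8, so a UTF-8 "é" cannot match

# Convert from the assumed source encoding to UTF-8.
x_utf8 <- iconv(x, from = "latin1", to = "UTF-8")
validUTF8(x_utf8)   # TRUE

# After conversion, a replacement written with a Unicode escape matches.
res <- gsub("\u00e9", "\\'{e}", x_utf8, fixed = TRUE)
cat(res, "\n")
```

If `iconv()` returns `NA` or the result still looks wrong, the source encoding is probably something other than Latin-1, and the `from` argument needs adjusting.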

I think the problem is that xtable sanitizes text when printing: it escapes characters that are special in LaTeX, including the backslash and braces you inserted. Try overriding sanitize.text.function as follows:

print(xtable(df),sanitize.text.function=function(x){x})

which on my system outputs this:

% latex table generated in R 2.13.0 by xtable 1.5-6 package
% Fri Jul 15 10:30:00 2011
\begin{table}[ht]
\begin{center}
\begin{tabular}{rlr}
  \hline
 & name & dat \\ 
  \hline
1 & M\'{e}xico & 1.00 \\ 
  2 & Michoacán & 2.00 \\ 
   \hline
\end{tabular}
\end{center}
\end{table}

Be aware, though, that with sanitization turned off, any LaTeX special characters in your data (such as & or %) are no longer escaped, which may break the table.
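A middle ground is to pass a sanitizer of your own instead of the identity function. The sketch below is an illustration only: `sanitize_acute` is a hypothetical name, the list of escaped specials is deliberately short, and only acute-accented vowels are handled. The call to `print()` is guarded so the block still runs when xtable is not installed.

```r
# Hypothetical sanitizer: escape a few LaTeX specials, then turn
# acute-accented vowels into \'{.} markup.
sanitize_acute <- function(x) {
  x <- enc2utf8(as.character(x))  # assumption: input is convertible to UTF-8
  # Escape specials first; backslash and braces are left alone on purpose,
  # so the accent markup added below survives.
  for (ch in c("&", "%", "_", "#")) {
    x <- gsub(ch, paste0("\\", ch), x, fixed = TRUE)
  }
  accented <- c("\u00e1", "\u00e9", "\u00ed", "\u00f3", "\u00fa",
                "\u00c1", "\u00c9", "\u00cd", "\u00d3", "\u00da")
  plain    <- c("a", "e", "i", "o", "u", "A", "E", "I", "O", "U")
  for (k in seq_along(accented)) {
    x <- gsub(accented[k], paste0("\\'{", plain[k], "}"), x, fixed = TRUE)
  }
  x
}

df <- data.frame(name = c("M\u00e9xico", "Michoac\u00e1n"), dat = c(1, 2),
                 stringsAsFactors = FALSE)

if (requireNamespace("xtable", quietly = TRUE)) {
  print(xtable::xtable(df), sanitize.text.function = sanitize_acute)
}
```

With this approach the data frame itself keeps its original names, and the conversion happens only when the table is written.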

joran