I am trying to use the iconv function in R to achieve the correct transliteration of German words (for example, Möbel → Moebel).
I have written the following code (tried with English/German locales):
iconv("Möbel", "latin1", "ASCII//TRANSLIT")
[1] "Mobel"
iconv("Möbel", "UTF-8", "ASCII//TRANSLIT")
[1] NA
iconv("Möbel", "UTF-8", "ASCII//TRANSLIT", sub ="")
[1] "Mbel"
iconv("Möbel", "Windows-1252", "ASCII//TRANSLIT")
[1] "Mobel"
However, none of these calls returns the expected "Moebel". Here is the output of some further tests:
#cat + library(ds4psy)
iconv(cat ("M", Umlaut["o"],"bel", sep = ""), "latin1", "ASCII//TRANSLIT")
Möbelcharacter(0)
#paste/paste0 + library(ds4psy)
iconv(paste("M", Umlaut["o"], "bel", sep = ""), "latin1", "ASCII//TRANSLIT")
[1] "MA?bel"
For completeness, I also tried the function stri_trans_general from stringi:
stri_trans_general("Möbel", "latin-ascii")
[1] "Mobel"
but, as you can see, this didn't work either: the umlaut is reduced to "o" instead of "oe".
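In case it helps with diagnosis, here is a minimal sketch of how the encoding of the input can be inspected. It assumes that a mismatch between the declared source encoding and the actual bytes contributes to the differing results above, which I have not confirmed. The variable name x is only for illustration, and the output of the last two calls depends on platform and locale.

```r
# Build the test word from a Unicode escape so that the encoding of the
# script file itself does not influence the result.
x <- "M\u00f6bel"

# Encoding R has marked the string with.
Encoding(x)

# Raw bytes of the string; in UTF-8 the "ö" takes two bytes.
charToRaw(x)

# Locale information for the current session.
Sys.getlocale("LC_CTYPE")
l10n_info()
```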
What I don't understand is why the iconv function is not working properly in R when it clearly works correctly in PHP:
<?php
//some German
$utf8_sentence = 'Weiß, Goldmann, Göbel, Weiss, Göthe, Goethe und Götz';
setlocale(LC_ALL, 'de_DE');
$trans_sentence = iconv('UTF-8', 'ASCII//TRANSLIT', $utf8_sentence);
//gives [Weiss, Goldmann, Goebel, Weiss, Goethe, Goethe und Goetz]
echo $trans_sentence . PHP_EOL;
?>
Why does iconv behave differently in R than in PHP? What am I doing wrong in my R code?