How to convert Arabic numbers in character class to English numeric in R?

Question

I have a character data frame whose first column contains Arabic/Persian numbers. The class of these numbers is "character". How can I convert them to English numerics in order to do some calculations with them?

Please give a [mcve]. For example, perhaps you could include the output of `dput(head(df))` (where `df` is your dataframe). Or, better yet, just include part of a vector which contains such numbers. — John Coleman, Apr 28 '18 at 13:43

score 6 · Accepted Answer · answered Apr 28 '18 at 14:54

It seems to be mostly a question of character mappings.

Not extensively tested, but the following seems to work, at least for Persian number strings.

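# Arabic-Indic digits (U+0660-U+0669) followed by Persian digits (U+06F0-U+06F9)
persian <- "\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669\u06F0\u06F1\u06F2\u06F3\u06F4\u06F5\u06F6\u06F7\u06F8\u06F9"
# Matching ASCII digits, position by position
english <- "01234567890123456789"
# Translate each digit with chartr(), then convert the result to numeric
persian.tonumber <- function(s) as.numeric(chartr(persian, english, s))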

For example,

> persian.tonumber("٢٣٤٥")
[1] 2345

I obtained the Unicode from this answer. The translation string already covers both the Arabic-Indic digits (U+0660 to U+0669) and the Persian (Extended Arabic-Indic) digits (U+06F0 to U+06F9). You could extend the translation vectors if need be to include other symbols (I am not really familiar with the system that you are referring to).
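Since the question is about the first column of a data frame, here is a minimal sketch of applying the same function to a whole column. The data frame `df` and its column names are made up for illustration and stand in for the asker's data, which was not posted. It relies on `chartr()` being vectorized over its third argument.

```r
# Mapping and helper repeated from the answer so this block runs on its own.
persian <- "\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669\u06F0\u06F1\u06F2\u06F3\u06F4\u06F5\u06F6\u06F7\u06F8\u06F9"
english <- "01234567890123456789"
persian.tonumber <- function(s) as.numeric(chartr(persian, english, s))

# Hypothetical data frame: the first column holds digit strings as character.
df <- data.frame(
  value = c("\u0662\u0663\u0664\u0665",  # Arabic-Indic digits
            "\u06F1\u06F2\u06F3",        # Persian digits
            "\u0667"),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# chartr() works element-wise on a character vector, so the whole
# column can be converted in one call.
df[[1]] <- persian.tonumber(df[[1]])

str(df)       # the first column is now numeric
sum(df[[1]])  # ordinary arithmetic now works on it
```

If the column was read in as a factor rather than character, wrapping it in `as.character()` before the conversion should avoid translating the factor codes instead of the labels.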
