How to handle Cyrillic strings in R?

Sys.setlocale("LC_ALL","Polish")

dataset <- data.frame( ProductName = c('ąęćśżźół','тест') )

#Encoding(dataset) <- "UTF-8" #this line does not change anything

View(dataset)

The code above results in: [screenshot of the View() output]

But I would like to see тест as I typed it, instead of a sequence of <U+number> codes. Is there any way to do that?

Przemyslaw Remin
You can look at [this](https://stackoverflow.com/questions/14691555/cyrillic-encoding-output-in-r). My `Sys.setlocale()` is not `"ru_RU"`, yet your code works fine for me. – s__ Oct 24 '19 at 12:19

1 Answer

This works for me: I can see the Cyrillic тест in my data frame. I think you should check what your locale is (with `sessionInfo()`) and whether it supports UTF-8.
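A minimal sketch of that check, using only base R; the variable names `loc` and `info` are illustrative and not from the question:

```r
# Character-handling part of the current locale, e.g. a "Polish_Poland..." string on Windows.
loc <- Sys.getlocale("LC_CTYPE")
print(loc)

# l10n_info() reports, among other things, whether the session runs in a UTF-8 locale.
info <- l10n_info()
print(info[["UTF-8"]])

# The same locale string that sessionInfo() prints in its "locale:" section.
print(sessionInfo()$locale)
```

If `info[["UTF-8"]]` is `FALSE`, the session's native encoding is not UTF-8, and characters that the locale's code page cannot represent may be printed as `<U+xxxx>` escapes.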

Also check this link, and maybe try changing the encoding of your column:

Encoding(dataset$Cyrillic) <- "UTF-8"
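
As a rough illustration (not necessarily a fix for `View()`), the sketch below builds strings from `\u` escapes, so they do not depend on the script file's encoding, and then inspects how R marks them. The `dataset` and `ProductName` names come from the question; `x` and `x_utf8` are illustrative, and the first string is shortened to two Polish letters.

```r
# Build the strings from Unicode escapes: two Polish letters and the Cyrillic word from the question.
x <- c("\u0105\u0119", "\u0442\u0435\u0441\u0442")

# Show the declared encoding mark of each element.
print(Encoding(x))

# enc2utf8() converts to UTF-8 and marks the result accordingly.
x_utf8 <- enc2utf8(x)

# Code points of the Cyrillic string, independent of how the console renders it.
print(utf8ToInt(x_utf8[2]))

# Keep the column as character rather than factor.
dataset <- data.frame(ProductName = x_utf8, stringsAsFactors = FALSE)
```

If the code points are correct but the display still shows `<U+xxxx>`, the data itself is intact and the problem lies in rendering under the current locale.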
Arienrhod
  • Thank you. Changing the encoding is useless. Changing the locale to Russian helps only for Russian names, but it garbles strings containing other non-Latin letters. I have edited my question to better reflect my situation. I can always split the data into parts and use a different locale for each part, but it is a pity it cannot be done easily. SQL Server, for example, can handle whatever diacritic barbarian marks in an nvarchar variable. – Przemyslaw Remin Oct 24 '19 at 15:23