I am processing SPSS data from a questionnaire that must have originated in MS Word. Word automatically changes hyphens into dashes, and these get converted into characters that don't display properly, e.g. "-" turns into "ú".
My question: What is the equivalent of utf8ToInt() for the Windows-1252 character set?
utf8ToInt("A")
[1] 65
When I do this with my own data, I get an error:
library(stringr)  # str_sub() comes from stringr
x <- str_sub(levels(sd$j1)[1], 7, 7)
print(x)
[1] "ú"
utf8ToInt(x)
Error in utf8ToInt(x) : invalid UTF-8 string
However, the contents of x are perfectly usable in grep and gsub expressions.
My locale, in case it matters:
> Sys.getlocale()
[1] "LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252"
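To make it clearer what I am after, here is a minimal sketch that does not need my data. It assumes the offending character is stored as the single byte 0xFA (which is "ú" in Windows-1252); I have not confirmed that this is the actual byte in my file. It reads the byte value directly with charToRaw(), and also converts to UTF-8 with iconv() before calling utf8ToInt(). I am not sure whether either of these is the idiomatic approach.

```r
# Assumption: the character is the single byte 0xFA ("ú" in Windows-1252).
x <- rawToChar(as.raw(0xFA))

# Option 1: read the byte value directly (fine for single-byte encodings).
code_byte <- as.integer(charToRaw(x))

# Option 2: convert from Windows-1252 to UTF-8 first, then use utf8ToInt().
y <- iconv(x, from = "WINDOWS-1252", to = "UTF-8")
code_utf8 <- utf8ToInt(y)

print(code_byte)
print(code_utf8)
```

Under that assumption both give 250, since Windows-1252 and Unicode agree on this character, but the two numbers would differ for bytes in the 0x80 to 0x9F range, where Windows-1252 departs from Latin-1.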