I have a dataframe I am trying to convert to RDF to edit in Protege. The dataframe unfortunately has Unicode escape codes that are not visible when the strings are printed, most notoriously \u0020, which is the code for a space.
x <- "\u0020"
x
> " "
grepl() works fine when searching for the pattern, but grep() with value = TRUE does not return the original escaped string when the result is printed.
match <- grep(pattern = "\u0020", x = x, value = TRUE)
match
> " "
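One way to see what is actually stored in these strings is to look at code points rather than printed output. The sketch below uses only base R and assumes two possible cases: a string where R has already resolved the escape while parsing the literal (`x`), and a string that holds the six literal characters backslash, u, 0, 0, 2, 0 (`y`, a hypothetical example of what the data might contain).

```r
# Case 1: the escape is resolved when R parses the string literal,
# so x holds one ordinary space character.
x <- "\u0020"
identical(x, " ")   # TRUE
utf8ToInt(x)        # 32, the code point of a space

# Case 2 (assumed): the data holds the escape as literal text.
# The doubled backslash produces a real backslash in the string.
y <- "\\u0020"
nchar(y)            # 6
utf8ToInt(y)        # code points of each of the six characters

# Every character in y is printable ASCII, so a "non-printable"
# class does not match it. perl = TRUE makes the range compare
# by code point rather than by locale collation.
grepl("[^ -~]", y, perl = TRUE)   # FALSE
```

If the data looks like case 2, that would explain why the strings print with visible codes and why [^ -~] finds nothing to replace.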
The problem is that these codes are throwing Protege off and I'm trying to normalize them to basic characters, such as \u0020 to " ", but I cannot find any regex that will catch these and replace them with the single non-code character. The regex pattern [^ -~] does not catch these values and I'm completely blind to these strings otherwise. How can I normalize any of these codes in R?
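A minimal base R sketch of one possible normalization, assuming the codes are stored as literal text of the form backslash, u, and four hex digits. The helper name `unescape_unicode` is hypothetical and not part of any package; it only handles that four-digit form.

```r
# Hypothetical helper: replace each literal \uXXXX sequence
# with the character it stands for.
unescape_unicode <- function(s) {
  # Regex for a literal backslash, 'u', and exactly four hex digits.
  m <- gregexpr("\\\\u[0-9A-Fa-f]{4}", s)
  regmatches(s, m) <- lapply(regmatches(s, m), function(codes) {
    vapply(
      codes,
      # Drop the leading backslash and 'u', parse the hex digits,
      # and convert the code point to a character.
      function(code) intToUtf8(strtoi(substring(code, 3), 16L)),
      character(1),
      USE.NAMES = FALSE
    )
  })
  s
}

unescape_unicode("a\\u0020b")            # "a b"
unescape_unicode(c("x\\u0041", "plain")) # "xA" "plain"
```

Strings without any escape pass through unchanged, so the function could be applied to every character column of the dataframe. If adding a dependency is acceptable, `stringi::stri_unescape_unicode()` covers the same case and more escape forms.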