If you only have a few specific symbols to change, it is easiest to use their Unicode code points. For example, to change all occurrences of the registered trademark symbol (U+00AE) to the equivalent HTML entity (&reg;), and any degree symbols (U+00B0) to the entity &deg;, we can do:
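library(dplyr)
library(stringr)

special_char <- function(df) {
  # A named vector applies every pattern = replacement pair to each string
  mutate(df, across(everything(), ~ str_replace_all(.x,
                                      c("\u00ae" = "&reg;",
                                        "\u00b0" = "&deg;"))))
}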
So, if your data frame looks like this:
data <- data.frame(a = c("Stack Overflow®", "451°F"),
                   b = c("Coca Cola®", "22°F"))
data
#>                 a          b
#> 1 Stack Overflow® Coca Cola®
#> 2           451°F       22°F
Your function will escape all relevant instances:
data %>% special_char()
#>                     a              b
#> 1 Stack Overflow&reg; Coca Cola&reg;
#> 2           451&deg;F       22&deg;F
If you want all non-ASCII characters encoded to HTML entities, a more general solution would be to use the numeric entity format (for example, &#174; for the registered trademark symbol). This is less human-readable, but probably the go-to option if you have a lot of different symbols to escape. A useful starting point would be Mr Flick's solution here, though you would need to vectorize this function to get it working with data frame columns.
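As a rough sketch of what a vectorized numeric-entity version might look like (this is not the linked solution, just one way to do it in base R): the helper name to_numeric_entities is made up for illustration, and it assumes that every character with a code point above 127 should be escaped.

```r
# Hypothetical helper (name is an assumption): replace every non-ASCII
# character in each element of x with its numeric HTML entity.
to_numeric_entities <- function(x) {
  vapply(as.character(x), function(s) {
    if (is.na(s)) return(NA_character_)        # leave missing values alone
    cp <- utf8ToInt(enc2utf8(s))               # code point of each character
    chars <- intToUtf8(cp, multiple = TRUE)    # split back into characters
    non_ascii <- cp > 127                      # assumption: escape all non-ASCII
    chars[non_ascii] <- paste0("&#", cp[non_ascii], ";")
    paste(chars, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

data <- data.frame(a = c("Stack Overflow\u00ae", "451\u00b0F"),
                   b = c("Coca Cola\u00ae", "22\u00b0F"))

# Apply to every column while keeping the data frame structure
data[] <- lapply(data, to_numeric_entities)
print(data)
```

Because the helper takes and returns a character vector, it should also drop into the dplyr style used above, e.g. mutate(data, across(everything(), to_numeric_entities)).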