I am working with Twitter data that I then feed into an HTML document. The text often contains special characters, such as emojis, that aren't properly encoded for HTML. For example, the tweet:
If both #AvengersEndgame and #Joker are nominated for Best Picture, it will be Marvel vs DC for the first time in a Best Picture race. I think both films deserve the nod, but the Twitter discourse leading up to the ceremony will be 🔥 🔥 🔥
would become:
If both #AvengersEndgame and #Joker are nominated for Best Picture, it will be Marvel vs DC for the first time in a Best Picture race. I think both films deserve the nod, but the Twitter discourse leading up to the ceremony will be ðŸ”¥ ðŸ”¥ ðŸ”¥
when fed into an HTML document.
Working manually, I could use a tool like https://www.textfixer.com/html/html-character-encoding.php to encode the tweet so that it looks like:
If both #AvengersEndgame and #Joker are nominated for Best Picture, it will be Marvel vs DC for the first time in a Best Picture race. I think both films deserve the nod, but the Twitter discourse leading up to the ceremony will be &#128293; &#128293; &#128293;
which I could then feed into an HTML document and have the emojis show up. Is there a package or function in R that can take text and HTML-encode it in the same way as the web tool above?
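To illustrate the kind of transformation I am after, here is a minimal base R sketch. The function name `html_encode_non_ascii` is hypothetical (it does not come from any package), and the sketch assumes the input can be converted to UTF-8. It only replaces non-ASCII characters with decimal numeric character references and does not escape `&`, `<`, or `>`.

```r
# Hypothetical helper, base R only: replace each non-ASCII character with a
# decimal HTML numeric character reference, e.g. U+1F525 becomes &#128293;.
# ASCII characters (including &, < and >) are passed through unchanged.
html_encode_non_ascii <- function(x) {
  x <- enc2utf8(x)  # assumption: input is convertible to UTF-8
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)
    cp <- utf8ToInt(s)  # Unicode code points of the string
    pieces <- ifelse(
      cp > 127,
      sprintf("&#%d;", cp),             # non-ASCII: numeric reference
      intToUtf8(cp, multiple = TRUE)    # ASCII: keep the character as-is
    )
    paste(pieces, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

tweet <- "the ceremony will be \U0001F525 \U0001F525 \U0001F525"
html_encode_non_ascii(tweet)
# "the ceremony will be &#128293; &#128293; &#128293;"
```

Because the function is vectorised over `x`, it could be applied directly to a whole column of tweet text before the text is written into the HTML document.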