Printing emojis from Python 3 "src" escape strings
I'm working on a project that analyzes Facebook message histories. In the raw .htm data file I downloaded, many emojis are displayed as boxes with question marks, as happens when a character can't be rendered. If I copy and paste these symbols into a terminal as strings, I get values such as \U000fe328. This is also the output I get when I run the .htm files through BeautifulSoup and print the data.
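For reference, here is a small sketch of how such a value can be inspected with the standard `unicodedata` module. The helper name `describe` is made up for illustration. It reports the code point, its general category, and its official name if it has one, which shows that \U000fe328 falls in a private-use range rather than being a standard emoji code point.

```python
import unicodedata

def describe(ch):
    """Return (code point, general category, official name or None) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch, None))

raw = "\U000fe328"    # value seen in the Facebook export
emoji = "\U0001F624"  # a standard emoji code point

# 'Co' is the "private use" category: the character has no standard meaning,
# so a font has no way of knowing which glyph to draw for it.
print(describe(raw))
print(describe(emoji))
```

A category of `Co` for the first value would explain the boxes with question marks: the meaning of a private-use code point is defined only by whoever produced the data.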
I Googled this string (and others), and one of the only sites that consistently comes up is iemoji.com; for the string above, this page lists it as a "Python Src". I want to be able to print these strings as their corresponding emojis (after all, they were originally emojis when they were sent). After looking around, I found a mapping of src encodings at this page that maps strings like the one above to emoji names. I then found this list of emoji names to Unicode, which for the most part seems to map those names to Unicode code points. If I try printing those values, I get good output, for example:
>>> print(u'\U0001F624')
Is there a way to map these "Python src" encodings to their Unicode values? Chaining both mappings would work, if not for the fact that the src mapping is missing around 50% of the values found in the Unicode list. And if I do end up having to do that, is there a good way to find the Python src value of a given emoji? From my testing, an emoji as a string equals its Unicode escape, such as '😤' == u'\U0001F624', but I can't seem to find any relation to \U000fe328.
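To make the chaining idea concrete, here is a minimal sketch of what I have in mind. Everything in it is a placeholder: the table names `SRC_TO_NAME` and `NAME_TO_UNICODE`, the functions `to_standard` and `src_for`, and the single entry pairing \U000fe328 with \U0001F624 under the name "triumph" are assumed purely for illustration and are not verified against either list. Real tables would have to be built from the two mapping pages.

```python
# Placeholder tables; the single entry is an unverified illustration.
SRC_TO_NAME = {"\U000fe328": "triumph"}
NAME_TO_UNICODE = {"triumph": "\U0001F624"}

def to_standard(text, src_to_name=SRC_TO_NAME, name_to_unicode=NAME_TO_UNICODE):
    """Replace each mapped "src" character with its standard emoji.

    Characters missing from either table are left unchanged.
    """
    out = []
    for ch in text:
        name = src_to_name.get(ch)
        out.append(name_to_unicode.get(name, ch) if name is not None else ch)
    return "".join(out)

def src_for(emoji, src_to_name=SRC_TO_NAME, name_to_unicode=NAME_TO_UNICODE):
    """Reverse lookup: return the "src" character for a standard emoji, or None."""
    for src, name in src_to_name.items():
        if name_to_unicode.get(name) == emoji:
            return src
    return None

print(to_standard("hi \U000fe328"))
print(src_for("\U0001F624") == "\U000fe328")
```

The weakness is the one described above: any src value missing from the first table passes through unchanged and still renders as a box.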