
As I have found from other questions here on Stack Overflow, there is a bug in Tkinter when using Unicode for emojis.

I have implemented the function that fixes the displaying of Unicode so I can display emojis just fine using Tkinter. However, Tkinter still throws an exception when I try to get text from an entry (text box) widget that contains an emoji saying that it is unable to decode the utf-8 string.

I suspect I might be able to get around this by using tk.call to access the underlying Tcl interpreter directly, since the other question makes me think the bug is in Tkinter and not in Tcl. I do not know any Tcl and have failed to find any documentation on how to use tk.call. Am I going down the right path, or is there a better solution?

Here is the stack trace for the crash:


Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Program Files (x86)\Python37-32\lib\tkinter\__init__.py", line 1705, in __call__
    return self.func(*args)
  File "C:/Users/phili/PycharmProjects/pychat\gui.py", line 61, in __send
    self.add_message('You: ' + replace_emoji(self.__msg_entry.get()))
  File "C:\Program Files (x86)\Python37-32\lib\tkinter\__init__.py", line 2682, in get
    return self.tk.call(self._w, 'get')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 0: invalid continuation byte

1 Answer


This is a known bug of very long standing (which has been tricky to fix because it breaks a bunch of assumptions in subtle ways), and is caused by the underlying libraries (Tcl and Tk) only supporting the Basic Multilingual Plane of Unicode (which emoji are not on). We hope to have a workaround in place once Tk 8.7 is released.
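To make the Basic Multilingual Plane limit concrete, here is a minimal Python sketch. The helper names `non_bmp_chars` and `to_surrogate_pair` are made up for illustration and are not part of Tkinter or Tcl. It shows that an emoji such as U+1F600 has a code point above U+FFFF, so a library that only handles 16-bit code units has to represent it as a pair of UTF-16 surrogates.

```python
BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def non_bmp_chars(text):
    """Return the characters in text whose code points lie above the BMP."""
    return [ch for ch in text if ord(ch) > BMP_MAX]


def to_surrogate_pair(ch):
    """Split a non-BMP character into its UTF-16 high and low surrogates."""
    offset = ord(ch) - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


message = "You: hello \U0001F600"  # sample text containing one emoji
for ch in non_bmp_chars(message):
    high, low = to_surrogate_pair(ch)
    print(f"U+{ord(ch):X} -> surrogates {high:#x}, {low:#x}")
```

Plain accented text such as "é" stays inside the BMP and is not affected; only characters for which `non_bmp_chars` returns something hit this limitation.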

There may also be issues in Tkinter itself that exacerbate this.
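The `0xed` byte in the question's traceback is consistent with an emoji arriving as two separately UTF-8-encoded surrogates, which Python's strict `utf-8` codec rejects. The sketch below reproduces that error on hand-built bytes and shows how such bytes could be turned back into a real character. It is only an illustration under that assumption: `decode_tcl_bytes` is a hypothetical helper, and `Entry.get()` does not expose the raw bytes, so this cannot simply be dropped into Tkinter as a fix.

```python
def decode_tcl_bytes(raw):
    """Decode bytes that may hold surrogates encoded as 3-byte UTF-8 sequences.

    Hypothetical helper: assumes the bytes look like what the traceback
    suggests, i.e. each UTF-16 surrogate encoded on its own.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Let the lone surrogates through, then pair them up via UTF-16.
        halves = raw.decode("utf-8", "surrogatepass")
        return halves.encode("utf-16", "surrogatepass").decode("utf-16")


# U+1F600 as the surrogates U+D83D and U+DE00, each encoded separately.
raw = b"\xed\xa0\xbd\xed\xb8\x80"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print(exc)  # strict decoding fails on the leading 0xed byte

print(decode_tcl_bytes(raw) == "\U0001F600")
```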


I've looked into this more, and the workaround does appear to be in place for 8.7. I'm not 100% sure whether it is in Tk yet (the branch timeline suggests perhaps not), but that's not an API change, so it shouldn't be hard. There are some caveats, but I believe it will be sorted completely for 8.7.

There's some progress being made on this for 8.6 too. However, it uses a non-standard build mode for Tcl (defining TCL_UTF_MAX=4), so it is unlikely to be generally usable for Tkinter in 8.6, whereas in 8.7 it will be practical for Tkinter to build on top of it (and the bugs will become their problem).

Donal Fellows
  • Thank you! Is there anything I can do as a workaround? I'm working on a school project that is written in Python and must support emojis. I would hate to have to rip out the whole UI code and rewrite it with a different library only for emojis. – user8636730 Apr 02 '20 at 15:07
  • Awkward. The time schedules for this (the maintainers of Tcl and Tk all have other jobs too) are not conducive to tight delivery schedules, and who knows what bugs lie in the Python/Tkinter part? – Donal Fellows Apr 03 '20 at 16:10