2

Python and Tkinter process Unicode characters correctly, but they are not able to display them correctly.

I am using Python 3.1 and Tkinter on Ubuntu, and I am trying to use Tamil Unicode characters.

All the processing is done correctly, but the display is wrong.

Here is the wrong display, as rendered in Tkinter:

wrong

Here is the correct display (as in gedit):

correct


Still not solved:

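import tkinter as tk

root = tk.Tk()
root.geometry('200x200')
var = tk.StringVar()
# The label shows whatever text is stored in var.
label = tk.Label(root, textvariable=var, relief=tk.RAISED)
# Insert the placeholder text explicitly; Entry has no text= option of its own.
entry = tk.Entry(root)
entry.insert(0, "Placeholder text")
entry.pack()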
var.set("கற்றதனால் ஆய பயனென்கொல் வாலறிவன்\nநற்றாள்தொழாஅர் எனின்.  ")
label.pack()
root.mainloop()

Manjaro: (screenshot)

Windows: (screenshot)

Smart Manoj
Vijay
  • I can't answer your question directly, but I'd advise you to drop tkinter and use something modern like PyQt instead. You'll be grateful when your project grows. – static_rtti Mar 02 '11 at 11:02
  • @static_rtti: why? tkinter is a fine toolkit that scales very nicely. – Bryan Oakley Mar 02 '11 at 11:57
  • Are you certain you're using the same font face in both cases? Naturally, if the font you are using doesn't have the glyph, it will show up incorrectly, and not all fonts have all Unicode characters. – Bryan Oakley Mar 02 '11 at 12:02
  • OP said in answer | @Bryan Oakley I do not think the font is the problem here; its rendering is. For example, when I type the two Unicode characters U+0BAE and U+0BC6, they should combine into a single Tamil character displayed as "மெ". I think Tkinter lacks the rendering engine needed to display some Unicode scripts. – Smart Manoj Jun 11 '21 at 14:51
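
To make the last comment concrete, here is a minimal sketch (not from the original thread) that inspects the two code points with Python's standard unicodedata module. The variable names are only illustrative. It suggests that the string itself stays two code points even after NFC normalization, so joining them into one visible syllable would be the job of whatever renders the text, not of Python's string handling.

```python
import unicodedata

# U+0BAE followed by U+0BC6, the pair mentioned in the comment above.
syllable = "\u0BAE\u0BC6"

for ch in syllable:
    # Show the code point, its Unicode name, and its general category.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  {unicodedata.category(ch)}")

# NFC normalization does not merge this pair into one code point,
# so the stored text has the same length before and after.
normalized = unicodedata.normalize("NFC", syllable)
print(len(syllable), len(normalized))
```

If the names and lengths printed here look right, the data is intact and any remaining problem is on the display side.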

3 Answers

0

As per this comment, here is the same text shown in a PyQt5 label:

from PyQt5.QtWidgets import QApplication,QMainWindow,QLabel
import sys
app=QApplication(sys.argv)
app.setStyle('Fusion')
app.setApplicationName('PyQt5 App')
win=QMainWindow()
label=QLabel()
text='கற்றதனால் ஆய பயனென்கொல் வாலறிவன்\nநற்றாள் தொழாஅர் எனின்.'
label.setText(text)
win.setCentralWidget(label)
win.show()
sys.exit(app.exec_())


Smart Manoj
0

I faced a similar problem. I had been using the Zero Width Joiner (U+200D) to explicitly tell the rendering engine to join two characters. That worked in 2010, but it looks like there have been changes in the rendering engine (that I am now aware of), and in 2011 I find that having the joiner creates the problem; it broke my working code. I had to remove the explicit zero width joiners to get my code working again. Hope this helps.
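
A minimal sketch of that cleanup, assuming the joiners were inserted by your own code and are safe to drop. The helper name strip_zwj is hypothetical, and in some scripts a zero width joiner carries meaning, so this is worth checking against your own text first.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER

def strip_zwj(text):
    """Return text with every zero width joiner removed (hypothetical helper)."""
    return text.replace(ZWJ, "")

# Example: a joiner placed between U+0BAE and U+0BC6 is removed.
with_joiner = "\u0BAE" + ZWJ + "\u0BC6"
cleaned = strip_zwj(with_joiner)
print(len(with_joiner), len(cleaned))
```

The cleaned string can then be passed to the widget as usual, for example through a StringVar.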

tukamhane
-1

It looks like Tk is mishandling things like 'Class Zero Combining Marks', see: http://www.unicode.org/versions/Unicode6.0.0/ch04.pdf#G124820 (Table 4-4)

I assume one of the sequences that does not show correctly is the code point pair U+0BA9 U+0BC6 (TAMIL SYLLABLE NNNE), where U+0BC6 is a reordrant class zero combining mark according to the Unicode standard, which basically means the glyphs get swapped.

The only way to fix it is to file a bug at the Tk bug tracker and hope it gets fixed.
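
For a bug report it may help to list which characters in a string fall into this group. The sketch below is only an illustration using Python's unicodedata module: the function name class_zero_marks is hypothetical, and it treats any character whose general category starts with "M" and whose canonical combining class is 0 as a class zero combining mark. It does not tell reordrant marks apart from other class zero marks.

```python
import unicodedata

def class_zero_marks(text):
    """Return the code points of combining marks with combining class 0."""
    found = []
    for ch in text:
        # General categories Mn, Mc and Me are the combining marks.
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and unicodedata.combining(ch) == 0:
            found.append(f"U+{ord(ch):04X}")
    return found

# The sequence discussed above: U+0BA9 followed by U+0BC6.
sample = "\u0BA9\u0BC6"
print(class_zero_marks(sample))
```

Running it over the text that renders wrongly gives a short list of candidate characters to mention in the report.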

schlenk