Use a UTF-8 character for a curses border

Question

I can't use the █ character for a curses border.

Code example:

import curses

stdscr = curses.initscr()
c = '█'
stdscr.border(c, c, c, c, c, c, c, c)
stdscr.getch()

I obtain this error:

OverflowError: byte doesn't fit in chtype

I can, however, use addstr to write a UTF-8 character like this:

stdscr.addstr(0, 0, "█")

Thanks for your help.

score 4 · Accepted Answer · answered Feb 24 '20 at 10:52

The problem is that the Python curses package is just a wrapper over the ncurses C library. In ncurses (https://linux.die.net/man/3/ncurses), a character is represented as a chtype (character and attribute data), where the character is a C char, which is a single byte on common systems.

The underlying border function expects each character of the border to be a single byte, while the 'FULL BLOCK' that you are trying to use is the Unicode character U+2588, which UTF-8 encodes as the byte string b'\xe2\x96\x88'. That is the reason for the error message: you are trying to store a 3-byte sequence in a single-byte variable.
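The size mismatch is easy to check from plain Python, without touching curses at all. This is only a small illustration of the encoding described above:

```python
# Inspect how U+2588 FULL BLOCK is represented.
ch = "█"
encoded = ch.encode("utf-8")

print(hex(ord(ch)))   # 0x2588, the Unicode code point
print(encoded)        # b'\xe2\x96\x88', the UTF-8 byte string
print(len(encoded))   # 3, so it cannot fit in a single byte
```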

It works fine with addstr because that function expects a string and accepts the 3-byte sequence. But it would break with addch, which expects a single character.

Said differently, the curses module will not accept multibyte UTF-8 sequences except where it expects strings.
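Since addstr does accept the character, one possible stopgap is to draw the frame cell by cell instead of calling border(). The sketch below is an illustration only: draw_border and main are hypothetical helpers, not part of the curses module, and the code assumes the window is at least 2 rows by 2 columns. The bottom-right cell is written with insstr because addstr raises an error when the cursor would have to move past the last cell of the window.

```python
import curses


def draw_border(win, ch="█"):
    """Draw a one-cell frame around win using string-based calls only."""
    height, width = win.getmaxyx()

    # Top row.
    for x in range(width):
        win.addstr(0, x, ch)

    # Left and right columns, excluding the corner rows.
    for y in range(1, height - 1):
        win.addstr(y, 0, ch)
        win.addstr(y, width - 1, ch)

    # Bottom row, except the last cell.
    for x in range(width - 1):
        win.addstr(height - 1, x, ch)

    # Bottom-right corner: insstr does not advance the cursor,
    # so it avoids the error addstr raises at this position.
    win.insstr(height - 1, width - 1, ch)


def main(stdscr):
    draw_border(stdscr)
    stdscr.getch()


# In a real terminal, run it with:
# curses.wrapper(main)
```

This bypasses border() entirely, so it does not change what border() accepts; it only relies on the fact that string-based calls take the multibyte character.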

Possible workarounds:

The recommended way to use the underlying ncurses library and the curses Python module is to find a single-byte encoding that matches your requirements. Latin1 (ISO-8859-1) is even the default for ncurses, but other encodings may better meet your needs.
Find (or write) a Python wrapper around ncursesw, a variant of ncurses that uses wide (16-bit) characters. It would gladly accept 0x2588 as a character value, or more generally any character whose code point fits in 16 bits, which is to say the Basic Multilingual Plane of Unicode. Unfortunately, I do not know of one.
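For the first workaround, a rough way to look for a suitable single-byte encoding is to try a few candidate codecs and keep those that encode the character as exactly one byte. The helper name single_byte_codecs and the candidate list are assumptions made for this sketch. Whether the resulting byte value actually displays as a block when passed to border() depends on the character set your terminal is using, which this code does not check.

```python
def single_byte_codecs(ch, candidates=("latin-1", "iso8859-15", "cp437", "cp850")):
    """Return {codec name: byte value} for codecs that encode ch as one byte."""
    found = {}
    for name in candidates:
        try:
            encoded = ch.encode(name)
        except UnicodeEncodeError:
            # This codec has no representation for the character.
            continue
        if len(encoded) == 1:
            found[name] = encoded[0]
    return found


result = single_byte_codecs("█")
for name, value in result.items():
    print(f"{name}: {value:#04x}")
```

Latin1 does not appear in the result because it has no full block character, while a code page such as cp437 does.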

> find a single byte encoding matching your requirements. Could you please explain how to find and use such an encoding? I added this line: `stdscr.encoding = "utf_8"` in my program, and it still gives me the same error. — Anchith Acharya, Sep 25 '20 at 15:59
@AnchithAcharya: This answer says that utf_8 is not suitable. If you use a Western European language, Latin1 (or ISO-8859-1) is fine. But you should read the Wikipedia page on [ISO-8859](https://fr.wikipedia.org/wiki/ISO/CEI_8859) because it covers a large number of languages. — Serge Ballesta, Sep 26 '20 at 08:23
Oh my bad. The page I referred to was [this](https://docs.python.org/2.4/lib/standard-encodings.html), which described utf_8 as supporting all languages. I must have misunderstood it. Anyway, if it helps, I am only trying to print the block characters and box-bounding characters. — Anchith Acharya, Oct 02 '20 at 16:41
