ncurses within Python: how to catch and print non-ASCII characters?

Question

I want to write a small program with ncurses in Python and be able to type in French and Japanese. I understand that I should set the locale and use Unicode.

But how do I deal with the result of screen.getch()? I would like to display the typed character in the ncurses window regardless of the language.

I understand that some Unicode conversion is necessary, but I can't find what to do (and I've searched quite a bit: this character conversion business isn't easy to understand for amateurs).

Additional question: it seems that for non-ASCII characters we must use addstr() instead of addch(). Similarly, should I use getstr() instead of getch()?

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import curses
from curses import wrapper
import locale

locale.setlocale(locale.LC_ALL, '')

def main(scr):
    # The following lines are a "proof of concept":
    # they print Latin and Japanese characters correctly.
    scr.addstr(0, 0, u'\u3042'.encode('utf-8')) # print あ
    scr.addstr(1, 0, 'é'.encode('utf-8'))       # print é

    # But here I would like to type in a character and have it displayed onscreen
    while (True):
        car = scr.getch()
        if car == 27: # = Escape key
            break
        else:
            # What should I put between these parentheses to
            # print the typed character on row 3 of the screen?
            scr.addstr(3, 0, ???? )

wrapper(main)

It looks like you are using Linux or another Unix-like system (not Windows). Can you confirm that, and also that you do not need Windows compatibility? — Serge Ballesta, May 30 '19 at 07:35
Indeed, I am using Linux (a Debian-based distribution called BunsenLabs) and I do not need Windows compatibility. — Tapewormer, May 31 '19 at 08:44

score 3 · Answer 1 · edited Jun 20 '20 at 09:12

unctrl is the function to use on results from getch:

curses.unctrl(ch)

Return a string which is a printable representation of the character ch. Control characters are displayed as a caret followed by the character, for example as ^C. Printing characters are left as they are.

If you want to read UTF-8 directly, use get_wch (which was not available in the Python 2 wrapper):

window.get_wch([y, x])

Get a wide character. Return a character for most keys, or an integer for function keys, keypad keys, and other special keys. In no-delay mode, raise an exception if there is no input.

New in version 3.3.

Even with that, you must still ensure that the locale is initialized. The Python documentation assumes that you have access to the ncurses documentation:

Initialization, in the ncurses manual page
get_wch, wget_wch, mvget_wch, mvwget_wch, unget_wch - get (or push back) a wide character from curses terminal keyboard
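
As a rough illustration of the get_wch approach applied to the question's loop, here is a minimal sketch. It assumes Python 3.3 or later, a UTF-8 locale, and a terminal that can display the characters. The names echo_keys and ESCAPE are made up for this example, and row 3 is taken from the question's code.

```python
import curses
import locale
import sys

ESCAPE = "\x1b"  # get_wch() returns the Escape key as a one-character str


def echo_keys(win, row=3):
    """Echo each typed character on `row` until Escape is pressed.

    `win` is expected to be a curses window; only get_wch(), move(),
    clrtoeol(), addstr() and refresh() are used.
    """
    while True:
        key = win.get_wch()
        if key == ESCAPE:
            break
        if isinstance(key, int):
            # Function keys, arrows and similar arrive as integers, not text.
            continue
        win.move(row, 0)
        win.clrtoeol()           # erase whatever the previous key left behind
        win.addstr(row, 0, key)  # key is already a str, so no .encode() call
        win.refresh()


def main(scr):
    echo_keys(scr)


# curses needs a real terminal, so do nothing when run without one.
if __name__ == "__main__" and sys.stdin.isatty():
    locale.setlocale(locale.LC_ALL, "")  # must happen before curses starts
    curses.wrapper(main)
```

Because get_wch hands back a whole character as a str, the same code path should serve "é" and "あ" alike, with no manual byte handling.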

Sorry to say that "é", which should count as a printing character, is not left as it is. When printed it becomes M-). Is there an extra conversion step for this? — Tapewormer, May 31 '19 at 08:51
Sure: you have to tell Python what the locale is; otherwise, that character isn't printable in the POSIX (default) locale. — Thomas Dickey, May 31 '19 at 09:02

score 0 · Answer 2 · answered May 31 '19 at 07:51


getch/getkey are broken in Python. They are supposed to return a character when encoding is set up as documented, but instead they return octets from a UTF-8 sequence one by one each time the function is called. You need to work around the defect by collecting the octets in a loop until you have a complete sequence. A sequence is complete when it can be successfully decoded, otherwise incomplete.

Compare with the following program which works just fine (run with perl -C so-56373360.pl):

use Term::ReadKey qw(ReadKey ReadMode);
ReadMode 'raw';
while () {
    my $c = ReadKey 0;
    last if $c eq "\e"; # Escape
    print $c;
}
ReadMode 'restore';
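
The workaround described above, collecting octets until they decode, could be sketched in Python roughly as follows. This is only an illustration: decode_octets is a hypothetical helper, the input is assumed to be UTF-8, and values above 255 (which getch uses for function keys) are assumed to have been filtered out by the caller. It relies on the standard library's incremental decoder instead of a hand-written try/except loop.

```python
import codecs


def decode_octets(octets, encoding="utf-8"):
    """Yield one str per complete character from a stream of byte values.

    `octets` is any iterable of ints in range(256), such as successive
    getch() results. Values above 255 must be filtered out beforehand.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    for octet in octets:
        # decode() returns "" while a multi-byte sequence is still incomplete.
        text = decoder.decode(bytes([octet]))
        if text:
            yield text


typed = list("bépo".encode("utf-8"))           # 5 octets for 4 characters
print(len(typed), list(decode_octets(typed)))  # 5 ['b', 'é', 'p', 'o']
```

In a curses loop, each getch() result would be fed through the same decoder object, and only non-empty results passed on to addstr().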

answered May 31 '19 at 07:51

daxim


Thanks to your comment I started to understand what to look for! – Tapewormer May 31 '19 at 10:00
Indeed, with a very small script I could verify that typing "qwer" calls getch() 4 times while typing "bépo" calls it 5 times (and issues two ASCII-like codes for the "é"). I was trying to figure out how to loop through those to collect the octets and build a valid UTF-8 sequence when I realised that, in my particular case, using addch() to print instead of addstr() solved my issue. – Tapewormer May 31 '19 at 10:09
