
I am trying to get full UTF-8 support under ncurses. I built ncurses 5.9 with wide-character support. Given the UTF-8 byte string D0 9D D0 BE D0 B2 D1 8B D0 B9, printf in the normal console outputs Новый. If I start ncurses and use waddstr, only some of the characters come through: ?~]ов?~Kй. Why is it not working, and what is the significance of the embedded escape sequences ~] and ~K?
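For reference, the byte string above can be decoded by hand, without ncurses or the C library's locale machinery. The sketch below is only an illustration: `utf8_decode`, `count_c1_bytes`, `tilde_notation_char`, and `sample_bytes` are hypothetical names, not part of ncurses. It also assumes that `~]` and `~K` are the printable forms of the raw bytes 0x9D and 0x8B, following the `~` plus (byte - 64) convention that ncurses' `unctrl` appears to use for bytes 0x80 to 0x9F, which would make them single bytes rendered visibly rather than escape sequences.

```c
#include <stddef.h>

/* The bytes from the question: "Новый" encoded as UTF-8. */
const unsigned char sample_bytes[] = {
    0xD0, 0x9D, 0xD0, 0xBE, 0xD0, 0xB2, 0xD1, 0x8B, 0xD0, 0xB9
};

/* Decode UTF-8 into code points with minimal validation (lead byte and
 * continuation bytes only). Returns the number of code points written,
 * or -1 on malformed input or if `out` is too small. */
int utf8_decode(const unsigned char *s, size_t n,
                unsigned long *out, size_t cap)
{
    size_t i = 0, count = 0;

    while (i < n) {
        unsigned char b = s[i++];
        unsigned long cp;
        int extra;

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return -1;              /* stray continuation or invalid lead */

        while (extra-- > 0) {
            if (i >= n || (s[i] & 0xC0) != 0x80)
                return -1;           /* truncated or bad continuation byte */
            cp = (cp << 6) | (unsigned long)(s[i++] & 0x3F);
        }
        if (count >= cap)
            return -1;
        out[count++] = cp;
    }
    return (int)count;
}

/* Count bytes in 0x80..0x9F. Treated as single 8-bit characters, these
 * are C1 control codes rather than printable characters. */
int count_c1_bytes(const unsigned char *s, size_t n)
{
    size_t i;
    int count = 0;

    for (i = 0; i < n; i++)
        if (s[i] >= 0x80 && s[i] <= 0x9F)
            count++;
    return count;
}

/* ASSUMPTION: the character shown after '~' for a byte in 0x80..0x9F
 * is (byte - 64). */
char tilde_notation_char(unsigned char b)
{
    return (char)(b - 64);
}
```

Under that assumption, the only two bytes in the string that fall in 0x80 to 0x9F are the continuation bytes of Н (D0 9D) and ы (D1 8B), which are exactly the two characters that fail to display.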

I am including the wide version of the header and linking with the wide libraries I built.

This is built with Open Watcom (statically linked). It turns out OW always uses the "C" locale, so I rebuilt the libraries without HAVE_LOCALE_H, which lets ncurses pull the information from environment variables (in this case LANG='en_US.UTF-8'). That gets rid of the ~] and ~K, but I still get ? ов? й (and now \n for newlines doesn't work).

So this has morphed into two additional questions:

1) What does ncurses require of the C library? (Apparently ncurses handles some multi-byte characters just fine, but there are still problems with those two.)

2) Why did the newline character stop working?

TIA!!

user3161924
  • The answer is a bug in ncurses. If `HAVE_WCTOB` is not defined but `HAVE_WCTOMB` is (and it had better be; there is no `#error`, so the result is undefined otherwise), then when `_nc_is_charable` is called in `charable.c` it calls `_nc_to_char`, which uses the `wctomb` function and assumes the returned multibyte string is zero-terminated. It may not be (and wasn't), so the `strlen` check is no good. – user3161924 Jun 09 '18 at 06:38
  • when is this thing going to allow me to add the final answer? – user3161924 Jun 09 '18 at 17:16
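The point about `wctomb` in the comment above can be sketched as follows. This is not the ncurses source; `wide_to_single_byte` is a hypothetical helper that shows the safer pattern. `wctomb` returns the number of bytes it stored and does not append a terminator, so the return value, not `strlen` on the buffer, is what tells you whether the wide character fits in one byte.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: return the byte value if `wc` converts to exactly
 * one byte in the current locale, otherwise -1. */
int wide_to_single_byte(wchar_t wc)
{
    char buf[MB_LEN_MAX];   /* large enough for any locale's encoding */
    int len;

    wctomb(NULL, L'\0');    /* reset any shift state before converting */
    len = wctomb(buf, wc);  /* number of bytes stored, or -1 on failure */

    /* Use the returned length directly. The buffer is not terminated,
     * so calling strlen(buf) here would read uninitialized bytes. */
    if (len != 1)
        return -1;
    return (unsigned char)buf[0];
}
```

A caller that needs a terminated string can write `buf[len] = '\0'` itself, provided the buffer has room for one more byte.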

1 Answer


Perhaps you forgot to set the locale:

setlocale(LC_ALL, "");

as noted in the Initialization section of the manual page.
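One way to check whether the locale call actually took effect, independent of ncurses, is to look at `MB_CUR_MAX` afterward. The sketch below is an illustration only: `ctype_max_bytes` is a hypothetical helper, and it sets just `LC_CTYPE`, the category that governs multibyte conversion. Passing `""` asks the C library to read the locale from the environment.

```c
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: select `name` for LC_CTYPE only and report the
 * maximum number of bytes per character in that locale.
 * Returns -1 if the C library does not accept the locale name. */
int ctype_max_bytes(const char *name)
{
    if (setlocale(LC_CTYPE, name) == NULL)
        return -1;
    return (int)MB_CUR_MAX;
}
```

If `ctype_max_bytes("")` returns 1 while the environment asks for a UTF-8 locale, the C library is still treating every character as a single byte, and multibyte text is unlikely to display correctly.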

Thomas Dickey
  • That's already done as well, but with `setlocale(LC_CTYPE, "");` because I don't want the other locale items changed unless the user chooses to. – user3161924 Jun 08 '18 at 21:42
  • If the other locale variables are compatible (same UTF-8 encoding), that generally works. But if for example `LC_ALL` uses ISO-8859-1, the C library gives less useful results. – Thomas Dickey Jun 08 '18 at 21:59
  • Please see the original message, which I edited as recommended since it's not a duplicate. – user3161924 Jun 08 '18 at 22:29
  • I noticed that (and also that the approved answer for the suggested duplicate is incorrect). That's why I added a clarification in a comment. – Thomas Dickey Jun 08 '18 at 23:36
  • Well, it was the compiler: it provides the mb functions but they don't do anything. So I had to add my own to that ncurses build and remap the calls to those versions (prefixed with _nc_), but I can't answer my own question on that side. I do have a new question, though: now that it all works, the newline character `\n` no longer works. Any idea on that (and how to post the answer)? – user3161924 Jun 09 '18 at 02:29
  • So for kicks I put back `HAVE_LOCALE_H` (trying to override `LANG`, `LC_TYPE`, and `LC_ALL` didn't work, but I did notice a little bug in the logic of `lib_setup`: it should check `LC_CTYPE` before `LC_ALL`), and while the characters are wrong as expected, `\n` and `\r` work fine (as I would expect them to even with UTF-8)? – user3161924 Jun 09 '18 at 03:39
  • A compilable/testable program would help - cut/paste of strings from this website isn't very helpful. – Thomas Dickey Jun 09 '18 at 11:04
  • The answer is in the comments of the question, the real answer can't be given as an actual answer. – user3161924 Jun 23 '18 at 20:31
  • You'd get better results here - https://lists.gnu.org/mailman/listinfo/bug-ncurses – Thomas Dickey Jun 23 '18 at 21:01
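The comments mention replacing the compiler's non-functional mb functions with hand-written ones, but the replacements are not shown. As a rough idea of what such a stand-in might look like for the wide-to-multibyte direction, here is a minimal UTF-8 encoder. `sketch_wctomb_utf8` is a hypothetical name and signature, not the function actually added to that ncurses build, and it assumes the wide character holds a Unicode code point.

```c
/* Hypothetical stand-in for a wctomb that always produces UTF-8.
 * `out` must have room for 4 bytes. Returns the number of bytes
 * written, or -1 if `cp` is not a valid Unicode scalar value.
 * No terminator is written; callers must use the returned length. */
int sketch_wctomb_utf8(unsigned char *out, unsigned long cp)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return -1;      /* surrogates are not encodable in UTF-8 */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return -1;              /* beyond the Unicode range */
}
```

Encoding U+041D (Н) with this sketch yields the two bytes D0 9D, matching the first character of the string in the question.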