I found a bug in glibc that I would like to report. The issue is that printf() counts the wrong width for the grouping character in the nb_NO.utf8 locale and therefore does not add enough padding to the left of the string. I originally spotted this in the shell utility printf, but it seems to originate from the printf function in libc, which I have verified using a little test program.
I haven't written C since university, so I am a bit rusty when it comes to creating a test case. My only issue so far is that when I use this grouping character as part of a string (a wchar_t array), the string is not terminated when printed, and I am not sure what I am doing wrong.
This is the output of my little test driver:
    $ gcc printf-test.c && ./a.out
    Using locale nb_NO.utf8
    <1 234> (length 7 according to strlen)
    <1 234> (length -1 according to wcswidth)
    Using locale en_US.utf8
    <  1,234> (length 7 according to strlen)
    <  1,234> (length 7 according to wcswidth)
    Width of character e280af: -1
    Width of s0 4: (ABCD)
    Width of s1 4: (ABCD)
    Width of s2 -1: (
As is obvious, something fishy is going on when the final string is printed, and it is somehow related to how I try to print a string containing the multi-byte grouping character used in the nb_NO locale.
The full source:
    #define _XOPEN_SOURCE /* See feature_test_macros(7) */
    #include <wchar.h>
    #include <stdio.h>
    #include <locale.h>
    #include <string.h>

    void print_num(char *locale){
        printf("Using locale %s", locale);
        setlocale(LC_NUMERIC, locale);

        char buf[40];
        sprintf(buf,"%'7d", 1234);
        printf("\n<%s> (length %d according to strlen)\n", buf, (int) strlen(buf));

        wchar_t wbuf[40];
        swprintf(wbuf, 40, L"%'7d", 1234);
        int wide_width = wcswidth (wbuf, 40);
        printf("<%s> (length %d according to wcswidth)\n", buf, wide_width);
        puts("");
    }

    int main(){
        print_num("nb_NO.utf8");
        print_num("en_US.utf8");

        // just trying to understand
        wchar_t wc = (wchar_t) 0xe280af; // is this a correct way of specifying the char e2 80 af?
        int width = wcwidth (wc);
        printf("Width of character %x: %d\n", (int) wc, width);

        wchar_t s0[] = L"ABCD";
        wchar_t s1[] = {'A','B','C', 'D', '\0'};
        wchar_t s2[] = {'A',wc,'B', '\0'}; // something fishy
        int widthOfS0 = wcswidth (s0, 4);
        int widthOfS1 = wcswidth (s1, 4);
        int widthOfS2 = wcswidth (s2, 4);
        printf("\nWidth of s0 %d: (%ls)", widthOfS0, s0);
        printf("\nWidth of s1 %d: (%ls)", widthOfS1, s1);
        printf("\nWidth of s2 %d: (%ls)", widthOfS2, s2); // this does not terminate the string
        return 0;
    }