Can Code page and locale differ?

Question

On Windows 7 (64-bit) I set Japan as the locale. The command prompt reports the following:

LC_ALL: English_United States.1252
LC_CTYPE: English_United States.1252
chcp command: Active Code Page: 932 (which is Japanese)

My question: when converting wchar_t* to char* using the ICU library, which converter should be used by default? In this case "US-ASCII" is used, and I get garbage in the resulting char*. The input wchar_t* contains Japanese characters.

I'm not sure I understand your question. Are you asking whether ICU is behaving correctly, how to make ICU behave the way you want, or how to properly convert between character sets on Windows? – Edward Thomson, Dec 13 '13 at 13:01
I will rephrase it simply. With the Windows code page set to 932, the ICU API ucnv_getDefaultName() returns "US-ASCII". Is this correct? – user1989504, Dec 14 '13 at 09:53

score 1 · Answer 1 · edited Jun 20 '20 at 09:12

From the ICU documentation:

Depending on system design, setup and APIs, it may not always be possible to find a default codepage that fully works as expected...

If you have means of detecting a default codepage name that are more appropriate for your application, then you should set that name with ucnv_setDefaultName() as the first ICU function call. This makes sure that the internally cached default converter will be instantiated from your preferred name.
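
As a rough illustration of that advice, here is a minimal sketch in C. The two helper functions are hypothetical (they are not part of ICU), and the "windows-NNN" naming is an assumption about how the code page maps to an ICU converter alias, so check that the name resolves in your ICU build. Only ucnv_setDefaultName() is an actual ICU call, and it is compiled in only when WITH_ICU is defined (a macro invented for this sketch; link against the ICU common library in that case).

```c
#include <assert.h>
#include <stdio.h>

#ifdef WITH_ICU
#include <unicode/ucnv.h>   /* ucnv_setDefaultName() */
#endif

/* Hypothetical helper, not part of ICU: formats a Windows code page
 * number as a "windows-NNN" converter name, e.g. 932 -> "windows-932".
 * Returns 0 on success, -1 if the buffer is too small. */
int codepage_to_converter_name(unsigned codepage, char *buf, size_t buflen)
{
    assert(buf != NULL);
    int n = snprintf(buf, buflen, "windows-%u", codepage);
    return (n > 0 && (size_t)n < buflen) ? 0 : -1;
}

/* Hypothetical helper: call this before any other ICU function so the
 * cached default converter is created from the preferred name.
 * Without WITH_ICU it only builds the name and does nothing else. */
int use_codepage_as_icu_default(unsigned codepage)
{
    char name[32];
    if (codepage_to_converter_name(codepage, name, sizeof name) != 0)
        return -1;
#ifdef WITH_ICU
    /* Assumption: the ICU data in use contains a converter for this
     * name. An "sbcs"-only build may not include Japanese converters. */
    ucnv_setDefaultName(name);
#endif
    return 0;
}
```

On Windows the code page passed in would typically come from GetACP(), for example use_codepage_as_icu_default(GetACP()) at program start. Comparing ucnv_getDefaultName() before and after the call is one way to see whether the name was accepted.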

answered Dec 19 '13 at 16:58

NuSkooler

Thank you, I found the issue. I was using the "sbcs" version of the ICU library instead of the "full" version, which is why I was getting the "US-ASCII" converter on a Japanese locale. – user1989504 Dec 23 '13 at 08:13