1

While converting some existing code to support Unicode characters, this problem popped up. If I pass a Unicode character (in this case the euro symbol) into any of the *wprintf functions, the call fails, but seemingly only in Xcode. The same code works fine in Visual Studio, and a friend tested it successfully with GCC on Linux. Here is the offending code:

wchar_t _teststring[10] = L"";
int _iRetVal = swprintf(_teststring, 10, L"A¥€");

wprintf(L"return: %d\n", _iRetVal);

// print values stored in string to check if anything got corrupted
for (int i=0; i<wcslen(_teststring); ++i) {
    wprintf(L"%d: (%d)\n", i, _teststring[i]);
}

In Xcode the call to swprintf returns -1, while in Visual Studio it succeeds and prints the correct values for each of the 3 characters (65, 165, 8364).

I have googled long and hard for solutions. One suggestion that has come up a number of times is a call such as:

setlocale(LC_CTYPE, "UTF-8");

I have tried various combinations of arguments to this function with no success. On further investigation, it appears to return NULL if I try to set the locale to any value other than the default "C".

I'm at a loss as to what else I can try to solve this problem, and the fact that it works with other compilers and platforms just makes it all the more frustrating. Any help would be much appreciated!

EDIT: Just thought I would add that when the swprintf call fails, it sets errno to an error code (92), which is defined as:

#define EILSEQ      92      /* Illegal byte sequence */
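
To narrow down a failure like this, it can help to log the error code together with the active LC_CTYPE locale at the point where the call fails. The following is only a minimal sketch: `format_test_string` is a hypothetical helper name, and the literal is written with universal character names so that the result does not depend on the encoding of the source file.

```c
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not from the original post): formats the test
 * string into out and, if swprintf fails, reports errno and the
 * current LC_CTYPE locale on stderr. Returns swprintf's return value. */
int format_test_string(wchar_t *out, size_t cap) {
    int n;

    errno = 0;
    /* U+00A5 is the yen sign, U+20AC is the euro sign. */
    n = swprintf(out, cap, L"A\u00A5\u20AC");
    if (n < 0) {
        /* setlocale with a NULL name only queries the current locale. */
        fprintf(stderr, "swprintf failed: errno=%d (%s), LC_CTYPE=%s\n",
                errno, strerror(errno), setlocale(LC_CTYPE, NULL));
    }
    return n;
}
```

If the reported locale is "C" and errno is EILSEQ, the failure most likely comes from the active locale rather than from the string itself.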
Argh
  • Keep your source code as 7-bit ASCII and see if that helps. Write `"\u03b2"` in the strings, for instance. – bobbogo Feb 03 '11 at 12:45
  • No joy, still fails the same way. I don't think there's any problem creating a string with the Unicode characters: if I just initialise the string with "A¥€" and read the values, they are all correct. It only breaks when passing through the formatting print functions. – Argh Feb 03 '11 at 13:56

4 Answers

1

Microsoft planned to become compatible with other compilers starting with VS 2015, but in the end it never happened because of problems with legacy code (see link).

Fortunately, you can still enable ISO C (C99) conforming behavior in VS 2015 by defining the _CRT_STDIO_ISO_WIDE_SPECIFIERS preprocessor macro. This is recommended when writing portable code.
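
As a rough illustration of what the macro affects: in ISO C, `%s` inside a wide format string takes a narrow `char *` argument and `%ls` takes a `wchar_t *`. The sketch below assumes the macro has to be visible before the first CRT header is included (it can also be passed on the compiler command line); `format_mixed` is a hypothetical name used only for this example.

```c
/* Assumption: the macro must be defined before any CRT header is
 * included. Compilers other than MSVC simply ignore it. */
#define _CRT_STDIO_ISO_WIDE_SPECIFIERS 1

#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: mixes a narrow and a wide argument in one wide
 * format string using the ISO C meaning of the specifiers. */
int format_mixed(wchar_t *out, size_t cap) {
    /* ISO C: %s expects char *, %ls expects wchar_t *. */
    return swprintf(out, cap, L"%s and %ls", "narrow", L"wide");
}
```

Without the macro, the Microsoft runtime treats `%s` in the wide functions as a wide string, so the same call would misread the narrow argument.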

Billal Begueradj
1

It should work if you fetch the locale from the environment:

#include <stdio.h>
#include <wchar.h>
#include <locale.h>

int main(void) {
    setlocale(LC_ALL, "");
    wchar_t _teststring[10] = L"";
    int _iRetVal = swprintf(_teststring, 10, L"A¥€");

    wprintf(L"return: %d\n", _iRetVal);

    // print values stored in string to check if anything got corrupted
    for (int i=0; i<wcslen(_teststring); ++i) {
        wprintf(L"%d: (%d)\n", i, _teststring[i]);
    }
}

On my OS X 10.6, this works as expected with GCC 4.2.1, but when compiled with Clang 1.6, it places the UTF-8 bytes in the result string.

I could also compile this with Xcode (using the standard C++ console application template), but because graphical applications on OS X don't have the required locale environment variables, it doesn't work in Xcode's console. On the other hand, it always works in the Terminal application.

You could also set the locale to en_US.UTF-8 (setlocale(LC_ALL, "en_US.UTF-8")), but that is non-portable. Depending on your goal, there may be better alternatives to swprintf.
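
The two approaches can be combined: try the environment first, then fall back to explicitly named locales. This is a minimal sketch under stated assumptions: `select_multibyte_locale` is a hypothetical helper, the candidate names are guesses that may not exist on a given system, and `MB_CUR_MAX > 1` is used as a rough check that the accepted locale is a multibyte one (it does not prove the encoding is UTF-8).

```c
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: tries the environment's locale first, then a few
 * explicitly named candidates. Returns the name of the locale that was
 * accepted, or NULL if none of them gave a multibyte encoding. */
const char *select_multibyte_locale(void) {
    /* "" means "take the locale from the environment". The other names
     * are assumptions and are not available on every system. */
    static const char *const candidates[] = { "", "en_US.UTF-8", "C.UTF-8" };
    size_t i;

    for (i = 0; i < sizeof candidates / sizeof candidates[0]; ++i) {
        /* setlocale returns NULL when the request cannot be honored. */
        if (setlocale(LC_CTYPE, candidates[i]) != NULL && MB_CUR_MAX > 1) {
            return setlocale(LC_CTYPE, NULL);
        }
    }
    setlocale(LC_CTYPE, "C"); /* restore the default */
    return NULL;
}
```

The sketch only changes LC_CTYPE rather than LC_ALL, so number formatting and other categories stay as they were. A NULL result means the wide formatting functions may still fail on non-ASCII text.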

Philipp
  • I'm fairly certain that's one of the combinations I have tried. I can't test in Xcode until tomorrow, but I will give it a go in the morning. – Argh Feb 03 '11 at 18:54
  • So I gave the code a go: I compiled with GCC directly through the terminal and, as you said, it works as expected. However, it continues to fail as before in Xcode. I suppose the question now is how exactly Xcode is managing to mangle it. I just finished testing a few different compiler options, with no luck so far. – Argh Feb 04 '11 at 09:05
  • @Argh: the problem is that Xcode, as a graphical application, doesn't have the needed locale environment variables. It should work if you run the program compiled by Xcode from the Terminal. – Philipp Feb 04 '11 at 13:35
  • Sorry, I probably should have mentioned this at the start, but the application is actually an iPhone game. Having the characters print correctly to a console is entirely optional. The main use would be creating and manipulating strings, which would then be read char-by-char, using the values to draw the correct glyphs on the screen. vswprintf is the function I really need to get working, but all the *wprintf functions seem to fail in the same manner, so I used swprintf as a simple example. – Argh Feb 04 '11 at 14:38
  • @Philipp: I see what you mean about the console vs. the Terminal. I tried creating a console application in Xcode, and it works when run separately but fails when run from Xcode. I'm not sure how this would apply to the iOS app, though. Perhaps the device is also limited in which locales can be set. Am I then doomed to convert everything to Objective-C strings? – Argh Feb 04 '11 at 15:06
  • @Argh: You mentioned iOS for the first time. I believe that using Objective-C and Cocoa should be easier because they are the native application development interfaces, but I have no experience regarding iOS at all. – Philipp Feb 04 '11 at 18:14
  • @Philipp: Yeah, sorry for not making that clear to begin with. At first I thought this was a problem with Xcode (perhaps even OS X) in general, but with your help I have now discovered that in most other cases it works as it should. It now seems that the problem is related either to the specific build settings for deploying to iOS or to limitations of the software on the device itself. As for resorting to Obj-C, that would be a very worst-case scenario, as it would require converting a lot of existing code, and we generally try to avoid Obj-C when possible and stick with C/C++. – Argh Feb 04 '11 at 19:09
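
Since the strings are ultimately read value by value to pick glyphs, one locale-independent option is to keep the text as UTF-8 in narrow strings and decode it to code points by hand. The sketch below is not something proposed in the thread: `utf8_to_wide` is a hypothetical helper, it assumes `wchar_t` is 32 bits wide, and it does not reject overlong encodings or surrogate values.

```c
#include <stddef.h>

/* Hypothetical helper: decodes the UTF-8 string src into code points
 * stored in dst (capacity cap, including the terminator).
 * Assumes wchar_t can hold any code point (32 bits).
 * Returns the number of code points written, or (size_t)-1 if the
 * input is malformed or dst is too small. */
size_t utf8_to_wide(const char *src, wchar_t *dst, size_t cap) {
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    while (*p != 0) {
        unsigned long cp;
        int extra; /* number of continuation bytes that must follow */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return (size_t)-1; /* stray continuation or invalid lead byte */
        ++p;

        while (extra-- > 0) {
            /* A terminating NUL also fails this check, so truncated
             * sequences are reported as errors. */
            if ((*p & 0xC0) != 0x80) return (size_t)-1;
            cp = (cp << 6) | (unsigned long)(*p & 0x3F);
            ++p;
        }

        if (n + 1 >= cap) return (size_t)-1; /* keep room for the terminator */
        dst[n++] = (wchar_t)cp;
    }

    if (cap == 0) return (size_t)-1;
    dst[n] = L'\0';
    return n;
}
```

With this approach the formatting itself could be done with snprintf or vsnprintf on UTF-8 bytes, which do not depend on the locale for plain `%s` and `%d` conversions, and only the final result would be decoded for glyph lookup.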
1

If you are using Xcode 4+, make sure you have set an appropriate encoding for the files that contain your strings. You can find the encoding settings in the right pane, under the "Text Settings" group.

0

I found that using "%S" (upper case) in the formatting string works.

"%s" is for 8-bit characters, and "%S" is for 16-bit or 32-bit (wide) characters. Note that "%S" is an extension; the ISO C spelling for a wchar_t string is "%ls".

See: https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Strings/Articles/formatSpecifiers.html

I'm using Qt Creator 4.11, which uses Clang 10.

Pierre