4

Is there a standard macro to convert a char string literal that is substituted with a preprocessor macro to a wchar_t string using Visual Studio 2005?

e.g. this works:

wchar_t* test = L"Hello World";

but this doesn't:

#define HELLOWORLD "Hello World"
wchar_t* test = L(HELLOWORLD);

I am sharing a header file containing many internationalised strings with several different projects on different platforms, so I don't want to alter the header itself, nor add _T(), since this is platform dependent. I want the Unicode conversion to be in my source, not in the header. I know I can write my own substitution macro as shown here, but am wondering if there is a standard method in VS?

This would seem a very common scenario for anyone writing internationalised code, but I can't seem to find a predefined macro delivered with VS2005.

Piers

2 Answers

3

I had a similar problem and I came up with this solution:

#define HELLOWORLD "Hello World"
const wchar_t* test = L"" HELLOWORLD;

I tested this with GCC 7, where I also had to make sure that the wide-to-char conversion functions worked.

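I'm pretty sure this will also work with other compilers, even older ones, since adjacent string literal concatenation is what lets you write multi-line string literals. Note, though, that mixing a wide and a narrow literal is only well-defined from C99 and C++11 onward (it is undefined behaviour in C90 and C++03), so an older compiler may reject it. It may also complain about the use of an empty string literal.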
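A minimal, self-checking sketch of this approach, assuming a compiler that follows the C99 (or C++11) concatenation rules. The names hello_wide and check_hello_wide are made up for illustration, and HELLOWORLD stands in for a string from the shared header:

```c
#include <assert.h>
#include <wchar.h>

/* Stand-in for a narrow string from the shared header, left unmodified. */
#define HELLOWORLD "Hello World"

/* An empty wide literal followed by the narrow one. Under C99 and C++11
   rules the adjacent literals concatenate into one wide string literal. */
static const wchar_t *const hello_wide = L"" HELLOWORLD;

/* Sanity check (hypothetical helper): the widened text matches a wide
   literal written out directly, and has the expected length. */
static void check_hello_wide(void)
{
    assert(wcscmp(hello_wide, L"Hello World") == 0);
    assert(wcslen(hello_wide) == 11);
}
```

Because no ## operator is involved, this form does not depend on the preprocessor supporting token pasting.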

0

No, there is no "standard" macro for this. The standard is to prefix wide strings with an L and omit the prefix from narrow strings.

The only reason why such a macro even needs to exist is when you're variously targeting platforms that don't support Unicode. In that case, everything needs to be a narrow string. If you're not dealing with platforms that lack Unicode support, then everything should probably be a wide string all the time.

The _T and TEXT macros are provided in the Windows headers for precisely this purpose: maintaining a single code base that can be compiled both for Windows NT, which supports Unicode, and Windows 9x, which lacks Unicode support.

I can't imagine why you would need such a macro if you aren't already including the Windows headers, but if you do, it's pretty simple to write one yourself. Except then you're going to need to know when string literals should be wide strings and when they should be narrow strings. There's no "standard" #define for this, either. The Windows headers use the UNICODE pre-processor symbol, but you can't rely on this being defined on other platforms. So now you're back to where you started from.
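For reference, a hand-written widening macro is usually a two-level affair, roughly as sketched here. WIDEN and WIDEN_PASTE are hypothetical names, not something shipped with any compiler, and the sketch assumes a preprocessor that supports the ## operator:

```c
#include <assert.h>
#include <wchar.h>

/* Stand-in for a narrow string from the shared header. */
#define HELLOWORLD "Hello World"

/* WIDEN_PASTE glues the L prefix onto its argument. WIDEN adds a level of
   indirection so that a macro argument such as HELLOWORLD is expanded to
   its string literal before the paste happens. */
#define WIDEN_PASTE(s) L ## s
#define WIDEN(s) WIDEN_PASTE(s)

static const wchar_t *const greeting = WIDEN(HELLOWORLD);

/* Sanity check (hypothetical helper): same text as a direct wide literal. */
static void check_greeting(void)
{
    assert(wcscmp(greeting, L"Hello World") == 0);
}
```

The indirection matters: calling WIDEN_PASTE(HELLOWORLD) directly would paste L onto the macro name and produce the identifier LHELLOWORLD. Also note that the prefix is only pasted onto the first token, so a macro that expands to several adjacent literals would still end up mixing wide and narrow pieces.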

Why do you think you need a macro for this, again? If you're hardcoding the type of the string as wchar_t*, then you're always going to want to use a wide character literal, so you always want to use the L prefix.

When you're using the _T and/or TEXT macros from the Windows headers, you're also not hard-coding the type of the string as wchar_t*. Instead, you're using the TCHAR macro, which automatically resolves to the appropriate character type depending on the definition of the UNICODE symbol.
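The mechanism can be imitated in portable C along these lines. This is only a sketch of the idea, not the actual definitions from the Windows headers: my_tchar, MY_TEXT and MY_TEXT_PASTE are invented names, and keying on the UNICODE symbol simply follows the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical TCHAR-style switch: one symbol selects both the character
   type and whether literals receive the L prefix. */
#ifdef UNICODE
typedef wchar_t my_tchar;
#define MY_TEXT_PASTE(s) L ## s
#else
typedef char my_tchar;
#define MY_TEXT_PASTE(s) s
#endif
#define MY_TEXT(s) MY_TEXT_PASTE(s)

/* The declaration never names char or wchar_t directly. */
static const my_tchar *const title = MY_TEXT("Hello World");

/* Length in characters, using whichever library function fits the build. */
static size_t title_length(void)
{
#ifdef UNICODE
    return wcslen(title);
#else
    return strlen(title);
#endif
}

/* Sanity check (hypothetical helper) valid for both configurations. */
static void check_title(void)
{
    assert(title_length() == 11);
    assert(sizeof title[0] == sizeof(my_tchar));
}
```

Compiling the same source with and without -DUNICODE (or the equivalent compiler option) switches every such literal and declaration together, which is the point of the scheme and also why it does not fit code that hard-codes wchar_t*.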

Cody Gray - on strike
  • Just to clarify, the "other platform" in this case is compiled using a very old C compiler that has limited support for preprocessor token concatenation, so I want to leave the header alone as much as possible. It does not include any Windows headers, so the other platform is effectively not Windows at all. The source itself can be platform-specific, which is why I want the conversion to Unicode in the source and not the shared header. – Piers Aug 09 '12 at 02:34
  • The way you describe your other platform, wchar_t probably isn't Unicode. It's probably a national character set (xSCII for some x that isn't A). If you need wchar_t, use the ## operator, which ought to be available even with limited support for preprocessor token concatenation. If you need Unicode, you'll need to copy or develop some libraries. – Windows programmer Aug 09 '12 at 02:47
  • Not sure I understand... Adding an `L` to wide string literals isn't going to mess up the preprocessor. The *compiler* knows what to do with these prefixes. It only involves the preprocessor if you use a macro, which is what I'm advising you *not* to do. (Although as the other commenter points out, even the most limited preprocessor should be able to handle token pasting. I just don't see how that's going to help you if you know that these string literals are going to be stored in a wide character type. Narrow string literals won't *ever* work. You don't need things to be conditional.) – Cody Gray - on strike Aug 09 '12 at 03:17
  • In the original question, Piers asked for both wchar_t and Unicode. In a comment on your answer here, Piers says the compiler is ancient, so it probably predates Unicode. Prepending L will produce an array of wchar_t but it probably won't be Unicode. The answer depends on information that hasn't been stated. – Windows programmer Aug 09 '12 at 03:36
  • @Windows The Unicode part is actually irrelevant, it's just confusing things. The `L` doesn't mean Unicode, it means "wide character" to match the "wide character" type, `wchar_t`. The fact that that type is generally used to store Unicode characters isn't relevant. You can't assign a narrow string to a wide character type. The following code won't compile: `const wchar_t* test = "Test";` – Cody Gray - on strike Aug 09 '12 at 03:49
  • I just discovered that my older platform C compiler doesn't support ## for tokenisation, so there is no solution for this problem even if there was a standard Windows macro for substituting L conditionally from the source. I'm switching everything over to UTF-8 to keep the File I/O code largely the same, but converting all of the predefined strings as wide chars. To answer the original question "Why do I think I need this macro" again, it was because I didn't want to change the original header file. The header is shared amongst different platforms, not the source itself. – Piers Aug 15 '12 at 08:25