
I'm developing a tiny Win32 app in C++. I studied C++ fundamentals a long time ago, so now I'm completely confused by character strings in C++. Back then there was no WCHAR or TCHAR, only char and String. After a little investigation I've decided not to use TCHAR.

My issue is very simple, I think, but I can't find a clear guide on how to manipulate strings in C++. After coding in PHP for the last few years, I expected string manipulation to be simple, and I was wrong!

Simply put, all I need is to put new data into a character string.

    WCHAR* cs = L"\0";
    swprintf( cs, "NEW DATA" );

This was my first attempt. While debugging my app I found that swprintf puts only the first 2 chars into my cs variable. I resolved the problem this way:

    WCHAR cs[1000];
    swprintf( cs, "NEW DATA" );

But in general this trick could fail, because in my case the new data is not a constant value but another variable that could potentially be longer than 1000 chars. My code looks like this:

    WCHAR cs[1000];
    WCHAR* nd1;
    WCHAR* nd2;
    wcscpy(nd1, L"Some value");
    wcscpy(nd2, L"Another value"); // Actually these vars stores the path for user selected folder
    swprintf( cs, "The paths are %s and %s", nd1, nd2);

In this case it's possible that the total character count of nd1 and nd2 is greater than 1000 chars, so critical data will be lost.

The question is: how can I copy all the data I need into a WCHAR string declared as WCHAR* wchar_var; without losing anything?

P.S. Since I'm Russian, the question may be unclear. Let me know if it is, and I'll try to explain my issue more clearly and in more detail.

Rost
Geradlus_RU

3 Answers


In modern Windows programming, it's OK to just ignore TCHAR and instead use wchar_t (WCHAR) and Unicode UTF-16.

(TCHAR is a model of the past, when you wanted to have a single code base and produce both ANSI/MBCS and Unicode builds by changing some preprocessor switches like _UNICODE and UNICODE.)

In any case, you should use C++ and convenient string classes to simplify your code. You can use ATL::CString (which corresponds to CStringW in Unicode builds, which are the default since VS2005), or STL's std::wstring.

Using CString, you can do:

    CString str1 = L"Some value";
    CString str2 = L"Another value";
    CString cs;
    cs.Format(L"The paths are %s and %s", str1.GetString(), str2.GetString());

CString also provides proper overloads of operator+ to concatenate strings, so you don't have to calculate the total length of the resulting string, dynamically allocate a destination buffer or check the existing buffer's size, call wcscpy and wcscat, remember to release the buffer, and so on.

And you can simply pass instances of CString to Win32 APIs expecting const wchar_t* (LPCWSTR/PCWSTR) parameters, since CString offers an implicit conversion operator to const wchar_t*.
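
If you go with std::wstring instead, the same two conveniences look roughly like this. It's only a minimal sketch: show_text is a hypothetical stand-in for any Win32 function taking const wchar_t*, and build_message and example_usage are illustrative names, not part of any API. The one visible difference is that std::wstring has no implicit conversion, so you call c_str() explicitly.

```cpp
#include <cassert>
#include <cwchar>
#include <string>

// Hypothetical stand-in for a Win32 API taking LPCWSTR (const wchar_t*).
// It only reports how many characters it received.
std::size_t show_text(const wchar_t* text) {
    return std::wcslen(text);
}

// Concatenate with operator+; std::wstring grows its own buffer as needed.
std::wstring build_message(const std::wstring& path1, const std::wstring& path2) {
    return L"The paths are " + path1 + L" and " + path2;
}

// Example usage (hypothetical helper, for illustration only).
void example_usage() {
    std::wstring cs = build_message(L"Some value", L"Another value");
    assert(cs == L"The paths are Some value and Another value");
    // std::wstring has no implicit conversion to const wchar_t*: call c_str().
    assert(show_text(cs.c_str()) == cs.size());
}
```

The pointer returned by c_str() stays valid only until the string is modified or destroyed, so pass it straight to the API rather than storing it.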

Mr.C64
  • I'm not familiar with ATL for now. I've also decided to use .NET for future Windows apps. I suppose .NET has its own string routines, doesn't it? – Geradlus_RU Nov 13 '12 at 12:09
  • You don't have to be familiar with ATL to use `CString`. You can just `#include <atlstr.h>` and use `CString` with its convenient features (including loading strings from the application resources). `CString` is better integrated in Win32 programming than `std::wstring`. – Mr.C64 Nov 13 '12 at 12:12

When you're using a WCHAR*, you are invoking undefined behavior because you have a pointer but have not made it point to anything valid. You need to find out how long the resulting string will be and dynamically allocate space for the string. For example:

    WCHAR* nd1 = new WCHAR[lstrlen(L"Some value") + 1]; // +1 for the null terminator
    WCHAR* nd2 = new WCHAR[lstrlen(L"Another value") + 1];

    // Fill the inputs first: their lengths are needed to size cs.
    wcscpy(nd1, L"Some value");
    wcscpy(nd2, L"Another value"); // Actually these vars store the path of the user-selected folder

    // Literal text plus both inputs, +1 for the null terminator.
    int len = lstrlen(L"The paths are  and ") + lstrlen(nd1) + lstrlen(nd2) + 1;
    WCHAR* cs = new WCHAR[len];
    swprintf(cs, len, L"The paths are %s and %s", nd1, nd2); // pass the buffer size explicitly

    delete[] nd1;
    delete[] nd2;
    delete[] cs;

But this is very ugly and error-prone. As noted, you should be using std::wstring instead, something like this:

    std::wstring cs;
    std::wstring nd1;
    std::wstring nd2;

    nd1 = L"Some value";
    nd2 = L"Another value";
    cs = std::wstring(L"The paths are ") + nd1 + L" and " + nd2;
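
If you'd rather keep a printf-style format string, here is a hedged sketch of a helper that grows its buffer until the output fits, so no fixed 1000-character limit is needed. format_wide and example_usage are hypothetical names. The sketch assumes a standard-conforming vswprintf that returns a negative value when the buffer is too small. It uses %ls, the standard spelling for a wide-string argument (%s in the code above relies on Microsoft's wide printf behavior).

```cpp
#include <cassert>
#include <cstdarg>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: printf-style formatting into a std::wstring.
// Retries with a larger buffer whenever vswprintf reports failure.
std::wstring format_wide(const wchar_t* fmt, ...) {
    std::vector<wchar_t> buf(64);
    for (;;) {
        va_list args;
        va_start(args, fmt);
        int written = std::vswprintf(buf.data(), buf.size(), fmt, args);
        va_end(args);
        if (written >= 0) {
            return std::wstring(buf.data(), static_cast<std::size_t>(written));
        }
        // A negative result usually means "buffer too small", but it can also
        // signal a bad format, so cap the growth instead of looping forever.
        if (buf.size() >= (1u << 20)) {
            return std::wstring();
        }
        buf.resize(buf.size() * 2);
    }
}

// Example usage (hypothetical): an input longer than the 1000-char buffer.
void example_usage() {
    std::wstring nd1(1500, L'a');
    std::wstring nd2 = L"Another value";
    std::wstring cs = format_wide(L"The paths are %ls and %ls", nd1.c_str(), nd2.c_str());
    assert(cs.size() == std::wcslen(L"The paths are  and ") + nd1.size() + nd2.size());
}
```

The result is an ordinary std::wstring, so cs.c_str() gives the const wchar_t* that Win32 functions expect.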
Dark Falcon
  • I'd thought about the "ugly way", but because it's really ugly, the second way you've provided looks like El Dorado to me! Thanks! – Geradlus_RU Nov 13 '12 at 12:04

I suggest using the ATL CStringW class instead of raw WCHAR; it's much handier. CString is a wrapper for a dynamically allocated C string. It manages the string length and the allocated memory buffer after each operation, so you don't have to care about them.

Typical usage:

    #include <atlstr.h>

    CStringW s;
    s.Format(L"The paths are %s and %s", L"Some value", L"Another value");
    const WCHAR* wstr = s.GetString(); // To pass to an API that needs const WCHAR*

or

    #include <atlstr.h>

    CStringW s(L"The paths are ");
    s += L"Some value";
    s += L" and ";
    s += L"Another value";
    const WCHAR* wstr = s.GetString(); // To pass to an API that needs const WCHAR*
Rost
  • You can simply pass a `CString` instance to APIs requiring `const WCHAR*`, since `CString` offers a convenient implicit `const wchar_t*` operator for that purpose. – Mr.C64 Nov 13 '12 at 12:14
  • @Mr.C64 Yes, of course, but using explicit `GetString()` is preferable. – Rost Nov 13 '12 at 12:17
  • No, `CString::GetString()` makes sense only in contexts like `swprintf()` with a `printf`-like format string and `%s`. I'd just directly pass a `CString` instance as an argument to a function with a `LPCWSTR` parameter. – Mr.C64 Nov 13 '12 at 12:19
  • First, passing `CString` as a `printf`-like function argument will work too, because `CString` is binary compatible with `char*`/`wchar_t*`. Second, using implicit conversions is dangerous and confusing. Explicit conversions should be used instead. That's why `std::string` has no such `const char*()` operator. – Rost Nov 13 '12 at 12:35
  • The fact that `CString` works with `printf()`-like functions is a kind of "hack"; it's not robust code. Even [MSDN discourages that use and suggests an explicit cast](http://msdn.microsoft.com/en-us/library/awkwbzyc(v=vs.100).aspx#_core_using_cstring_objects_with_variable_argument_functions) (but I find calling `str.GetString()` better than `static_cast(str)`). Moreover, passing `CString` to `const wchar_t*` parameters is _just fine_ (to me, `CString str; ... SetWindowText(hWnd, str);` is OK, but `SetWindowText(hWnd, str.GetString());` is _ugly_ code). – Mr.C64 Nov 13 '12 at 15:09
  • It's not a hack; `CString` is specifically designed to support this. But of course it should not be used this way. The same goes for implicit cast vs `GetString()`. It's always better to explicitly express what you are doing, even if it's a bit uglier. – Rost Nov 13 '12 at 16:24
  • @Mr.C64 Implicit conversions are evil, whether you agree or not. – Rost Nov 13 '12 at 17:41
  • I'm talking specifically about `CString` programming here. I disagree with you on both the direct usage of `CString` in `printf()`-like `%s` function arguments (I prefer `GetString()` here), and on using explicit calls to `GetString()` when passing `CString` instances to `const wchar_t*` Win32 API arguments (I prefer directly passing `CString` instances here). I'm going to stop that discussion; I think I expressed my position clearly. – Mr.C64 Nov 13 '12 at 17:56