
I'm sure this question gets asked a lot but I just want to make sure there's not a better way to do this.

Basically, I have a const char* which points to a null-terminated C string. I have another function which expects a const wchar_t* pointing to a string with the same characters.

For the time being, I have been trying to do it like this:

    size_t newsize = strlen(myCString) + 1;
    wchar_t * wcstring = new wchar_t[newsize];
    size_t convertedChars = 0;

    mbstowcs_s(&convertedChars, wcstring, newsize, myCString, _TRUNCATE);

    delete[] wcstring;

I need to make these conversions in a lot of places since I'm dealing with 3rd party libraries which expect one or the other. Is this the recommended way to go about this?

user974967
  • It strongly depends on the encoding of the `const char*` string (which could be UTF-8, ISO-Latin-1, KOI-8, ...). You have to know that encoding. You may be interested in the libiconv library: http://www.gnu.org/software/libiconv/ – Basile Starynkevitch Oct 18 '12 at 19:10
  • You're essentially on the right track. You can get the correct target size by running `mbstowcs` with a null pointer first. Don't forget to run `std::setlocale(LC_ALL, "");` at the start of your program. – Kerrek SB Oct 18 '12 at 19:15

2 Answers

What you're doing is pretty much the recommended way of doing it, assuming that your data is all ASCII. If you have non-ASCII data in there, you need to know what its encoding is: UTF-8, Windows-1252, any of the ISO 8859 variants, SHIFT-JIS, etc. Each one needs to be converted in a different way.

The only thing I would change would be to use mbstowcs instead of mbstowcs_s. mbstowcs_s is only available on Windows, while mbstowcs is a standard C99 function which is portable. Of course, if you'd like to avoid the CRT deprecation warnings with the Microsoft compiler without completely turning them off, it's perfectly fine to use a macro or an #if test to use mbstowcs on non-Windows systems and mbstowcs_s on Windows systems.
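
One possible shape for such an #if test is sketched below. The wrapper name `to_wide` and its return convention are hypothetical, chosen only for illustration; the sketch assumes `_MSC_VER` is an acceptable way to detect the Microsoft compiler and that truncating over-long input is the desired behavior on both branches.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdlib>

// Hypothetical wrapper (not from the original post). Converts src into dst,
// which has room for dstCount wide characters including the terminator.
// Returns the number of wide characters written, excluding the terminator,
// or (size_t)-1 on failure. Over-long input is truncated on both branches.
std::size_t to_wide(wchar_t* dst, std::size_t dstCount, const char* src) {
    if (dstCount == 0) {
        return static_cast<std::size_t>(-1);
    }
#if defined(_MSC_VER)
    // Microsoft CRT: the _s variant avoids the deprecation warning.
    std::size_t converted = 0;
    errno_t err = mbstowcs_s(&converted, dst, dstCount, src, _TRUNCATE);
    if (err != 0 && err != STRUNCATE) {
        return static_cast<std::size_t>(-1);
    }
    return converted - 1;  // converted counts the terminator
#else
    // Elsewhere: standard mbstowcs, leaving one slot for the terminator.
    std::size_t n = std::mbstowcs(dst, src, dstCount - 1);
    if (n == static_cast<std::size_t>(-1)) {
        return n;  // invalid multibyte sequence in the current locale
    }
    dst[n] = L'\0';  // mbstowcs does not terminate when the buffer is full
    return n;
#endif
}
```

Callers then use `to_wide(buffer, bufferCount, myCString)` everywhere and the platform difference stays in one place.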

You can also use mbstowcs to get the length of the converted string by first passing in NULL for the destination. That way, you can avoid truncation no matter how long the input string is; however, it does involve converting the string twice.
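
A minimal sketch of that two-pass approach follows. The helper name `widen` is hypothetical, and returning a `std::wstring` instead of a raw `new[]` buffer is a design choice of this sketch, not something the question requires.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original post). Converts a multibyte
// string, interpreted in the current C locale's encoding, to std::wstring.
std::wstring widen(const char* src) {
    // Pass 1: a null destination makes mbstowcs return the number of wide
    // characters needed, excluding the terminator.
    std::size_t len = std::mbstowcs(nullptr, src, 0);
    if (len == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("invalid multibyte sequence");
    }

    // Pass 2: convert into a buffer of exactly the right size.
    std::wstring out(len, L'\0');
    std::mbstowcs(&out[0], src, len);
    return out;
}
```

The result's `c_str()` can be passed to any function expecting a `const wchar_t*`, and the buffer is released automatically. If the input may contain non-ASCII characters, the program's locale has to be set first, for example with `std::setlocale(LC_ALL, "")` at startup.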

For non-ASCII conversions, I recommend using libiconv.

dda
Adam Rosenfield
  • Actually it doesn't convert from ASCII -- it converts from whatever format is implied by LC_CTYPE in your current locale, which might be ASCII or UTF-8 or something else. – Chris Dodd Oct 18 '12 at 19:47
  • You might also want to look at mbsrtowcs(3) which is thread safe if you pass a state variable into it. It has a "safer" interface and is available on Windows and Linux. – rholmes Nov 10 '14 at 23:17
  • @ChrisDodd How do I make it convert ASCII to wide characters regardless of the locale, without setting the locale? – demberto Aug 19 '22 at 12:54
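
The mbsrtowcs suggestion from the comments could look roughly like the sketch below. The helper name `widen_restartable` is hypothetical; the point is that the conversion state lives in a local `std::mbstate_t` rather than in hidden static storage, which is what makes it usable from several threads.

```cpp
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original post). Same job as a two-pass
// mbstowcs conversion, but with an explicit, caller-owned conversion state.
std::wstring widen_restartable(const char* src) {
    std::mbstate_t state = std::mbstate_t();  // initial conversion state
    const char* p = src;

    // Pass 1: null destination, so only the required length is computed
    // and p is left unchanged.
    std::size_t len = std::mbsrtowcs(nullptr, &p, 0, &state);
    if (len == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("invalid multibyte sequence");
    }

    // Pass 2: restart from the initial state and convert for real.
    std::wstring out(len, L'\0');
    state = std::mbstate_t();
    p = src;
    std::mbsrtowcs(&out[0], &p, len, &state);
    return out;
}
```

Like mbstowcs, this still interprets the input according to the current locale's LC_CTYPE setting.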

You haven't said what encodings are involved. If you have non-multibyte strings, you can just use this:

    std::string a("hello");
    // Widens each char to wchar_t; only correct for single-byte data such as ASCII.
    std::wstring b(a.begin(), a.end());

    const wchar_t *wcString = b.c_str();
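
If the input is supposed to be pure ASCII and the conversion should not depend on the locale at all, the same idea can be wrapped in a small checked helper. This is only a sketch: the name `widen_ascii` is hypothetical, and rejecting bytes above 0x7F with an exception is an assumption about the desired behavior.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original post). Widens a pure-ASCII
// string without consulting the C locale; throws on any byte above 0x7F.
std::wstring widen_ascii(const std::string& s) {
    std::wstring out;
    out.reserve(s.size());
    for (unsigned char c : s) {  // unsigned avoids sign extension of high bytes
        if (c > 0x7F) {
            throw std::invalid_argument("non-ASCII byte in input");
        }
        out.push_back(static_cast<wchar_t>(c));
    }
    return out;
}
```

Iterating as `unsigned char` matters because plain `char` is signed on many platforms, so a high byte would otherwise widen to a negative value instead of being detected.
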
Roddy