How do I convert a string in UTF-8 char* to CString?

user541686
Greenhorn

3 Answers

bool Utf8ToCString( CString& cstr, const char* utf8Str )
{
    size_t utf8StrLen = strlen(utf8Str);

    if( utf8StrLen == 0 )
    {
        cstr.Empty();
        return true;
    }

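    // GetBuffer returns LPTSTR (a TCHAR*), not LPTSTR*.
    // One TCHAR per UTF-8 byte, plus the terminator, is enough room for the result.
    LPTSTR ptr = cstr.GetBuffer((int)utf8StrLen + 1);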

#ifdef UNICODE
    // In a UNICODE build CString holds UTF-16, so decode straight into its buffer
    int newLen = MultiByteToWideChar(
                     CP_UTF8,  0,
                     utf8Str, (int)utf8StrLen,  ptr, (int)utf8StrLen + 1
                     );
    if( !newLen )
    {
        cstr.ReleaseBuffer(0);
        return false;
    }
#else
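    // In a non-UNICODE build, go UTF-8 -> UTF-16 -> ANSI through a temporary buffer.
    // malloc takes a size in bytes, so scale the element count by sizeof(WCHAR).
    WCHAR* buf = (WCHAR*)malloc(utf8StrLen * sizeof(WCHAR));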

    if( buf == NULL )
    {
        cstr.ReleaseBuffer(0);
        return false;
    }

    int newLen = MultiByteToWideChar(
                     CP_UTF8,  0,
                     utf8Str, (int)utf8StrLen,  buf, (int)utf8StrLen
                     );
    if( !newLen )
    {
        free(buf);
        cstr.ReleaseBuffer(0);
        return false;
    }

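    // Pure ASCII input yields exactly one WCHAR per byte, so equality is possible
    assert( newLen <= (int)utf8StrLen );
    // WideCharToMultiByte takes eight arguments; the last two (default char and
    // used-default flag) are left NULL here
    newLen = WideCharToMultiByte(
                     CP_ACP,  0,
                     buf, newLen,  ptr, (int)utf8StrLen,
                     NULL, NULL
                     );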
    if( !newLen )
    {
        free(buf);
        cstr.ReleaseBuffer(0);
        return false;
    }

    free(buf);
#endif

    cstr.ReleaseBuffer(newLen);
    return true;
}

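This function works in both UNICODE and non-UNICODE builds, but IMHO a UNICODE build is the more productive choice for Win32 programs, both in general and for this function: the non-UNICODE branch needs a second conversion through CP_ACP, which cannot represent every character that UTF-8 can.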

Serge Dundich

Call MultiByteToWideChar with a code page of CP_UTF8, then use CString as normal.

ildjarn
  • I want it to be converted to char*, not wchar_t*. Is there any method? – Greenhorn Apr 15 '11 at 06:49
  • @Athreya : Why on earth would you want that? That conversion is bound to be lossy -- if the string was Unicode to begin with, what makes you think it contains exclusively ANSI characters? – ildjarn Apr 15 '11 at 06:53
  • I need to parse and execute the statement using the OCI library, which only accepts char* as input. – Greenhorn Apr 15 '11 at 06:54
  • @Athreya : Isn't `char*` what you already have? In any case, I'm not aware of any "MultiByteToMultiByte" sort of function, so I think you'll have to round-trip it -- call `MultiByteToWideChar` with `CP_UTF8` then `WideCharToMultiByte` with whatever code page you want the resulting `char*` in. – ildjarn Apr 15 '11 at 06:56
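
The round trip described in the last comment could look roughly like the sketch below. It is only an illustration under a few assumptions: `Utf8ToAnsi` is a made-up name, the target is fixed to the system ANSI code page (`CP_ACP`), and `std::string` is used for the result so that the `char*` can be taken with `c_str()`. Each API is called twice, first with no output buffer to ask for the required size, then again to do the conversion.

```cpp
#ifdef _WIN32  // Win32-only sketch: the conversion functions come from <windows.h>
#include <windows.h>
#include <string>

// Hypothetical helper (not part of MFC or Win32): re-encodes UTF-8 text into
// the system ANSI code page. Returns false if either conversion step fails.
// If 'lossy' is non-null, it reports whether some character had no ANSI
// representation and was replaced by the code page's default character.
bool Utf8ToAnsi(const char* utf8, std::string& out, bool* lossy = nullptr)
{
    out.clear();
    if (lossy) *lossy = false;
    if (utf8 == nullptr) return false;
    if (*utf8 == '\0') return true;

    // Step 1: UTF-8 -> UTF-16. A length of -1 means "null-terminated input",
    // and the returned count then includes the terminating null.
    int wideLen = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                      utf8, -1, nullptr, 0);
    if (wideLen <= 0) return false;
    std::wstring wide(wideLen, L'\0');
    if (!MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                             utf8, -1, &wide[0], wideLen))
        return false;

    // Step 2: UTF-16 -> ANSI, again sizing first and converting second.
    int ansiLen = WideCharToMultiByte(CP_ACP, 0, wide.c_str(), -1,
                                      nullptr, 0, nullptr, nullptr);
    if (ansiLen <= 0) return false;
    std::string ansi(ansiLen, '\0');
    BOOL usedDefault = FALSE;
    if (!WideCharToMultiByte(CP_ACP, 0, wide.c_str(), -1,
                             &ansi[0], ansiLen, nullptr, &usedDefault))
        return false;

    ansi.resize(ansiLen - 1);  // drop the terminating null that was counted
    out.swap(ansi);
    if (lossy) *lossy = (usedDefault != FALSE);
    return true;
}
#endif
```

If `lossy` comes back true, the ANSI string no longer matches the original text, which is the data loss warned about above, so it is probably worth checking before handing the result to a `char*`-only API.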

If your string contains only ASCII characters (codes 0 to 127), you may treat your UTF-8 string as an ASCII string and initialise a CString with it:

CString my_cstr((char*)my_string);

Otherwise (if your UTF-8 string contains any non-ASCII characters), there is no easy way to get an ANSI char* string from it.
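
To check the ASCII-only precondition before taking that shortcut, something like the following should do. This is a minimal sketch; `IsPureAscii` is a name invented here, not an MFC or Win32 function, and it assumes a null-terminated input string.

```cpp
// Hypothetical helper: returns true when every byte of the null-terminated
// string is in the ASCII range 0..127. In UTF-8, any non-ASCII character is
// encoded using bytes with the high bit set, so a single byte above 127
// means the string is not plain ASCII.
bool IsPureAscii(const char* s)
{
    if (s == nullptr) return false;
    for (; *s != '\0'; ++s) {
        // Cast first: plain char may be signed, which would make
        // high-bit bytes compare as negative values.
        if (static_cast<unsigned char>(*s) > 127) return false;
    }
    return true;
}
```

When this returns true for `my_string`, the direct `CString` construction shown above keeps every character intact; when it returns false, a real conversion is needed.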

Jurlie