How do I convert a string in UTF-8 char* to CString?
3 Answers
5
bool Utf8ToCString( CString& cstr, const char* utf8Str )
{
    // The Win32 conversion functions take int lengths
    int utf8StrLen = (int)strlen(utf8Str);
    if( utf8StrLen == 0 )
    {
        cstr.Empty();
        return true;
    }
    // GetBuffer returns a writable TCHAR buffer (LPTSTR, not LPTSTR*)
    LPTSTR ptr = cstr.GetBuffer(utf8StrLen+1);
#ifdef UNICODE
    // CString is UNICODE string so we decode
    int newLen = MultiByteToWideChar(
        CP_UTF8, 0,
        utf8Str, utf8StrLen, ptr, utf8StrLen+1
    );
    if( !newLen )
    {
        cstr.ReleaseBuffer(0);
        return false;
    }
#else
    // CString is an 8-bit string: decode to UTF-16 first,
    // then encode to the current ANSI code page.
    // The temporary buffer is sized in WCHARs, not bytes.
    WCHAR* buf = (WCHAR*)malloc(utf8StrLen * sizeof(WCHAR));
    if( buf == NULL )
    {
        cstr.ReleaseBuffer(0);
        return false;
    }
    int newLen = MultiByteToWideChar(
        CP_UTF8, 0,
        utf8Str, utf8StrLen, buf, utf8StrLen
    );
    if( !newLen )
    {
        free(buf);
        cstr.ReleaseBuffer(0);
        return false;
    }
    // UTF-16 never needs more code units than UTF-8 needs bytes
    assert( newLen <= utf8StrLen );
    newLen = WideCharToMultiByte(
        CP_ACP, 0,
        buf, newLen, ptr, utf8StrLen,
        NULL, NULL
    );
    if( !newLen )
    {
        free(buf);
        cstr.ReleaseBuffer(0);
        return false;
    }
    free(buf);
#endif
    cstr.ReleaseBuffer(newLen);
    return true;
}
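Though this function is valid for both UNICODE and non-UNICODE configurations, IMHO using the UNICODE configuration in Win32 programs is much more productive (in general and in this function). In a non-UNICODE build, any character that the current ANSI code page cannot represent is replaced by that code page's default character, so the conversion can be lossy.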

Serge Dundich
-
this is not a solution. The solution is to figure out the target single-byte code page and to convert UTF-8 string to single-byte string of that CP. – Jurlie Apr 15 '11 at 07:13
-
@Jurlie: read the comment before memcpy. Though maybe I'll post an implementation now. – Serge Dundich Apr 15 '11 at 07:16
-
Edited the post to include UTF8 to current 8-bit code page conversion. – Serge Dundich Apr 15 '11 at 07:28
4
Call `MultiByteToWideChar` with a code page of `CP_UTF8`, then use `CString` as normal.

ildjarn
-
I want it to be converted to char* not wchar_t *.. Is there any method? – Greenhorn Apr 15 '11 at 06:49
-
@Athreya : Why on earth would you want that? That conversion is bound to be lossy -- if the string was Unicode to begin with, what makes you think it contains exclusively ANSI characters? – ildjarn Apr 15 '11 at 06:53
-
I need to parse and execute the statement using OCI library which only accepts char* as input – Greenhorn Apr 15 '11 at 06:54
-
@Athreya : Isn't `char*` what you already have? In any case, I'm not aware of any "MultiByteToMultiByte" sort of function, so I think you'll have to round-trip it -- call `MultiByteToWideChar` with `CP_UTF8` then `WideCharToMultiByte` with whatever code page you want the resulting `char*` in. – ildjarn Apr 15 '11 at 06:56
0
If your string contains only ASCII characters (codes 0 to 127), you may treat your UTF-8 string as an ASCII string and initialise a CString with it:
CString my_cstr((char*)my_string);
Otherwise (if your UTF-8 string contains some other characters) you have no easy way to get char* string from it.
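Whether a given string falls into the first case can be checked before constructing the CString. This is a small sketch; `IsPureAscii` is a hypothetical helper name, not part of MFC or Win32.

```cpp
#include <cassert>

// Hypothetical helper: true when every byte of the NUL-terminated string is
// in the range 0..127, i.e. the UTF-8 text is also plain ASCII.
bool IsPureAscii(const char* s)
{
    assert(s != nullptr);
    for (; *s != '\0'; ++s)
    {
        // Any byte with the high bit set belongs to a multi-byte UTF-8 sequence.
        if (static_cast<unsigned char>(*s) > 0x7F)
            return false;
    }
    return true;
}
```

If it returns false, the string needs a real conversion rather than a direct CString construction.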

Jurlie
-
-
@Athreya what is the code page you want to convert your string to? Or what is the language at least? Are you sure your UTF-8 string could be represented as a single-byte string at all? – Jurlie Apr 15 '11 at 07:15
-
@Athreya : Jurlie means *losslessly* represented as a single-byte string. And your first comment on this answer indicates that this isn't the case. – ildjarn Apr 15 '11 at 07:23
-
I meant the language of string to convert, not a programming language :-D – Jurlie Apr 15 '11 at 07:24