This code sets the value of an edit box, but I'm having trouble when I retrieve Unicode characters from a MySQL table.

For example, the string nüşabə is displayed as nüşabÉ™.

Here is my code.

void CmysqlDlg::OnBnClickedButton3()
{
    USES_CONVERSION;

    try
    {
        mysql::MySQL_Driver *driver = new mysql::MySQL_Driver;
        Connection *dbConn;
        Statement *st;
        ResultSet *res;

        driver = mysql::get_mysql_driver_instance();
        dbConn = driver->connect("tcp://127.0.0.1:3306", "root", "connection");
        dbConn->setSchema("mfc_app_database");

        st = dbConn->createStatement();
        res = st->executeQuery("SELECT password FROM users WHERE id=1");
        string z;
        while (res->next())
        {
            //k = res->getString("username");
            //cs.Format(_T("%s"), k);
            //CString cs(k.c_str(), CP_UTF8);
            //combo.AddString(cs);
            //usernameData.SetWindowTextW(cs);

            z = res->getString("password");
            CString pass(z.c_str()/*, CP_UTF8*/);
            nameData.SetWindowTextW(pass);
        }


        delete res;
        delete st;
        delete dbConn;
        delete driver;
    }
    catch (exception e)
    {
        ofstream file("sadaasad.txt");
        file << e.what();
        file.close();
    }
}

Database collation is set to utf8_general_ci. Actually I don't know what I should do... Brain stopped...

Please help. Thanks.

Mirjalal
  • I hope you know you're [slicing the caught exception](http://coliru.stacked-crooked.com/a/ec42c461ba35b35b). – chris Mar 13 '15 at 19:25
  • @chris thanks for your quick response. But I couldn't understand what you mean. Could you explain please? – Mirjalal Mar 13 '15 at 19:28
  • Catch the exception as `catch (exception const & e)` . – Richard Critten Mar 13 '15 at 19:33
  • @RichardCritten I tried it, but no exception is thrown. The program runs without crashing. – Mirjalal Mar 13 '15 at 19:35
  • @coni, It was just one problem with the code. Your method of catching exceptions won't respect the overriding done by any derived exception classes. – chris Mar 13 '15 at 19:46
  • @chris, I tried writing the variables' values to a `.txt` file (`std::ofstream` for the `std::string` and `CFile` for the `CString`). When I wrote `k` to the file, everything was OK: it contained the expected value (`nüşabə`). But when I wrote the `CString pass` value, the characters changed to `nüşabÉ™`. I can't understand why this happens. If you can, please explain. Thanks. – Mirjalal Mar 13 '15 at 20:06
  • Couldn't this help ? http://stackoverflow.com/questions/5673194/conversion-of-utf-8-char-to-cstring – Christophe Mar 13 '15 at 20:57
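
The slicing issue raised in the comments above can be illustrated with a small portable sketch. `QueryError` and the two helper functions are hypothetical names made up for this illustration; they are not part of MySQL Connector/C++ or MFC. The sketch only shows that catching by value loses the derived class's `what()` override, while catching by const reference keeps it.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Hypothetical exception type, used only for this illustration.
struct QueryError : std::exception {
    const char* what() const noexcept override { return "query failed"; }
};

// Catching by value copies only the std::exception base part (slicing),
// so the QueryError override of what() is no longer reachable.
std::string messageCaughtByValue() {
    try {
        throw QueryError{};
    } catch (std::exception e) {
        return e.what();   // generic, implementation-defined text
    }
}

// Catching by const reference keeps the dynamic type intact.
std::string messageCaughtByReference() {
    try {
        throw QueryError{};
    } catch (const std::exception& e) {
        return e.what();   // the QueryError message
    }
}
```

This does not affect the encoding problem itself; it only determines which message would end up in the log file if the driver did throw.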

1 Answer

If you compile your MFC project with UNICODE defined, CString is a string of wchar_t holding UTF-16 encoded text.

Constructing the CString directly from a char*, as you do, works only if all characters are in the ASCII subset of Unicode:

  • A non-ASCII character is encoded in UTF-8 as several bytes, but the CString constructor converts the bytes using the system ANSI code page rather than UTF-8, so each byte typically becomes a separate character.
  • This is the case for nüşabə: ü, ş, and ə each need 2 bytes in UTF-8, which makes your CString 3 characters longer than expected.
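
The length mismatch described above can be checked with a small portable sketch. The helper names `countCodePoints` and `widenBytewise` are hypothetical, and the byte-wise widening is only a stand-in for an ANSI code page conversion (it is not what CString does internally, just the same "one byte, one character" effect).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes (those of the form 10xxxxxx). Assumes well-formed input.
std::size_t countCodePoints(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Treats every byte as its own character (Latin-1 style widening),
// as a stand-in for a conversion that ignores UTF-8.
std::u16string widenBytewise(const std::string& bytes) {
    std::u16string out;
    for (unsigned char c : bytes) out.push_back(static_cast<char16_t>(c));
    return out;
}

// "nüşabə" spelled out as UTF-8 bytes; the literal is split so that the
// hex escape \x9F does not swallow the following "ab".
const std::string kName = "n\xC3\xBC\xC5\x9F" "ab\xC9\x99";
```

Here `kName.size()` is 9 bytes while `countCodePoints(kName)` is 6, and the byte-wise widening produces 9 characters, which is the "3 characters too long" effect.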

So when you have a UTF-8 encoded string in a char*, you need to convert it explicitly with MultiByteToWideChar(), as explained in this SO answer.

Edit: Code example

Instead of

        CString pass(z.c_str());

You could write something like:

        // UTF-16 never needs more code units than the UTF-8 input has bytes
        wchar_t *p = new wchar_t[z.size()+1];
        MultiByteToWideChar(
             CP_UTF8,         // code page of the source string
             0,               // flags
             z.c_str(),       // pointer to the UTF-8 string
             -1,              // -1 for a null-terminated string, byte count otherwise
             p,               // destination buffer for the converted wchar_t string
             static_cast<int>(z.size()+1));  // size of the buffer, in wchar_t units
        CString pass(p);
        delete[] p;           // array form, to match new[]

Note that MultiByteToWideChar() and its reverse WideCharToMultiByte() belong to the Windows API and not to MFC.

Standard C++ also offers portable conversion facilities (available since C++11; note that they were deprecated in C++17):

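        // requires <codecvt>, <locale> and <string>; all names are in namespace std
        wstring_convert<codecvt_utf8_utf16<wchar_t>, wchar_t> conversion;
        wstring s = conversion.from_bytes(z.c_str());        // UTF-8 to UTF-16
        string mbs = conversion.to_bytes(L"\u00c6\u0186");   // UTF-16 to UTF-8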
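
For toolchains where `<codecvt>` is unavailable or undesirable, the same conversion can be done by hand. The following is only a minimal sketch: `utf8ToUtf16` is a hypothetical helper name, it assumes well-formed UTF-8 input, and it performs no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decodes UTF-8 into UTF-16 code units.
// Assumes the input is well-formed UTF-8; no validation is performed.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte determines the sequence length and its payload bits.
        if (b0 < 0x80)      { cp = b0;        len = 1; }
        else if (b0 < 0xE0) { cp = b0 & 0x1F; len = 2; }
        else if (b0 < 0xF0) { cp = b0 & 0x0F; len = 3; }
        else                { cp = b0 & 0x07; len = 4; }
        // Fold in the low 6 bits of each continuation byte.
        for (std::size_t k = 1; k < len && i + k < in.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits wide, the resulting code units could be copied into a std::wstring and then passed to CString. For production code, a validated conversion such as MultiByteToWideChar() is probably the safer choice.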
Christophe