1

I have a UNICODE string (chinese) which I want to convert back to MBCS so as to add it as a parameter to an SQL query. (the column in SQL Server in varchar and so this conversion is necessary for me).

How can I convert to MBCS in c#? Please help.

Thanks, Praseo

praseo
  • 21
  • 2
  • 7
  • "MBCS" as in "multi-byte character set"? Forgive me if I'm missing something specific to SQL Server, but UTF-8 (assuming "Unicode string" means UTF-8) *is* a (VL)MBCS. What other MBCS do you want to convert it to? – deceze Oct 17 '11 at 04:06
  • 1
    What MBCS do you need? And what form do you need it in? As a `byte[]`? Doesn't SQL server take care of the conversion? – svick Oct 17 '11 at 07:08
  • 1
    I didn't think SQL Server supported Unicode in a varchar column. (You're supposed to use nvarchar, and even then, it's UCS-2.) The varchar columns use a "code page" which varies, and depends on the way the table was set up. Essentially the same problem as "ANSI" and "Unicode" from Win32, but in database land. – Thanatos Oct 17 '11 at 07:33

3 Answers3

2

“MBCS” could be a number of encodings. For the China locale, it would be code page 936, a GB-variant, and you could encode to its bytes using:

byte[] bytes= Encoding.GetEncoding(936).GetBytes("你好")
=>
{196, 227, 186, 195}

If you aren't specifically talking about GB encoding and want whatever multibyte encoding is the default for your current system, then you can just use the Default (“ANSI”) encoding which will be that MBCS encoding:

byte[] bytes= Encoding.Default.GetBytes("你好")
=>
{196, 227, 186, 195}  // on a system in the China locale (cp936)
{167, 65, 166, 110}   // on a system in the HK locale (cp950)

Now to get your byte array into the query you'll have to either:

1) best, use a parameter with a byte-based SqlDbType, eg:

command.Parameters.AddWithValue("@greeting", bytes);

2) orif you have to insert it directly into the query string, encode the bytes as a hex literal, eg:

"0x"+BitConverter.ToString(bytes).Replace("-", "")
=>
... WHERE greeting=0xC4E3BAC3 ...

(this is non-ANSI-standard SQL Server syntax)

bobince
  • 528,062
  • 107
  • 651
  • 834
  • Trying this out. Facing problems for other non varchar parameters. Results in the SqlDataReader is getting stored in MBCS for non varchar properties also. Something wrong somewhere. :( – praseo Oct 21 '11 at 09:40
0

Thanks bobince for the hint. I could not parameterize my query execution though.

Finally, one of my senior colleagues helped. :) Here is another solution to convert a Unicode encoded string to one consisting of Multibyte characters(MBCS) for comparison in T-SQL against varchar columns in legacy database.

We can make use of the method GetNonUnicodeBytes() of the class SqlString.

Here is a code snippet below.

        string inputUnicodeString = "你好";
        SqlString mySqlString = new SqlString(inputUnicodeString);
        byte[] mbcsBytes = mySqlString.GetNonUnicodeBytes();
        string outputMbcsString = string.Empty;
        for (int index = 0; index < mbcsBytes.Length; index++)
        {
            outputMbcsString += Convert.ToChar(mbcsBytes[index]);
        }

This helped me compare the required varchar database column with user requested string in UNICODE. Hope this reply helps others.

Regards,

Praseo

praseo
  • 21
  • 2
  • 7
-1

I'm not sure whether you want this:

string result = Encoding.GetEncoding("gb2312").GetString(Encoding.Unicode.GetBytes(yourStr));
ojlovecd
  • 4,812
  • 1
  • 20
  • 22
  • So you convert a string into its UTF-16 representation and then try to decode that as a gb2312 string? That doesn't make any sense. – svick Oct 17 '11 at 07:07