I was using CHAR(code_point USING ucs2) to convert a Unicode code point to a UTF-8 character, but it gives unexpected results for code points above 0x00FF. It returns the character Ā (code point 0x0100) for every code point from 0x0100 to 0x01FF, the character Ȁ (code point 0x0200) for every code point from 0x0200 to 0x02FF, and so on.

So if I execute this query:

SET NAMES utf8;
SELECT CHAR(0x0100 USING ucs2),CHAR(0x0101 USING ucs2),CHAR(0x0200 USING ucs2),CHAR(0x0201 USING ucs2);

it gives me the result:

| Ā | Ā | Ȁ | Ȁ |

whereas the expected result is:

| Ā | ā | Ȁ | ȁ |

Please help me understand the problem, or suggest another way of doing this.

Thanks in advance.

Adee
  • To be exact, I am writing a user defined function in which I have to convert a SMALLINT to character, SMALLINT being the code point. – Adee Feb 12 '13 at 12:11

1 Answer

I got it working by building the bytes with CHAR() first and then converting the result to ucs2:

CONVERT(CHAR(code_point) USING ucs2);

I need to mix these characters with utf8 strings, so I convert the result to utf8 as well:

CONVERT(CONVERT(CHAR(code_point) USING ucs2) USING utf8);
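
For the user-defined function mentioned in the comment on the question, the expression above could be wrapped roughly as follows. This is only a sketch: the function name code_point_to_utf8 is made up for illustration, and the parameter is declared SMALLINT UNSIGNED (an assumption) so that code points up to 0xFFFF fit. Note that CHAR() produces a single byte for code points below 256, so that range is worth testing separately.

```sql
-- Hypothetical helper: the name and the UNSIGNED parameter type are assumptions.
-- The body is a single RETURN statement, so no DELIMITER change is needed.
CREATE FUNCTION code_point_to_utf8(code_point SMALLINT UNSIGNED)
RETURNS VARCHAR(1) CHARACTER SET utf8
DETERMINISTIC
RETURN CONVERT(CONVERT(CHAR(code_point) USING ucs2) USING utf8);

-- Check the four code points from the question (0x0100, 0x0101, 0x0200, 0x0201).
-- HEX() shows the resulting bytes, which does not depend on how the client renders them.
SELECT HEX(code_point_to_utf8(256)) AS cp_0100,
       HEX(code_point_to_utf8(257)) AS cp_0101,
       HEX(code_point_to_utf8(512)) AS cp_0200,
       HEX(code_point_to_utf8(513)) AS cp_0201;
```

The UTF-8 encodings of Ā, ā, Ȁ and ȁ are C480, C481, C880 and C881, so those are the values to compare the four columns against.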
Adee