I am attempting to store Unicode characters in UTF8 format on a DB2 database. I have confirmed that the charset is 1208 and that the database is specified to hold UTF8.
I am, however, getting odd results when querying some Unicode data.
select hex(firstname), firstname from my_schema.my_table where my_pk = 1234;
The results are as below:
C383C289 Ã
The character in the result is displayed incorrectly. From what I gather, it's being represented by the hex values "C383C289". The actual character sent on the insert was É and should be represented in UTF8 as C389.
At this stage I'm assuming that it could be the program that I am using to query the data that is interpreting it wrong. But to what extent are the hex values (first result column) wrong? They seem to have unused fluff "83C2" between the actual bytes. Or, is "C383C289" actually correct, and some UTF8 decoding engines can't handle the fluff? This seems unlikely to me.
The clients (DB2 For Toad and WinSQL) both display the character as an Ã, which is represented in UTF8 as C383.
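For reference, the bytes C383C289 can be reproduced outside DB2. The Python sketch below has nothing to do with my actual setup; it only assumes, as a guess, that somewhere on the insert path the UTF8 bytes of É were read as ISO-8859-1 (Latin-1) characters and then encoded to UTF8 a second time. The helper name to_hex is made up for the illustration.

```python
# Sketch only: reproduces the observed bytes under the ASSUMPTION that the
# UTF-8 bytes were misread as ISO-8859-1 (Latin-1) and encoded to UTF-8 again.
original = "É"

utf8_bytes = original.encode("utf-8")      # bytes that should have been stored
misread = utf8_bytes.decode("latin-1")     # each byte treated as one Latin-1 character
double_encoded = misread.encode("utf-8")   # those characters encoded to UTF-8 again


def to_hex(data: bytes) -> str:
    """Hypothetical helper: uppercase hex, comparable to the output of hex()."""
    return data.hex().upper()


print(to_hex(utf8_bytes))       # C389
print(to_hex(double_encoded))   # C383C289

# Reversing the two steps recovers the original character.
recovered = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")
print(recovered == original)    # True
```

If that assumption holds, "83C2" is not fluff: C383 is the UTF8 form of Ã (U+00C3) and C289 is the UTF8 form of the control character U+0089, i.e. the two original bytes C3 and 89 each encoded as a separate character.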
*Edit: I tested on the CLI and it correctly returns the É character. Am I missing something? Is the "hex" function returning something that it shouldn't be?