0

How can I save unicode characters in a Oracle database, configured in a non-unicode charset (this configurations is impossible to change)?

The text will be saved in VARCHAR2 columns (it can't be NVARCHAR2).

John Assymptoth
  • 8,227
  • 12
  • 49
  • 68

1 Answers1

2

There are a number of ways you can store arbitrary data (in this case UTF-encoded) in a VARCHAR2 field:

  • You could convert the Unicode characters to their U+XXXX form or their &#YYYY form. Only some of the characters would need a conversion.
  • You could store the binary data rawtohex-encoded.
  • You could also use standard functions such as UTL_ENCODE.mimeheader_encode:

    /* needs to be checked on a non-unicode db */ 
    SQL> declare
      2     l nvarchar2(200);
      3     p nvarchar2(200);
      4  begin
      5     l := UTL_ENCODE.MIMEHEADER_ENCODE (
      6        buf            => nchr(352),--'Š',
      7        encode_charset => 'UTF8',
      8        encoding       => UTL_ENCODE.QUOTED_PRINTABLE
      9     );
     10     dbms_output.put_line('encoded string: ' || l);
     11     p := utl_encode.mimeheader_decode (
     12        buf => l
     13     );
     14     dbms_output.put_line('decoded string: ' || p);
     15  end;
     16  /
    
    encoded string: =?UTF8?Q?=C5=A0?=
    decoded string: Š
    
    Statement processed.
    
Vincent Malgrat
  • 66,725
  • 9
  • 119
  • 171
  • I don't think `utl_encode.mimeheader_encode` would work if the database character set didn't support the character you were trying to encode. I would expect character set conversion to have converted the character to a replacement character that the database character set does support before the function could be called. – Justin Cave Sep 05 '12 at 14:47
  • @JustinCave: does this function work with an NVARCHAR2 in a non-unicode database (I don't have access to such a db right now)? – Vincent Malgrat Sep 05 '12 at 14:58
  • I don't believe that there is an `NVARCHAR2` overload. You could potentially pass in a character set with your `varchar2` but I'm not sure that would work from a client application-- I would expect character set conversion to convert the character set first. – Justin Cave Sep 05 '12 at 15:07