Using utl_file on an Oracle database to output Welsh characters

Question

Can someone help here?

Our database character set (NLS_CHARACTERSET) is CEL8ISO8859P14.

I am trying to output a street name using utl_file.put_line, as follows:

B4366 O DAFARN GORS BACH I GYLCHFAN GROESLON TŶ MAWR

That is what is held on the database.

Both utl_file.put_line and UTL_FILE.PUT_LINE_NCHAR output

B4366 O DAFARN GORS BACH I GYLCHFAN GROESLON TÞ MAWR

Note the difference: the Ŷ in TŶ has become Þ.

I have tried opening the output file using both UTL_FILE.FOPEN_NCHAR and UTL_FILE.FOPEN.

Has anyone got any idea as to what I am doing wrong?

How are you determining what UTL_FILE is outputting? Does the environment you're viewing the file in have the same characterset as your database? — Ben, Sep 15 '16 at 12:20
Thanks for the reply. We view the output using Excel or Notepad++. When the file is dumped byte by byte, you can see that the “Ŷ” character occupies one byte, with an octal value of 336 (that’s 0xDE in hexadecimal and 222 in decimal). That is the position of Ŷ in ISO 8859-14.
$ od -c ISO-8859-14.txt
0000000 1 5 , " I " , 4 1 4 3 3 4 , 4 6
0000020 4 0 0 1 5 2 , " B 4 3 6 6 O
0000040 D A F A R N G O R S B A C H
0000060 I G Y L C H F A N G R O E
0000100 S L O N T 336 M A W R " , " "
0000120 , " L L A N D D E I N I O L E N
0000140 " , " G W Y N E D D " , " C Y M
0000162
— Jonathan, Sep 16 '16 at 14:21
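The byte value in the dump above can be checked with a short Python sketch. It decodes the single byte 0xDE under ISO 8859-14 and, for comparison, under Latin-1 and Windows-1252. That the viewer falls back to one of those two encodings is only an assumption; nothing in the thread confirms which encoding Excel or Notepad++ actually picked.

```python
import unicodedata

# The single byte seen in the od dump: octal 336 = hex DE = decimal 222.
raw = bytes([0o336])

# Decode the same byte under three single-byte encodings.
as_iso_8859_14 = raw.decode("iso8859_14")  # encoding matching CEL8ISO8859P14
as_latin_1 = raw.decode("latin-1")         # assumed viewer fallback (hypothetical)
as_cp1252 = raw.decode("cp1252")           # assumed viewer fallback (hypothetical)

# Print ASCII-only descriptions so the output is safe on any console.
for label, ch in [("iso8859_14", as_iso_8859_14),
                  ("latin-1", as_latin_1),
                  ("cp1252", as_cp1252)]:
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}")
```

If that assumption holds, the file content would be correct ISO 8859-14 and only the interpretation on the viewing side would differ.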
By contrast, when the file is properly encoded in UTF-8, the character is represented by two bytes: (305, 266) in octal. That’s (C5, B6) in hex, which is how Unicode character U+0176 (Ŷ) is represented.
$ od -c UTF-8.txt
0000000 1 5 , " I " , 4 1 4 3 3 4 , 4 6
0000020 4 0 0 1 5 2 , " B 4 3 6 6 O
0000040 D A F A R N G O R S B A C H
0000060 I G Y L C H F A N G R O E
0000100 S L O N T 305 266 M A W R " , "
0000120 " , " L L A N D D E I N I O L E
0000140 N " , " G W Y N E D D " , " C Y
0000160 M " \r \n
0000164
— Jonathan, Sep 16 '16 at 14:25
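The two-byte UTF-8 form can be reproduced with a minimal Python sketch. It encodes Ŷ (code point U+0176) in UTF-8 and in ISO 8859-14 and prints the bytes in octal and hex so they can be compared with the od output. The variable names are illustrative only.

```python
# Y with circumflex, written as an escape so the source stays ASCII-only.
ch = "\u0176"

utf8_bytes = ch.encode("utf-8")      # multi-byte Unicode encoding
iso_bytes = ch.encode("iso8859_14")  # single-byte Celtic (Latin-8) encoding

# Octal values are what `od -c` shows for non-printable bytes.
utf8_octal = [format(b, "o") for b in utf8_bytes]
iso_octal = [format(b, "o") for b in iso_bytes]

print("code point:", f"U+{ord(ch):04X}")
print("UTF-8     :", utf8_octal, utf8_bytes.hex())
print("ISO8859-14:", iso_octal, iso_bytes.hex())
```

The same text therefore takes one byte per Ŷ in an ISO 8859-14 file and two bytes in a UTF-8 file, which matches the difference in the final offsets of the two dumps.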

0 Answers