3

I used Perl to import a table from our AS/400 DB2 database.

The problem is that the strings are encoded in EBCDIC Latin-1 (Italian language).

How can I convert the resulting file to plain utf-8 in Linux bash?

luca76

3 Answers

5

Start with

iconv -f EBCDIC-IT -t utf-8 <filename>

then check the output, and if it isn't exactly correct, check man iconv and the available encodings listed by iconv -l.

(Note that "EBCDIC Latin-1" is somewhat strange. "Latin-1" indicates ISO-8859-1, while "EBCDIC" is something else entirely. Try file <filename> to get an educated guess by the computer as to what encoding you are actually looking at.)
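If the first attempt looks wrong, one way to compare candidates is to convert the start of the file with several encodings and read the results side by side. The sketch below is only an illustration: the function name `try_encodings` is made up, and the candidate list (EBCDIC-IT, IBM280, IBM1144, IBM037) is an assumption based on names that glibc's iconv usually lists, so confirm them with iconv -l on your system first.

```shell
# try_encodings FILE: hypothetical helper, not part of iconv.
# Converts the first 200 bytes of FILE with each candidate encoding
# and prints one line per candidate so the results can be compared by eye.
try_encodings() {
    # Candidate names are assumptions; check them with: iconv -l
    for enc in EBCDIC-IT IBM280 IBM1144 IBM037; do
        if out=$(head -c 200 "$1" | iconv -f "$enc" -t UTF-8 2>/dev/null); then
            printf '%-10s %s\n' "$enc" "$out"
        else
            printf '%-10s (unsupported here, or conversion failed)\n' "$enc"
        fi
    done
}
```

Calling `try_encodings export.dat` (with your own file name in place of `export.dat`) prints one line per encoding; the right one is usually the line where the accented Italian letters look correct.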

DevSolar
2

I had good luck with the following line:

iconv -f IBM037 -t utf-8 input_ebcdic.txt -o output.txt
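A rough way to judge whether the chosen code page suits Italian data is to count the lines of the converted file that contain the usual accented letters. A wrong EBCDIC code page typically still converts without errors, but shows odd symbols where those letters should be. The helper name `count_accented_lines` is hypothetical, and the check is only a heuristic, not proof that the encoding is right.

```shell
# count_accented_lines FILE: hypothetical helper.
# Prints how many lines of a UTF-8 file contain at least one of the
# common Italian accented lowercase letters.
count_accented_lines() {
    grep -c -e 'à' -e 'è' -e 'é' -e 'ì' -e 'ò' -e 'ù' "$1"
}
```

For example, `count_accented_lines output.txt` printing 0 for a table full of Italian text would suggest trying a different source encoding.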
JayBee
0

It's simple with iconv.

iconv -f ISO8859-1   -t "UTF-8" result.csv -o new_result.csv

ISO8859-1 is the Latin-1 encoding. For a list of encodings, refer to this table from the official IBM documentation: https://www.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.nls/doc/nlsgdrf/iconv.htm%23d722e3a267mela

Note that the conversion may carry over bytes from the EBCDIC data that are unwanted in a text file. One example is NUL characters (hex 00) inside the strings. To get rid of them, open the file in a hex editor and replace every 00 byte with 20 (the space character).
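The same replacement can be done from the shell with tr instead of a hex editor. This is a minimal sketch; the function name `strip_nuls` and its two file arguments are made up for illustration.

```shell
# strip_nuls INPUT OUTPUT: hypothetical helper.
# Copies INPUT to OUTPUT, replacing every NUL byte (hex 00)
# with a space (hex 20). All other bytes are left untouched.
strip_nuls() {
    tr '\000' ' ' < "$1" > "$2"
}
```

Usage would look like `strip_nuls new_result.csv clean_result.csv`. Writing to a separate output file keeps the original intact in case the NUL bytes turn out to matter.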

luca76